 
For 2023 (and 2024 ...) - we are now fully retired from IT training.
We have made many, many friends over 25 years of teaching about Python, Tcl, Perl, PHP, Lua, Java, C and C++ - and MySQL, Linux and Solaris/SunOS too. Our training notes are now very much out of date, but due to upward compatibility most of our examples remain operational and even relevant, and you are welcome to make use of them "as seen" and at your own risk.

Lisa and I (Graham) now live in what was our training centre in Melksham - happy to meet with former delegates here - but do check ahead before coming round. We are far from inactive - rather, enjoying the times that we are retired but still healthy enough in mind and body to be active!

I am also active in many other areas and still look after a lot of web sites - you can find an index here.
Does robots.txt actually work?

If you put an entry into your robots.txt file asking the various robots not to crawl (to "disallow") certain files and directories, do they actually take note of your request, considering that it's a purely voluntary standard?
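As a reminder of what such an entry looks like and how a well-behaved robot interprets it, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt content, the /map/ path, the test URLs and the "ExampleBot" agent name are all assumptions made for illustration - they are not copied from the live site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content - the /map/ path is an assumption,
# chosen to resemble the old map pages discussed in this article.
ROBOTS_TXT = """\
User-agent: *
Disallow: /map/
"""


def allowed(path, agent="ExampleBot"):
    """Return True if a robot that honours robots.txt may fetch path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Both paths are made up for the demonstration.
    for path in ("/map/old_page.html", "/mouth/index.html"):
        verdict = "allowed" if allowed(path) else "disallowed"
        print(path, "->", verdict)
```

Note that this only shows what a compliant robot would decide for itself. Nothing in robots.txt stops a badly behaved crawler from fetching the pages anyway, which is why the log files are worth checking.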

Three or four days back, I excluded some old map pages which were being heavily crawled and I've just visited my log files for the last fortnight:

-bash-3.2$ egrep -c 'net/+map' ac_200902*
ac_20090201:8779
ac_20090202:7884
ac_20090203:15697
ac_20090204:9284
ac_20090205:4944
ac_20090206:9640
ac_20090207:10299
ac_20090208:7015
ac_20090209:5534
ac_20090210:4188
ac_20090211:6808
ac_20090212:853
ac_20090213:1669
ac_20090214:74
ac_20090215:76
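If you would rather do the counting from a script, the egrep -c command above can be approximated in Python as sketched below. It counts the lines in each log file that contain the pattern, which is what egrep -c reports. The file name pattern and the regular expression come from the command above; the function names and the reading of files as text with undecodable bytes replaced are my own assumptions.

```python
import glob
import re

# Same regular expression as the egrep command: "net", one or more
# slashes, then "map".
MAP_PATTERN = re.compile(r"net/+map")


def count_matching_lines(lines, pattern=MAP_PATTERN):
    """Count lines containing at least one match, as egrep -c does."""
    return sum(1 for line in lines if pattern.search(line))


def count_in_file(path, pattern=MAP_PATTERN):
    """Count matching lines in one log file."""
    # errors="replace" is an assumption so that odd bytes in a log
    # line do not stop the count.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_matching_lines(handle, pattern)


if __name__ == "__main__":
    # Prints name:count pairs in the same layout as the egrep output.
    for name in sorted(glob.glob("ac_200902*")):
        print(f"{name}:{count_in_file(name)}")
```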


Yes! - it has worked. Accesses to these pages - which were predominantly by crawlers - have dropped from some 8,000 to 10,000 per day to fewer than a hundred - and I suspect that most of those are genuine hits!
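To put a number on the drop, the daily counts listed above can be averaged before and after the change. The sketch below assumes that the 1st to the 11th count as "before" and the 14th and 15th as "after", leaving out the 12th and 13th as the days on which the robots were picking up the new robots.txt. That split is my reading of "three or four days back", not something recorded in the logs.

```python
from statistics import mean

# Daily counts copied from the egrep output in this article.
DAILY_COUNTS = {
    "ac_20090201": 8779, "ac_20090202": 7884, "ac_20090203": 15697,
    "ac_20090204": 9284, "ac_20090205": 4944, "ac_20090206": 9640,
    "ac_20090207": 10299, "ac_20090208": 7015, "ac_20090209": 5534,
    "ac_20090210": 4188, "ac_20090211": 6808, "ac_20090212": 853,
    "ac_20090213": 1669, "ac_20090214": 74, "ac_20090215": 76,
}

# Assumed split: days 1-11 before the change, days 14-15 after it.
BEFORE_DAYS = range(1, 12)
AFTER_DAYS = (14, 15)


def summarise(counts):
    """Return (mean before, mean after, percentage reduction)."""
    before = [counts[f"ac_200902{day:02d}"] for day in BEFORE_DAYS]
    after = [counts[f"ac_200902{day:02d}"] for day in AFTER_DAYS]
    before_mean = mean(before)
    after_mean = mean(after)
    reduction = 100 * (1 - after_mean / before_mean)
    return before_mean, after_mean, reduction


if __name__ == "__main__":
    before_mean, after_mean, reduction = summarise(DAILY_COUNTS)
    print(f"before: {before_mean:.0f} per day")
    print(f"after:  {after_mean:.0f} per day")
    print(f"reduction: {reduction:.1f}%")
```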

You'll find more about robots.txt here
(written 2009-02-16, updated 2009-02-17)

 



This is a page archived from The Horse's Mouth at http://www.wellho.net/horse/ - the diary and writings of Graham Ellis. Every attempt was made to provide current information at the time the page was written, but things do move forward in our business - new software releases, price changes, new techniques. Please check back via our main site for current courses, prices, versions, etc - any mention of a price in "The Horse's Mouth" cannot be taken as an offer to supply at that price.


© WELL HOUSE CONSULTANTS LTD., 2024: 48 Spa Road • Melksham, Wiltshire • United Kingdom • SN12 7NY
PH: 01144 1225 708225 • EMAIL: info@wellho.net • WEB: http://www.wellho.net • SKYPE: wellho

PAGE: http://www.wellho.net/mouth/2045_Doe ... work-.html • PAGE BUILT: Sun Oct 11 16:07:41 2020 • BUILD SYSTEM: JelliaJamb