For 2023 (and 2024 ...) - we are now fully retired from IT training.
We have made many, many friends over 25 years of teaching about Python, Tcl, Perl, PHP, Lua, Java, C and C++ - and MySQL, Linux and Solaris/SunOS too. Our training notes are now very much out of date, but due to upward compatibility most of our examples remain operational and even relevant, and you are welcome to make use of them "as seen" and at your own risk.

Lisa and I (Graham) now live in what was our training centre in Melksham - happy to meet with former delegates here - but do check ahead before coming round. We are far from inactive - rather, we are enjoying this time when we are retired but still healthy enough in mind and body to be active!

I am also active in many other areas and still look after a lot of web sites - you can find an index ((here))
Bots and downloads

Posted by TedH (TedH), 1 January 2009
Happy New Year Graham.

General question really. I have downloads on some of my sites (zipped files and pdfs) and was wondering if bots download these too (like the GoogleBot etc)?

If so, then I am getting false readings as to the number of downloads and will have to figure out a way around this so only "people" download the files.

Any ideas appreciated.

Cheers,

Posted by admin (Graham Ellis), 1 January 2009

Yes, Ted ... I'm pretty sure that the search engines do download them ... for Google will offer you a .pdf document sometimes, and it can only have done that if it has grabbed and analysed the thing.

I suspect that you DO want to have the various "bots" grab copies of the files to give you good search engine placement, and that getting the count of downloads wrong is a very small price to pay.  However, you could tell the "bots" to avoid the files using the robots exclusion protocol.   Funnily enough, I wrote quite a long blog post which included this just a few hours ago - currently it's the top page at The Horse's Mouth - or, for visitors coming here later, there's a permanent link here.
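For what it's worth, a minimal robots.txt sketch along those lines - the /downloads/ directory is just an illustration, and you would list whatever path actually holds your files:

# robots.txt, placed at the root of the site
User-agent: *
Disallow: /downloads/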

The only other 'trick' I might suggest you play would be to feed out the downloads via a script, and have that script look at the user_agent and count only those which are non-robotic.   Sounds like a lot of work if all you want to do is count the robots out, though!
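If you do fancy the script route, here's a rough sketch of the idea in Python (CGI style) - the file paths and the bot keyword list are just assumptions that you would adjust for your own setup:

#!/usr/bin/env python3
# Minimal CGI sketch: serve a zip file and count only non-robot downloads.
# The paths and the bot keyword list below are illustrative assumptions.
import os
import sys

FILE_PATH = "/var/www/files/archive.zip"    # hypothetical file to serve
COUNT_FILE = "/var/www/data/downloads.txt"  # hypothetical counter file
BOT_WORDS = ("bot", "crawler", "spider", "slurp")

# Decide whether the requester looks like a robot from its user agent
agent = os.environ.get("HTTP_USER_AGENT", "").lower()
is_robot = any(word in agent for word in BOT_WORDS)

if not is_robot:
    # Naive counter - no file locking, but adequate for a rough tally
    try:
        with open(COUNT_FILE) as f:
            count = int(f.read())
    except (OSError, ValueError):
        count = 0
    with open(COUNT_FILE, "w") as f:
        f.write(str(count + 1))

# Send the file to everyone, robot or human alike
sys.stdout.write("Content-Type: application/zip\r\n")
sys.stdout.write("Content-Disposition: attachment; filename=archive.zip\r\n\r\n")
sys.stdout.flush()
with open(FILE_PATH, "rb") as f:
    sys.stdout.buffer.write(f.read())

The user agent string can be spoofed, of course, so treat the counts as a rough guide rather than an exact figure.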


Posted by TedH (TedH), 2 January 2009
Hi Graham, thanks - I'm okay with the pdfs being picked up. I don't know what a bot would do with a zipped file.

I could have a script just for the zipped files and exclude the bots from them, which I guess would be best. A script handles the count of the files.

Or a simple captcha might do. I'll play around with some stuff and see.

Yes it is after 3 am - can't sleep. Flu - the yukky kind.

Cheers - Ted



This page is a thread posted to the opentalk forum at www.opentalk.org.uk and archived here for reference.

