Bots and downloads
Posted by TedH (TedH), 1 January 2009
Happy New Year Graham.
General question really. I have downloads on some of my sites (zipped files and pdfs) and was wondering if bots download these too (like the GoogleBot etc)?
If so, then I am getting false readings as to the number of downloads and will have to figure out a way around this so only "people" download the files.
Any ideas appreciated.
Posted by admin (Graham Ellis), 1 January 2009
In reply to TedH's post of 01/01/09 at 19:07:42:
Yes, Ted ... I'm pretty sure that the search engines do download them ... for Google will offer you a .pdf document sometimes, and it can only have done that if it has grabbed and analysed the thing.
I suspect that you DO want to have the various "bots" grab copies of the files to give you good search engine placement, and that getting the count of downloads wrong is a very small price to pay. However, you could tell the "bots" to avoid the files using the robots exclusion protocol (a robots.txt file at the top level of the site). Funnily enough, I wrote quite a long blog which included this just a few hours ago - currently it's the top page at The Horse's Mouth - or for visitors coming here later, there's a permanent link here
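For anyone wanting to see what such an exclusion might look like, here is a minimal sketch. It assumes the files live under a directory called /downloads/ on www.example.com (both names are made up for illustration), and it uses Python's standard urllib.robotparser to check how a rule-following robot would read the rules. Bear in mind that robots.txt is only a request: well-behaved robots honour it, others may not.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: ask every robot to stay out of /downloads/.
# The directory name is an assumption, not taken from Ted's sites.
ROBOTS_TXT = """\
User-agent: *
Disallow: /downloads/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# How a robot that obeys the rules would treat two example URLs.
zip_allowed = parser.can_fetch(
    "Googlebot", "http://www.example.com/downloads/tools.zip")
page_allowed = parser.can_fetch(
    "Googlebot", "http://www.example.com/index.html")

print("zip file may be fetched:", zip_allowed)
print("ordinary page may be fetched:", page_allowed)
```

Limiting the Disallow line to the download directory leaves the rest of the site open to the search engines, so the placement benefit for ordinary pages is kept.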
The only other 'trick' I might suggest you play would be to feed out the downloads via a script, and have that script look at the user_agent and count only those requests which are non-robotic. Sounds like a lot of work if all you want to do is count robots out though!
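A rough sketch of the counting side of such a script follows, assuming the download handler can get at the User-Agent header as a string. The marker list and the function names (is_probable_bot, record_download) are invented for this example and are nowhere near exhaustive; a robot can also send any user agent it likes, so this only filters out the ones that identify themselves.

```python
from collections import Counter

# Assumed, incomplete list of substrings that self-identifying robots
# commonly include in their User-Agent header.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")

# Per-file count of downloads believed to come from people.
download_counts = Counter()

def is_probable_bot(user_agent):
    """Guess whether a request is robotic from its User-Agent string."""
    ua = (user_agent or "").lower()
    if not ua:
        # Design choice for this sketch: no user agent at all is
        # treated as robotic, since browsers normally send one.
        return True
    return any(marker in ua for marker in BOT_MARKERS)

def record_download(filename, user_agent):
    """Count the download only if it does not look robotic.

    Returns True if the download was counted, False otherwise.
    The file itself would still be served either way.
    """
    if is_probable_bot(user_agent):
        return False
    download_counts[filename] += 1
    return True
```

The script would call record_download("tools.zip", user_agent) just before sending the file, so robots still receive the file but do not inflate the figures.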
Posted by TedH (TedH), 2 January 2009
Hi Graham, thanks - I'm okay with the pdfs being picked up. I don't know what a bot would do with a zipped file.
I could have a script that is just for the zipped files and exclude the bot from them, which I guess would be best. A script handles the count of the files.
Or a simple captcha might do. I'll play around with some stuff and see.
Yes it is after 3 am - can't sleep. Flu - the yukky kind.
Cheers - Ted