Web site traffic - real users, or just noise?

It's been said that on some web sites these days, the majority of traffic isn't users at a regular browser at all; instead, it's robots that are indexing the page (such as Google, Yahoo, MSN, Yandex and others), and malware that's looking for holes through which to inject content on to other people's pages, or to copy and spread itself through sites which have left some security gates open. Now - we welcome the indexing crawlers, and we take steps to ensure that malware is ineffective, but when it comes down to it we really DO want a significant proportion of our traffic to be real people visiting our site! But how can we tell?

We collect a daily access log file; these days, it can be up to 45 Mbytes of log information per day, and that's far too much information to read through line by line. In any case, judgments on some of the lines would be "that is probably a genuine user" or "that looks rather fishy", which are hardly certainties on which to base a judgment. However, this graph, showing the size of the log file on a day by day basis, gives us a very good clue. As I write (December, 2009), there's a 7 day cycle, with the log files on a busy day reaching the 45 Mbytes mark, and on a quiet day being around 25 Mbytes. This pattern has been long since established - indeed, I commented on it in June 2007.
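
The 7 day cycle can be picked out without a graph by averaging the daily log sizes for each day of the week. What follows is a minimal sketch in Python rather than the method used to draw the graph; the function name mean_size_by_weekday and the sample figures are invented for illustration, and in practice the sizes would come from the log files themselves.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Fixed English names, so the result does not depend on the system locale
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def mean_size_by_weekday(daily_sizes):
    """Return {weekday name: mean log size in Mbytes} from (date, size) pairs."""
    by_day = defaultdict(list)
    for day, size_mb in daily_sizes:
        # date.weekday() returns 0 for Monday through 6 for Sunday
        by_day[DAY_NAMES[day.weekday()]].append(size_mb)
    return {name: mean(sizes) for name, sizes in by_day.items()}


# Hypothetical figures, invented for illustration only
sample = [
    (date(2009, 12, 7), 45.0),   # a Monday
    (date(2009, 12, 12), 25.0),  # a Saturday
    (date(2009, 12, 14), 43.0),  # a Monday
    (date(2009, 12, 19), 27.0),  # a Saturday
]

weekday_means = mean_size_by_weekday(sample)
for name, size in weekday_means.items():
    print(f"{name}: {size:.1f} Mbytes")
```

A large gap between the weekday and weekend averages is the same clue that the graph gives by eye.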

Looking at the difference - 45 Mb to 25 Mb - persuaded me that at least 20 Mb of our weekday traffic was "actual people" browsing, and in fact I decided that was a very pessimistic estimate. More and more, visitors to our site are using the technologies I write about for leisure activities, so will be arriving on our site at the weekend rather than during the week, and the noticeable dip on Fridays is, I'm sure, partly caused by the fact that Friday is a Holy day in many countries, from where people will return on Saturday and Sunday (Friday is also P.O.E.T.S. day (see Acronyms)). But just how much of that 25 Mb is actual people?


There's a clue here in this current graph, dated 26th December 2009. [This one won't change, but the one at the top of the page will continue to update daily!] On Christmas day, the log file size dropped to just over 15 Mbytes; the server was functioning correctly (so there's no reason for a blip there), but it *was* Christmas day. So I can now be more optimistic yet about the number of "actual people" browsing - suggesting that there's up to 30 Mbytes of traffic from such users on a busy day, with only a third of the traffic being robotic / malware.

Looking further still, there was still *some* genuine traffic in that 15 Mbytes on Christmas day. I took a look at our most popular search engine arrival page, and found that some 197 people had been referred to us (as against a peak of around 980), and that one particular image called up by regular users was referenced 1600 times rather than 4800 times two weeks previously. That tells me that even in the 15 Mb, we had around a quarter of our regular real traffic - in round terms, between 7 Mb and 8 Mb of log file. Which - very roughly - tells me that the automata that are running 24 x 7 account for only 6 Mb to 9 Mb of our normal traffic.
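
Counts like the 1600 and 4800 above can be pulled from a log file with a short script. The sketch below assumes Apache-style log lines in which the request appears in double quotes (the log format is not stated here, so treat that as an assumption); the function name count_requests, the image path and the sample lines are all hypothetical.

```python
import re

# Assumption: each log line holds the request in double quotes,
# for example "GET /some/path HTTP/1.1" as in Apache-style logs
REQUEST = re.compile(r'"(?:GET|POST|HEAD) (\S+)')


def count_requests(lines, path):
    """Count the log lines whose requested path is exactly `path`."""
    count = 0
    for line in lines:
        match = REQUEST.search(line)
        if match and match.group(1) == path:
            count += 1
    return count


# Hypothetical log lines, invented for illustration only
sample_lines = [
    '192.0.2.1 - - [25/Dec/2009:10:00:00 +0000] "GET /images/logo.png HTTP/1.1" 200 1234',
    '192.0.2.2 - - [25/Dec/2009:10:00:05 +0000] "GET /robots.txt HTTP/1.1" 200 56',
    '192.0.2.3 - - [25/Dec/2009:10:01:00 +0000] "GET /images/logo.png HTTP/1.1" 200 1234',
]

holiday_hits = count_requests(sample_lines, "/images/logo.png")
print(holiday_hits)

# The two indicators quoted in the text, as fractions of normal levels
referral_ratio = 197 / 980
image_ratio = 1600 / 4800
print(f"referrals {referral_ratio:.2f}, image requests {image_ratio:.2f}")
```

The two ratios come out on either side of a quarter, which is why "around a quarter" is a reasonable round figure rather than a precise one.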

So - that's an estimate of just 16% to 20% of weekday traffic, and 24% to 36% of our weekend traffic, being the 24 x 7 background noise, with the substantial majority being the traffic at which we target the web site. I'm happy with these stats, having seen figures of up to 80% "noise" being quoted. We MIGHT have exceeded the 50% figure on just one day - Christmas day - but that's been more than worthwhile; it was our "almost off" day, and it gave me valuable data against which to analyse our site records.
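
The percentages can be checked with a few lines of Python. This is a sketch only, with noise_share and the constant names as illustrative labels for the figures quoted in the text. Taking 6 Mb as the lower bound puts the weekday share nearer 13%; the 16% quoted corresponds to roughly 7 Mb of noise.

```python
def noise_share(noise_mb, total_mb):
    """Background noise as a percentage of a day's log size."""
    return 100.0 * noise_mb / total_mb


# Figures quoted in the text, all in Mbytes of log file
BUSY_DAY_MB = 45         # typical weekday
QUIET_DAY_MB = 25        # typical weekend day
CHRISTMAS_DAY_MB = 15    # 25th December 2009
NOISE_RANGE_MB = (6, 9)  # estimated 24 x 7 background traffic

weekday = [noise_share(n, BUSY_DAY_MB) for n in NOISE_RANGE_MB]
weekend = [noise_share(n, QUIET_DAY_MB) for n in NOISE_RANGE_MB]
christmas = [noise_share(n, CHRISTMAS_DAY_MB) for n in NOISE_RANGE_MB]

for label, (low, high) in [("weekday", weekday),
                           ("weekend", weekend),
                           ("Christmas day", christmas)]:
    print(f"{label}: {low:.0f}% to {high:.0f}% noise")
```

The Christmas day range straddles 50%, which is why that is the one day on which the noise MIGHT have been the majority.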


RSS feeds and Ajax form only a very small part of the traffic from our web site, and I have discounted them from consideration above. But if you use similar techniques / logic to mine, you need to think carefully and understand your base data before coming to conclusions.
(written 2009-12-26, updated 2010-01-06)

 
Associated topics are indexed as below, or enter http://melksh.am/nnnn for individual articles
W501 - Introduction to Web Site Structure
  [332] Looking up IP addresses - (2005-06-01)
  [528] Getting favicon to work - avoiding common pitfalls - (2005-12-14)
  [1024] Web site - a refresh to improve navigation - (2007-01-07)
  [1031] robots.txt - a clue to hidden pages? - (2007-01-13)
  [1168] Moving out some of the web site bloat - (2007-04-29)
  [1176] A pu that got me into trouble - (2007-05-04)
  [1198] From Web to Web 2 - (2007-05-21)
  [1431] Getting the community on line - some basics - (2007-11-13)
  [1636] What to do if the Home Page is missing - (2008-05-08)
  [1686] FTP - how not to corrupt data (binary v ascii) - (2008-06-24)
  [1969] Search Engines. Getting the right pages seen. - (2009-01-01)
  [2094] If you have a spelling mistake in your URL / page name - (2009-03-21)
  [2214] Global Index to help you find resources - (2009-06-01)
  [2282] Checking robots.txt from Python - (2009-07-12)

G911 - Well House Consultants - Search Engine Optimisation
  [165] Implementing an effective site search engine - (2005-01-01)
  [427] The Melksham train - a button is pushed - (2005-08-28)
  [1015] Search engine placement - long term strategy and success - (2006-12-30)
  [1029] Our search engine placement is dropping. - (2007-01-11)
  [1344] Catching up on indexing our resources - (2007-09-10)
  [1793] Which country does a search engine think you are located in? - (2008-09-11)
  [1971] Telling Google which country your business trades in - (2009-01-02)
  [1982] Cooking bodies and URLs - (2009-01-08)
  [1984] Site24x7 prowls uninvited - (2009-01-10)
  [2000] 2000th article - Remember the background and basics - (2009-01-18)
  [2019] Baby Caleb and Fortune City in your web logs? - (2009-01-31)
  [2045] Does robots.txt actually work? - (2009-02-16)
  [2065] Static mirroring through HTTrack, wget and others - (2009-03-03)
  [2106] Learning to Twitter / what is Twitter? - (2009-03-28)
  [2107] How to tweet automatically from a blog - (2009-03-28)
  [2137] Reaching the right people with your web site - (2009-04-23)
  [2324] What search terms FAIL to bring visitors to our site, when they should? - (2009-08-05)
  [2330] Update - Automatic feeds to Twitter - (2009-08-09)
  [2428] Diluting History - (2009-09-27)
  [2562] Tuning the web site for sailing on through this year - (2010-01-03)
  [2686] Freedom of Information - consideration for web site designers - (2010-03-20)
  [2748] Monitoring the success and traffic of your web site - (2010-05-01)
  [3670] Reading Google Analytics results, based on the relative populations of countries - (2012-03-24)
  [3746] Google Analytics and the new UK Cookie law - (2012-06-02)
  [4121] Has your Twitter feed stopped working? Switching to their new API - (2013-06-23)

G902 - Well House Consultants - Web site techniques, utility and visibility
  [23] Skills and responsibilities - (2004-08-22)
  [32] Web design platoon - (2004-08-29)
  [98] No more 'Error 404' pages. Something better. - (2004-10-24)
  [109] URLs - a service and not a hurdle - (2004-11-04)
  [117] A case of case - (2004-11-14)
  [142] Colour for access - (2004-12-06)
  [173] Data Mining - (2005-01-09)
  [179] The hunt for unique words - (2005-01-16)
  [182] Your personal Google ranking - (2005-01-19)
  [197] Allow for peak traffic on your web site - (2005-02-01)
  [202] Searching for numbers - (2005-02-04)
  [222] Who are all these visitors? - (2005-02-20)
  [259] Responding to spam - (2005-03-27)
  [261] Putting a form online - (2005-03-29)
  [268] Information request forms, cleaning up spam - (2005-04-05)
  [274] Our most popular resources - (2005-04-10)
  [276] An apology to Mr Boneparte - (2005-04-11)
  [278] Cover all the options - (2005-04-13)
  [284] The Iconish language - (2005-04-19)
  [288] Colour blindness for web developers - (2005-04-22)
  [311] Growth pains - (2005-05-14)
  [314] What language is this written in? - (2005-05-17)
  [320] Ordnance Survey - using a 'Get a map' - (2005-05-22)
  [322] More maps - (2005-05-23)
  [347] Frightening and from-friend viruses and spams - (2005-06-14)
  [348] Graveyard pages - (2005-06-15)
  [369] CMS - the minefield of Choices - (2005-07-05)
  [376] What brings people to my web site? - (2005-07-13)
  [414] Form Madness - (2005-08-14)
  [492] New Navigation Aid - Launch of My Wellho - (2005-11-11)
  [510] Dynamic Web presence - next generation web site - (2005-11-29)
  [533] Bigger Box Campaign - (2005-12-18)
  [649] Denial of Service ''attack'' - (2006-03-17)
  [658] Keeping the visitors happy and browsing - (2006-03-26)
  [681] Mirroring a dynamic site - (2006-04-12)
  [718] Protecting images from theft - (2006-05-12)
  [732] Where is a web site visitor browsing from - (2006-05-24)
  [757] Horse and Python training - (2006-06-12)
  [767] Finding the language preference of a web site visitor - (2006-06-18)
  [800] Effective web campaign? - (2006-07-12)
  [893] Visibility - (2006-10-14)
  [916] Driving customers away - (2006-11-07)
  [976] Santa at the station - (2006-12-09)
  [994] Training on Cascading Style Sheets - (2006-12-17)
  [1055] Above the fold - (2007-01-28)
  [1104] Drawing dynamic graphs in PHP - (2007-03-09)
  [1177] Sorting out for a site map - (2007-05-05)
  [1184] Finding resources - some pointers - (2007-05-13)
  [1186] Two new pages / sites - (2007-05-14)
  [1207] Simple but effective use of mod_rewrite (Apache httpd) - (2007-05-27)
  [1212] What brought YOU to our web site? - (2007-06-01)
  [1237] What proportion of our web traffic is robots? - (2007-06-19)
  [1297] Stuffing content into a web page - easy maintainance - (2007-08-09)
  [1437] Above the fold with First Great Western - (2007-11-19)
  [1494] A time to update pictures - (2008-01-03)
  [1505] Script to present commonly used images - PHP - (2008-01-13)
  [1506] Ongoing Image Copyright Issues, PHP and MySQL solutions - (2008-01-14)
  [1513] Perl, PHP or Python? No - Perl AND PHP AND Python! - (2008-01-20)
  [1534] Where in the world / country is my visitor from? - (2008-02-07)
  [1541] Colour, Composition or Content - (2008-02-16)
  [1554] Online hotel reservations - Melksham, Wiltshire (near Bath) - (2008-02-24)
  [1610] PHP course dot co, dot uk - (2008-04-13)
  [1630] To provide external links, or not? - (2008-05-04)
  [1634] Kiss and Book - (2008-05-07)
  [1653] How do Google Ads work? - (2008-05-25)
  [1711] Rapid growth leads to server move - (2008-07-17)
  [1747] Who is watching you? - (2008-08-10)
  [1756] Ever had One of THOSE mornings? - (2008-08-16)
  [1797] I have been working hard but I do not expect you noticed - (2008-09-14)
  [1833] Web Bloopers - good form design - avoiding pitfalls - (2008-10-11)
  [1856] A few of my favourite things - (2008-10-26)
  [1888] Find the link - (2008-11-16)
  [1955] How to avoid duplicating web page maintainance - (2008-12-20)
  [1961] Making our things easier to find - (2008-12-26)
  [1970] Plagarism - who is copying my pages? - (2009-01-02)
  [2056] Web Site Loading - experiences and some solutions shared - (2009-02-26)
  [2225] How important is a front page ranking on a search engine? - (2009-06-09)
  [2332] Formation, des langages Open Source - (2009-08-09)
  [2333] Formación, de los lenguajes de código abierto - (2009-08-09)
  [2334] Formazione, Open Source computer lingue - (2009-08-09)
  [2335] Ausbildung, die Open-Source-Sprachen - (2009-08-09)
  [2336] Formação, Open Source computador línguas - (2009-08-09)
  [2337] Opleiding, Open Source computertalen - (2009-08-09)
  [2338] Uddannelse, Open Source computer sprog - (2009-08-09)
  [2339] Opplæring, Open Source datamaskinen språk - (2009-08-09)
  [2340] Utbildning, Open Source dator språk - (2009-08-09)
  [2341] Koulutus, Open Source tietokone kielillä - (2009-08-09)
  [2389] Writing with our customers words - (2009-09-01)
  [2410] Removal of technical resources from this site - (2009-09-19)
  [2519] Status Page / breaks of service in early December - (2009-11-30)
  [2532] Analysing Google arrivals by country of origin - (2009-12-10)
  [2569] How to run a successful online poll / petition / survey / consultation - (2010-01-10)
  [2668] Is it worth it? - (2010-03-09)
  [2981] How to set up short and meaningfull alternative URLs - (2010-10-02)
  [3022] Retaining web site visitors - reducing the one page wonders - (2010-10-31)
  [3087] Making the most of critical emails - reading behind the scene - (2010-12-16)
  [3149] Looking back at www.wellho.net - (2011-01-28)
  [3197] Finding and diverting image requests from rogue domains - (2011-03-08)
  [3367] Google +1 - what is it? - (2011-07-22)
  [3426] Automed web site testing scripted in Ruby using watir-webdriver - (2011-09-09)
  [3491] Who is knocking at your web site door? Are you well set up to deal with allcomers? - (2011-10-21)
  [3532] Sharing the user experience - designing a form with the customer in mind - (2011-11-29)
  [3554] Learning more about our web site - and learning how to learn about yours - (2011-12-17)
  [3563] How big is a web page these days? Does the size of your pages matter? - (2011-12-26)
  [3589] Promoting a single one of your domains on the search engines - (2012-01-22)
  [3623] Some TestWise examples - helping use Ruby code to check your web site operation - (2012-02-24)
  [3734] QR codes with marketing logos embedded - (2012-05-16)
  [3744] Short Web Addresses for Melksham - (2012-05-30)
  [3745] Legal change - You need to obtain user consent if you use cookies on your website - (2012-06-01)
  [3776] Some traps it's so easy to fall into in designing your web site - (2012-06-23)
  [3896] An email marathon - (2012-10-15)
  [3974] TV show appearance - how does it effect your web site? - (2013-01-13)
  [4001] Helping search engines with appropriate 400 error codes - (2013-02-11)
  [4076] Web site - fully back! - (2013-04-29)
  [4115] More or less back - what happened to our server the other day - (2013-06-14)
  [4136] How do I post automatically from a PHP script to my Twitter account? - (2013-07-10)
  [4239] Facebook marketing - early experiences - (2014-01-19)
  [4376] Well House Consultants, Well House Manor, First Great Western Coffee shop, TransWilts / 2014 web site reports - (2015-01-01)
  [4401] Selecting RECENT and POPULAR news and trends for your web site users - (2015-01-19)
  [4474] Effect on external factors on traffic to our web sites - an update - (2015-04-26)
  [4492] Almost so wrong, but perhaps it's right for some? - (2015-05-11)




This is a page archived from The Horse's Mouth at http://www.wellho.net/horse/ - the diary and writings of Graham Ellis. Every attempt was made to provide current information at the time the page was written, but things do move forward in our business - new software releases, price changes, new techniques. Please check back via our main site for current courses, prices, versions, etc - any mention of a price in "The Horse's Mouth" cannot be taken as an offer to supply at that price.


© WELL HOUSE CONSULTANTS LTD., 2021: 48 Spa Road • Melksham, Wiltshire • United Kingdom • SN12 7NY
PH: 01144 1225 708225 • EMAIL: info@wellho.net • WEB: http://www.wellho.net • SKYPE: wellho

PAGE: http://www.wellho.net/mouth/2552_Web ... oise-.html • PAGE BUILT: Sun Oct 11 16:07:41 2020 • BUILD SYSTEM: JelliaJamb