Sample Application - Read and analyse a web access log file
This example is from a Well House Consultants training course
Source code: latest_log Module: R050
about = <<"DESCRIPT"

Find the most recent log file in the current directory
and analyse it. Reject OPTIONS lines, and create objects
of type Get, Post and Other which will be stored in an
array for each different IP address which are stored in
a hash

Log files on our system are named like ac_20090714

DESCRIPT

# Change next line to "debugging = 1" to turn on extra prints!
debugging = nil

#-------------------------------------------------------------

# Sample record:
# 139.95.251.122 - - [14/Jul/2009:03:30:02 +0100] "GET /resources/ex.php4?
# item=y115/penv.py HTTP/1.1" 200 18436 "http://www.google.com/search?client=
# safari&rls=en-us&q=python+environment+variable&ie=UTF-8&oe=UTF-8" "Mozilla/5.0
# (Macintosh; U; Intel Mac OS X 10_5_7; en-us) AppleWebKit/525.28.3 (KHTML,
# like Gecko) Version/3.2.3 Safari/525.28.3"

class Access
   def initialize(logline)
      @raw = logline
      # Lots of time stuff being done, so precalculate and store in object
      /:((\d\d):(\d\d):(\d\d))/ =~ @raw
      @timestamp = $1
      @seconds = ($2.to_i * 60 + $3.to_i) * 60 + $4.to_i
   end
   def ip
      # Find first field (space separated) in record
      /^\S+/ =~ @raw
      return $&
   end
   def gettime
      return @timestamp
   end
   def getseconds
      return @seconds
   end
   def timediff(other)
      sv = other.getseconds - self.getseconds
      # Correction for visitors who crossed midnight
      if sv < 0
         sv += 3600 * 24
      end
      return sv
   end
   def isabot
      if /Googlebot|Slurp|dummy/ =~ @raw
         return 1
      else
         return 0
      end
   end
end

# --------- Find the most recent log file

# Open the current directory
direct = Dir.new(".")
recent = 0

# While there are file names to read ...
while item = direct.read
   # We're only interested in file with names starting "ac"
   # Capture the date into $1
   next unless item =~ /^ac_(........)$/
   print "Log file #{$1} checked\n" if debugging
   date = $1.to_i
   recent = date if date > recent
end

# Most recent file ...
source = "ac_"+recent.to_s
print "Most recent file is #{source}\n" if debugging

# --------- Set up our storage structures

acc = {}

# --------- Process the most recent log file

fh = File.new source

# One record at a time as file may be MASSIVE!
while record = fh.gets
   # Create an Access object from the record
   # Get its ip and add it to the acc hash
   current_record = Access.new record
   remote_host = current_record.ip
   # If this is the first visit from this IP, add an array to the hash
   if acc[remote_host] == nil
      acc[remote_host] = [current_record]
   else
      # If we have had previous visits from the same IP, add to array
      acc[remote_host].push current_record
   end
end

# ------------- Data has been read!

# First test analysis!
r_h = acc.keys
print "There were #{r_h.length} different visitors on #{recent}\n"

for visitor in r_h
   # Set "** " to be added if it is a robotic visitor
   robot = acc[visitor][0].isabot == 1 ? "** " : nil
   count = acc[visitor].length
   first = acc[visitor][0].gettime
   # Is this a humanoid calling multiple times?
   if count > 1 and ! robot
      more = " to " + acc[visitor][count-1].gettime
      elap = acc[visitor][0].timediff acc[visitor][count-1]
      elapsed = elap.to_s + " seconds"
   else
      more = nil
      elapsed = nil
      elap = 0
   end
   # Look at real people who hung around from 1 to 5 minutes
   next if elap < 60 or elap > 300
   print "#{robot}From #{visitor} there were #{count} "
   print "accesses from #{first} #{more} #{elapsed}\n"
end

__END__

Sample Output:

From 123.239.212.8 there were 12 accesses from 19:13:14 to 19:15:27 133 seconds
From 71.142.66.162 there were 9 accesses from 06:12:43 to 06:15:30 167 seconds
From 193.46.86.228 there were 9 accesses from 16:14:21 to 16:17:04 163 seconds
From 75.204.134.224 there were 10 accesses from 15:19:47 to 15:24:23 276 seconds
From 63.88.28.220 there were 4 accesses from 17:28:51 to 17:32:01 190 seconds
From 80.203.160.34 there were 7 accesses from 10:34:17 to 10:35:34 77 seconds
From 24.107.186.96 there were 12 accesses from 05:59:00 to 06:00:22 82 seconds
From 212.206.177.172 there were 6 accesses from 11:15:28 to 11:17:22 114 seconds
From 117.197.122.151 there were 5 accesses from 16:25:19 to 16:26:21 62 seconds
From 68.231.149.218 there were 11 accesses from 20:56:12 to 20:59:41 209 seconds
From 119.152.23.228 there were 22 accesses from 10:45:33 to 10:48:21 168 seconds
From 123.236.148.220 there were 8 accesses from 15:09:43 to 15:11:23 100 seconds

Learn about this subject
This module and example are covered as required on private courses.
Should you wish to cover this example and associated subjects, and you're attending a public course
to cover other topics with us, please see our extra topic program.
Books covering this topic
Yes. We have over 700 books in our library. Books
covering Ruby are listed here and when you've selected a
relevant book we'll link you on to Amazon to order.
Other Examples
This example comes from our "this" training module. You'll find a description of the topic and some
other closely related examples on the "this" module index page.
Full description of the source code
You can learn more about this example on the training courses listed on this page,
on which you'll be given a full set of training notes.
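As a rough companion to the listing, the sketch below restates its three core steps in a more compact form: pulling the IP address and time of day out of a record, grouping records by IP in a hash of arrays, and correcting the elapsed time for a visit that crosses midnight. The names LogHit, parse_hit, group_by_ip and elapsed_seconds are hypothetical and do not appear in the original program, and the sample records are invented for illustration. Treat it as one possible way of writing the same idea, not as the course's reference solution.

```ruby
# Sketch only: LogHit, parse_hit, group_by_ip and elapsed_seconds are
# hypothetical names introduced for this illustration.
LogHit = Struct.new(:ip, :timestamp, :seconds)

# Extract the first space-separated field (the IP address) and the hh:mm:ss
# time of day from one access-log record. Returns nil if no time is found.
def parse_hit(record)
  ip = record[/^\S+/]
  match = /:((\d\d):(\d\d):(\d\d))/.match(record)
  return nil unless ip && match
  stamp, *parts = match.captures
  h, m, s = parts.map(&:to_i)
  LogHit.new(ip, stamp, (h * 60 + m) * 60 + s)
end

# Group parsed records by IP address. The block passed to Hash.new creates
# an empty array the first time an IP address is seen.
def group_by_ip(records)
  visits = Hash.new { |hash, ip| hash[ip] = [] }
  records.each do |record|
    hit = parse_hit(record)
    visits[hit.ip] << hit if hit
  end
  visits
end

# Seconds between two hits, adding a day if the visit crossed midnight.
def elapsed_seconds(first, last)
  diff = last.seconds - first.seconds
  diff += 24 * 3600 if diff < 0
  diff
end

# Invented sample records (not taken from a real log file)
sample = [
  '10.0.0.1 - - [14/Jul/2009:23:59:30 +0100] "GET /a HTTP/1.1" 200 10',
  '10.0.0.2 - - [14/Jul/2009:23:59:40 +0100] "GET /b HTTP/1.1" 200 10',
  '10.0.0.1 - - [15/Jul/2009:00:01:00 +0100] "GET /c HTTP/1.1" 200 10'
]

visits = group_by_ip(sample)
hits = visits["10.0.0.1"]
puts elapsed_seconds(hits.first, hits.last)   # prints 90
```

Using a default block on the hash removes the need for the explicit "first visit" test, at the cost of creating an entry whenever an unknown key is read; whether that trade-off suits a given program is a judgment call.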
Many other training modules are available for download (for limited use) from our download centre under an Open Training Notes License.
Other resources
• Our Solutions centre provides a number of longer technical articles.
• Our Opentalk forum archive provides a question and answer centre.
• The Horse's mouth provides a daily tip or thought.
• Further resources are available via the resources centre.
• All of these resources can be searched through our search engine.
• And there's a global index here.
Purpose of this website
This is a sample program, class demonstration or answer from a
training course. Its main purpose
is to provide an after-course service to customers who have attended our
public, private or
on-site courses, but the examples are made
generally available under the conditions described below.
Conditions of use
Past attendees on our training courses are welcome to use individual
examples in the course of their programming, but must check
the examples they use to ensure that they are suitable for their
job. Remember that some of our examples show you how not to do
things - check in your notes. Well House Consultants take no responsibility
for the suitability of these example programs to customers' needs.
This program is copyright Well House Consultants Ltd. You are forbidden from using it for running your own training courses without our prior written permission. See our page on courseware provision for more details. Any of our images within this code may NOT be reused on a public URL without our prior permission. For bona fide personal use, we will often grant you permission provided that you provide a link back. Commercial use on a website will incur a license fee for each image used - details on request.
PH: 01144 1225 708225 • EMAIL: info@wellho.net • WEB: http://www.wellho.net • SKYPE: wellho
PAGE: http://www.wellho.net/resources/ex.php • PAGE BUILT: Sun Oct 11 14:50:09 2020 • BUILD SYSTEM: JelliaJamb