For 2023 (and 2024 ...) - we are now fully retired from IT training.
We have made many, many friends over 25 years of teaching about Python, Tcl, Perl, PHP, Lua, Java, C and C++ - and MySQL, Linux and Solaris/SunOS too. Our training notes are now very much out of date, but due to upward compatibility most of our examples remain operational and even relevant, and you are welcome to make use of them "as seen" and at your own risk.

Lisa and I (Graham) now live in what was our training centre in Melksham - happy to meet with former delegates here - but do check ahead before coming round. We are far from inactive - rather, enjoying the times that we are retired but still healthy enough in mind and body to be active!

I am also active in many other areas and still look after a lot of web sites.
Using Perl to generate multiple reports from a HUGE file, efficiently

If you want to extract two distinct reports from a large data source, there are a number of ways you could do it. The first two are not brilliant:

1. You could read the entire file into memory, and then traverse it several times in a loop. This is a poor solution if the data becomes huge - the footprint of the program becomes massive, it may start swapping on and off the disc, and indeed it may crash "out of memory".

2. You could read the file multiple times. This is hard going on the disc, and potentially very slow as disc access times can be significant.

The third solution, which I describe fully below, is MUCH better: you read your data in record by record, just once, and store the data you need for each report in separate variables as you go along. There's a new example from this week's Perl course.

Processing a web access log file (30 MB, but it could be far bigger!) line by line:

  while ($line = <FH>) {    # FH: filehandle for the opened log file (name assumed)
    @parts = split(/\s+/,$line);


I built up strings with the extracted data that I needed for huge URL reads (responses of over 1,000,000 bytes)
    if ($parts[9] > 1000000) {
      $huge .= "$parts[3] $parts[8] $parts[9] $parts[6]\n";
    }


and for requests that generated server errors
    if ($parts[8] >= 500) {
      $server .= "$parts[3] $parts[8] $parts[6]\n";
    }


all within the same read loop - here's the end of the while loop:

  }

Then - after the file reading was completed - I printed out the results:

  print "$huge\n";
  print "$server\n";
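For anyone who wants to try the idea without a real log to hand, here is a minimal, self-contained sketch of the same single-pass approach. It is not the course example itself: the sub name build_reports, the sample records and the in-memory filehandle are all assumptions made for illustration, and it adds a couple of guards (skipping short lines, and the "-" that some servers log in place of a byte count) that the snippets above do not have.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build both reports in a single pass over the lines read from $fh.
# (build_reports is a hypothetical name, not from the original example.)
sub build_reports {
    my ($fh) = @_;
    my ($huge, $server) = ('', '');
    while (my $line = <$fh>) {
        my @parts = split(/\s+/, $line);
        next unless @parts >= 10;                 # skip short or malformed lines
        my ($when, $url, $status, $bytes) = @parts[3, 6, 8, 9];
        next unless $status =~ /^\d+$/;           # status must be numeric
        # Report 1: responses of more than 1,000,000 bytes
        if ($bytes =~ /^\d+$/ and $bytes > 1000000) {
            $huge .= "$when $status $bytes $url\n";
        }
        # Report 2: server errors (status 500 and above)
        if ($status >= 500) {
            $server .= "$when $status $url\n";
        }
    }
    return ($huge, $server);
}

# Invented sample records, standing in for a real access log
my $sample = <<'END';
192.0.2.1 - - [09/Dec/2011:10:00:00 +0000] "GET /big.iso HTTP/1.1" 200 2500000
192.0.2.2 - - [09/Dec/2011:10:00:05 +0000] "GET /index.html HTTP/1.1" 200 5120
192.0.2.3 - - [09/Dec/2011:10:00:09 +0000] "GET /cgi-bin/run HTTP/1.1" 500 312
192.0.2.4 - - [09/Dec/2011:10:00:12 +0000] "GET /missing HTTP/1.1" 404 -
END

# Read from the string as if it were a file; swap in a real file name to use a log
open(my $fh, '<', \$sample) or die "Cannot open sample data: $!";
my ($huge, $server) = build_reports($fh);
close $fh;

print "Huge responses:\n$huge\n";
print "Server errors:\n$server\n";
```

Only the two report strings grow as the file is read; each input line is discarded once it has been checked, so the memory footprint depends on the size of the reports rather than on the size of the log.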





That same example has since been expanded to produce a third report. I can (of course) add as many reports as I like, but in this third case I've used a list rather than a string to collect the data I need, within the same while loop that reads the whole file:

    if ($line =~ /Trowbridge/) {
      push @toon,"$parts[6] $parts[9]\n";
    }


This has then allowed me to reorder (sort) the report data before sending it to the output:

  @toon = sort(@toon);
  print "Trowbridge, by page name:\n", @toon, "\n";
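The list-based report can be tried on its own with the short sketch below. Again this is an illustration under assumptions rather than the course code: the sub name matching_pages and the sample log lines are invented. Collecting into a list rather than a string is what makes the later sort possible, since each entry stays a separate element until it is printed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Collect "URL bytes" entries for every line matching $pattern, sorted.
# (matching_pages is a hypothetical name used only for this sketch.)
sub matching_pages {
    my ($pattern, @lines) = @_;
    my @found;
    foreach my $line (@lines) {
        next unless $line =~ /$pattern/;
        my @parts = split(/\s+/, $line);
        push @found, "$parts[6] $parts[9]\n";
    }
    my @sorted = sort @found;   # default sort is string order, so by page name
    return @sorted;
}

# Invented sample records, standing in for lines read from a real log
my @log = (
    '192.0.2.7 - - [09/Dec/2011:11:02:00 +0000] "GET /net/Trowbridge.html HTTP/1.1" 200 7301',
    '192.0.2.8 - - [09/Dec/2011:11:03:00 +0000] "GET /index.html HTTP/1.1" 200 5120',
    '192.0.2.9 - - [09/Dec/2011:11:04:00 +0000] "GET /map/Trowbridge.html HTTP/1.1" 200 9120',
);

my @toon = matching_pages(qr/Trowbridge/, @log);

# Passing the list to print as separate arguments adds nothing between elements
print "Trowbridge, by page name:\n", @toon;
```

With these three sample lines the /map page is listed before the /net page, because the default sort compares the entries as plain strings.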


Example written on this week's Learning to program in Perl course.

(written 2011-12-09, updated 2011-12-17)

 
Associated topics are indexed as below, or enter http://melksh.am/nnnn for individual articles
P205 - Perl - Initial String Handling
  [31] Here documents - (2004-08-28)
  [254] x operator in Perl - (2005-03-22)
  [324] The backtick operator in Python and Perl - (2005-05-25)
  [970] String duplication - x in Perl, * in Python and Ruby - (2006-12-07)
  [987] Ruby v Perl - interpollating variables - (2006-12-15)
  [1195] Regular Express Primer - (2007-05-20)
  [1608] Underlining in Perl and Python - the x and * operator in use - (2008-04-12)
  [1849] String matching in Perl with Regular Expressions - (2008-10-20)
  [1860] Seven new intermediate Perl examples - (2008-10-30)
  [2798] Perl - skip the classics and use regular expressions - (2010-06-08)
  [2816] Intelligent Matching in Perl - (2010-06-18)
  [2832] Are you learning Perl? Some more examples for you! - (2010-06-27)
  [2963] Removing the new line with chop or chomp in Perl - what is the difference? - (2010-09-21)
  [3005] Lots of ways of doing it in Perl - printing out answers - (2010-10-19)
  [3411] Single and double quotes strings in Perl - what is the difference? - (2011-08-30)
  [3548] Dark mornings, dog update, and Python and Lua courses before Christmas - (2011-12-10)
  [3770] Sample answers to training course exercises - available on our web site - (2012-06-21)




This is a page archived from The Horse's Mouth at http://www.wellho.net/horse/ - the diary and writings of Graham Ellis. Every attempt was made to provide current information at the time the page was written, but things do move forward in our business - new software releases, price changes, new techniques. Please check back via our main site for current courses, prices, versions, etc - any mention of a price in "The Horse's Mouth" cannot be taken as an offer to supply at that price.


© WELL HOUSE CONSULTANTS LTD., 2024: 48 Spa Road • Melksham, Wiltshire • United Kingdom • SN12 7NY
PH: 01144 1225 708225 • EMAIL: info@wellho.net • WEB: http://www.wellho.net • SKYPE: wellho

PAGE: http://www.wellho.net/mouth/3547_Usi ... ently.html • PAGE BUILT: Sun Oct 11 16:07:41 2020 • BUILD SYSTEM: JelliaJamb