Scenario. I have a lot of data containing large numbers of records which I want to separate into groups. For example, an incoming web server log file which I want to split out and process visitor by visitor.
Using Perl, I can loop through my data line by line and store it into a hash - for example:
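while ($lyne = <FH>) {     # read the log one line at a time
    $lyne =~ /\S+/;        # first non-space string: the visitor
    $all{$&} .= $lyne;     # append the record to that visitor's string
}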
• In a web server log file, the IP address / name of the visiting server is the first non-space string on the line
• It's perfectly valid to do a regular expression match outside a condition - if you're working with an automatically generated data file that does not need any validation, this is acceptable practice too
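• $& is the special variable that contains "the bit that matched" after a regular expression match in Perl. In older versions of Perl, using $& anywhere in a program means the matched string is copied on every match, so if the incoming string is massive and there are lots of matches in a tight loop, you *may* be a bit inefficient if you use $&; a capture group and $1 avoids that cost.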
• The .= operator adds on to the end of an existing string. If I had wanted a list of accesses (rather than a string containing them all), I could have pushed each record onto a list within the hash (but that would be at a later point in the course).
• An implicit reference to a variable such as the hash %all in my example will cause it to be created if it doesn't exist (the very first time through the loop), and each new element in that hash will similarly be created as necessary. In a longer program, declaring a local hash via my %all may be appropriate.
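The points above about $&, lists within the hash, and my can be sketched together as follows. This is a minimal sketch rather than the complete program from this article: the sub name group_by_visitor, the sample log lines and the in-memory file handle are all assumptions made for illustration. It uses a capture group ($1) in place of $&, and pushes each record onto a list per visitor instead of appending to a string.

```perl
use strict;
use warnings;

# Hypothetical helper: group log lines by their first non-space field.
# Returns a reference to a hash of lists, one list per visitor.
sub group_by_visitor {
    my ($fh) = @_;
    my %all;                             # lexical hash, created empty
    while (my $lyne = <$fh>) {
        next unless $lyne =~ /(\S+)/;    # skip blank lines; $1 is the visitor
        push @{ $all{$1} }, $lyne;       # list is created on the first push
    }
    return \%all;
}

# Made-up sample data, read through an in-memory file handle
my $sample = "10.0.0.1 GET /index.html\n"
           . "10.0.0.2 GET /about.html\n"
           . "10.0.0.1 GET /contact.html\n";
open(my $fh, '<', \$sample) or die "Cannot open in-memory file: $!";
my $groups = group_by_visitor($fh);
close $fh;

# Report how many records were gathered for each visitor
foreach my $visitor (sort keys %$groups) {
    print "$visitor made ", scalar @{ $groups->{$visitor} }, " request(s)\n";
}
```

Holding a list per visitor makes it easy to count, sort or filter the individual records later, at the cost of slightly more complex syntax than the simple string append.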
Using the code above, I then output each of the members of the hash. This groups the records by visiting client and, within each client, by date and time, since that's the order in which they are stored in the original file:
foreach $visitor (keys %all) {         # one pass per visiting client
    print $all{$visitor}, "\n";        # that client's records, then a blank line
}
• Note the extra \n. In Perl, you always need to think about your new lines. In this example, they're present on the records when read in and they are not removed with chop or chomp, so they are kept within the $all string as record delimiters. I've added the extra one in the output code just to provide a degree of separation between the blocks.
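To illustrate the new line point from the other direction: if the records had been chomped on the way in, the delimiters would have to be put back on output. The sketch below assumes made-up records and a hypothetical helper called format_block; it is not taken from the complete program.

```perl
use strict;
use warnings;

# Made-up records, as they would arrive from a log file (new line on each)
my @records = ("10.0.0.1 GET /a\n", "10.0.0.2 GET /b\n", "10.0.0.1 GET /c\n");

my %all;
foreach my $lyne (@records) {
    chomp $lyne;                          # remove the trailing new line ...
    next unless $lyne =~ /(\S+)/;         # first non-space string: the visitor
    push @{ $all{$1} }, $lyne;
}

# Hypothetical helper: rebuild one visitor's block of output
sub format_block {
    my ($recs) = @_;
    return join("\n", @$recs) . "\n\n";   # ... so it must be put back here
}

foreach my $visitor (sort keys %all) {
    print format_block($all{$visitor});
}
```

Either approach works; the point is to decide in one place whether the stored records carry their new lines, and to write the output code to match.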
The complete program that the snippets above are copied from is on our web site - [here].
(written 2012-08-15, updated 2012-08-18)