Exercises, examples and other material relating to training module P667. This topic is presented on the public course Perl for Larger Projects.
Perl for Larger Projects - Objects, huge data, SQL databases, XML, efficiency and other topics.
This advanced course takes the Perl programmer through
... http://www.wellho.net/course/plfull.html [course] |
During courses, questions arise. "I'll get back to that" could make people feel that I'm brushing something off ... except that I explain, early on, ... http://www.wellho.net/mouth/975_Answ ... tions.html [short article] |
PROCESSING LARGE QUANTITIES OF DATA
Data Monging is a term that has come to be used for processing quantities of ... http://www.wellho.net/solutions/perl-data-monging.html [longer article] |
Searching for warrior
These are complete results
Searched through 3904403 sites, matched 794 started at Fri Sep 19 16:07:18 2003
url: http://www.boxofficemojo.com/13thwarrior.html
... http://www.wellho.net/resources/ex.php4?item=p667/out.txt [code sample] |
Have you ever sat there and wondered "is this program nearly done ... is it still running ... how is it getting on" and wished you had a progress bar. ... http://www.wellho.net/mouth/1920_Pro ... -Perl.html [short article] |
If you're handling a huge amount of data (gigabytes!) in a Perl program, memory won't allow you to slurp it all into a list and you'll traverse the data ... http://www.wellho.net/mouth/1397_Per ... flows.html [short article] |
When I'm programming a log file analysis in Perl, I'll often "slurp" the whole file into a list which I can then traverse efficiently as many times as ... http://www.wellho.net/mouth/762_Huge ... lier-.html [short article] |
If you have so much data that it will not all fit into memory at once, you may not be able to use conventional programming techniques to complete your task. We call such a data set "huge data"; it is impractical to handle in some languages, but very practical in Perl. This module doesn't introduce many new language features; instead, it shows you how to use what you already know to handle huge data in practice.
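As a rough illustration of that idea, the sketch below reads a file one record at a time instead of slurping it into a list, and reports progress to STDERR as it goes. It is not the module's own huge1 or huge2 code; the function name count_matches, the reporting interval and the file name in the usage comment are all assumptions made for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count the lines that match a pattern without holding the file in memory.
# count_matches and $report_every are illustrative names, not from the course.
sub count_matches {
    my ($fh, $pattern, $report_every) = @_;
    $report_every ||= 100_000;
    my ($seen, $matched) = (0, 0);
    while (my $line = <$fh>) {          # only one record in memory at a time
        $seen++;
        $matched++ if $line =~ $pattern;
        # Feedback goes to STDERR so that it does not mix with the results
        print STDERR "\r$seen lines read, $matched matched"
            if $seen % $report_every == 0;
    }
    print STDERR "\n" if $seen >= $report_every;
    return ($seen, $matched);
}

# Usage: perl count.pl access.log   (script and file names are hypothetical)
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my ($seen, $matched) = count_matches($fh, qr/warrior/i);
    close $fh;
    print "Searched $seen lines, matched $matched\n";
}
```

Because the loop never keeps more than the current line, memory use stays flat however large the input is; the price is that each extra pass over the data means reading the file again.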
Examples from our training material
| behind | looking behind in huge data files |
| huge1 | A program to test handling a small part of a huge data set |
| huge2 | Providing user feedback while handling huge data |
| huge3 | Asking a long running application for intermediate reports |
| huge3.pid | Example of the huge.pid file |
| hugehunter | Long log file analysis, with progress and intermediate reporting |
| makedirs | Preprocessing a huge data file to set up indexes |
| makeindex | Generating a list of markers to a huge sorted data set |
| opt2 | Sorting and data filtering efficiency |
| opt3 | Improving sort efficiency |
| opt4 | Improving sort efficiency further - caching record analysis |
| optim | Optimising code to avoid repeating calculations |
| out.txt | Example of search results written to file |
| paws | Progress Bar Techniques |
| readtime | Efficiency - reading a file in large blocks |
| reg_opt | Regular expression match - inefficient example |
| reg_opt1 | Regular expression match - don't save $' $` and $& |
| reg_opt2 | Regular expression match - use of "o" modifier |
| reg_opt3 | Regular expression match - more specific and faster |
| reg_opt4 | Regular expression match - a start assertion speeds it up! |
| rt2 | Handling data in chunks - chunk overlap issue solved |
| site.pm | Class used in other examples in this module |
| useindex | Grab first ten sites on a topic area - QUICKLY via index |
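The sort entries in the table (opt3 and opt4) mention caching record analysis. One common way to do that in Perl is to analyse each record once, sort on the cached key, and then unwrap the records again. The sketch below shows the idea; the record format ("host bytes") and the function names are assumptions made for this example, and it is not the code of opt4 itself.

```perl
use strict;
use warnings;

# Hypothetical record format: "host bytes", for example "www.example.com 5120".
sub bytes_of {
    my ($record) = @_;
    my ($bytes) = $record =~ /\s(\d+)\s*$/;
    return defined $bytes ? $bytes : 0;
}

# Naive version: bytes_of runs twice for every comparison the sort makes.
sub sort_naive {
    return sort { bytes_of($a) <=> bytes_of($b) } @_;
}

# Cached version: analyse each record once, sort on the cached key,
# then throw the key away and return the original records.
sub sort_cached {
    return map  { $_->[1] }
           sort { $a->[0] <=> $b->[0] }
           map  { [ bytes_of($_), $_ ] } @_;
}

my @records = ("a.example 300", "b.example 20", "c.example 1000");
print "$_\n" for sort_cached(@records);
```

Both functions return the same order; the cached version simply calls the analysis routine once per record rather than once per comparison, which matters more as the analysis gets more expensive.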
Background information
Some modules are available for download as a sample of our material or under an Open Training Notes License for free download from http://www.training-notes.co.uk.
Topics covered in this module
Planning.
General techniques for large and huge data.
Code Optimization.
Regular Expressions.
Sorting.
Large Data.
Avoid loops.
Store data in memory.
Huge Data.
Hello HUGE world.
User feedback.
Controlling a long-running process.
Reading the data.
Arranging and storing the data.
Using a directory structure.
Indexing.
For Reference.
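To make the indexing topic a little more concrete, here is a minimal sketch of one approach: record the byte offset at which each group of records starts in a sorted file, then seek straight to that offset later rather than scanning from the top. It assumes the data is sorted and grouped by the first character of each line; make_index and first_records are names invented for this example and are not the module's makeindex or useindex programs.

```perl
use strict;
use warnings;

# Build an index: first letter of a record => byte offset of the first
# record starting with that letter. Assumes the data is sorted by key.
sub make_index {
    my ($fh) = @_;
    my %offset;
    seek($fh, 0, 0) or die "Cannot rewind: $!";
    my $pos = tell($fh);
    while (my $line = <$fh>) {
        my $letter = lc substr($line, 0, 1);
        $offset{$letter} = $pos unless exists $offset{$letter};
        $pos = tell($fh);               # start of the next record
    }
    return \%offset;
}

# Fetch up to $limit records starting with $letter, jumping via the index.
sub first_records {
    my ($fh, $index, $letter, $limit) = @_;
    $letter = lc $letter;
    return () unless exists $index->{$letter};
    seek($fh, $index->{$letter}, 0) or die "Cannot seek: $!";
    my @found;
    while (my $line = <$fh>) {
        last if lc substr($line, 0, 1) ne $letter;  # sorted, so we are done
        chomp $line;
        push @found, $line;
        last if @found >= $limit;
    }
    return @found;
}
```

In practice the index would probably be written to its own small file once, so that later runs can load it and seek directly into the huge data file without rereading it.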
Complete learning
If you are looking for a complete course and not just information on a single subject, visit our Listing and schedule page.
Well House Consultants specialise in training courses in Python, Perl, PHP, and MySQL. We run Private Courses throughout the UK (and beyond for longer courses), and Public Courses at our training centre in Melksham, Wiltshire, England.
It's surprisingly cost effective to come on our public courses - even if you live in a different country or continent to us.
We have a technical library of over 700 books on the subjects on which we teach. These books are available for reference at our training centre. Also available is the Opentalk Forum for discussion of technical questions.