Loving programming in Python - and ready to teach YOU how

I like programming in Java, but I love programming in Python. It's been a real pleasure to get back to Python this morning. I'm teaching a private course in Cambridge this week, and a public Python course the following week. And here's a new example as I work my hand back in ...

Scenario - I need to read records from a whole folder of files and run a combined analysis of them. I'm looking at huge files - our server logs, which are between 40 Mbytes and 65 Mbytes per day - and analysing a month or more of them at the same time.

I've written a class called dirStream, and I pass the folder name for the files into its constructor. I then loop through the data returned by the stream, which can (optionally) be filtered to only those records that match a particular pattern. The example here also calls another method to get the file name and line number in that file where the record was found:

  source = dirStream("logs")
  for record in source.getRecord(lookfor):
    file,line = source.getWhere()
    print line,file,record
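
The post does not show the inside of dirStream, so what follows is only a minimal sketch of how such a class might be built, not the author's code. It is written in Python 3 syntax (the listings in this post use Python 2 print statements). The class name DirStreamSketch, its attributes, the plain substring test used for matching, and the demo helper are all assumptions made for illustration.

```python
import os
import tempfile


class DirStreamSketch:
    """Hypothetical stand-in for dirStream: yields lines from every file in a folder."""

    def __init__(self, folder):
        # The constructor does real work: it lists and sorts the files up front.
        self.files = sorted(
            os.path.join(folder, name)
            for name in os.listdir(folder)
            if os.path.isfile(os.path.join(folder, name))
        )
        self.current_file = ""
        self.current_line = -1

    def getRecord(self, lookfor=None):
        # Generator method: only one line is held in memory at a time.
        for path in self.files:
            self.current_file = path
            with open(path) as handle:
                for number, line in enumerate(handle, start=1):
                    self.current_line = number
                    # Assumption: "matching" is a simple substring test.
                    if lookfor is None or lookfor in line:
                        yield line.rstrip("\n")
        # Reset the position once every file has been read.
        self.current_file, self.current_line = "", -1

    def getWhere(self):
        # Returns two values as a tuple: file name and line number.
        return self.current_file, self.current_line


def demo():
    # Build a tiny throwaway folder so the sketch can be run anywhere.
    with tempfile.TemporaryDirectory() as folder:
        for name, text in [("ac_1", "Salisbury\nBath\n"), ("ac_2", "Bath\nSalisbury\n")]:
            with open(os.path.join(folder, name), "w") as handle:
                handle.write(text)
        source = DirStreamSketch(folder)
        found = []
        for record in source.getRecord("Salisbury"):
            file, line = source.getWhere()
            found.append((os.path.basename(file), line, record))
        return found, source.getWhere()
```

Because getRecord is a generator, a month of 40 to 65 Mbyte logs never has to sit in memory at once; the caller pulls one record at a time and can ask getWhere about the record it has just received.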


As this is my test harness, I've then exercised the other methods I've provided - firstly for a brief report:

  report = source.getStatus()
  for k in report.keys():
    print "{0:<20s} {1}".format(k,report[k])
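
To see what that format string does on its own, here is a small sketch in Python 3 syntax. The helper name format_status is hypothetical, and the two dict entries are simply copied from the sample run shown later in this post; the real getStatus presumably returns more keys.

```python
def format_status(report):
    # One line per entry: the key left-aligned in 20 columns, then the value.
    return ["{0:<20s} {1}".format(key, report[key]) for key in report]


# Two entries borrowed from the sample output in this post.
status = {"stream_status": "completed", "lines_matched_so_far": 942}
lines = format_status(status)
for line in lines:
    print(line)
```

Returning a dict means the caller gets every status value by name in one call, and new values can be added later without changing the method's signature.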



And then for a full report on the number of records and matches in each input file:

  for file_info in source.getReport():
    print "{1:8d} {2:8d} {0:s}".format(*file_info)
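
The "*" in that call unpacks the sequence into separate arguments for format. A short sketch in Python 3 syntax, assuming each file_info item is a (file name, lines read, lines matched) tuple, which is what the format string implies; the values are taken from the first line of the detailed output in this post.

```python
# Assumed shape of one getReport item: (file name, lines read, lines matched).
file_info = ("logs/ac_20150201", 156228, 27)

# format(*file_info) is the same as format(file_info[0], file_info[1], file_info[2]).
row = "{1:8d} {2:8d} {0:s}".format(*file_info)
same = "{1:8d} {2:8d} {0:s}".format(file_info[0], file_info[1], file_info[2])
print(row)
```

The numbered fields also let the output order differ from the tuple order, so the two counts are printed first and the file name last.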


Let's see that in action, searching for "Salisbury" references for the last 3 weeks:

  python dirStream.py Salisbury
  
  stream_status        completed
  current_file_name    
  current_line_number  -1
  searching_for        Salisbury
  lines_read_so_far    4011213
  lines_matched_so_far 942
  total_number_files   21
  searching            yes
  current_file_number  21
  
  And the detailed output
  
  156228       27 logs/ac_20150201
  161144       55 logs/ac_20150202
  190542       22 logs/ac_20150203
  227646       44 logs/ac_20150204
  221454       67 logs/ac_20150205
  202896       45 logs/ac_20150206
  198114      104 logs/ac_20150207
  175836       56 logs/ac_20150208
  170156       34 logs/ac_20150209
  202743       62 logs/ac_20150210
  190289       52 logs/ac_20150211
  190397       56 logs/ac_20150212
  207429       44 logs/ac_20150213
  251313       31 logs/ac_20150214
  165796       25 logs/ac_20150215
  168314       13 logs/ac_20150216
  194138       65 logs/ac_20150217
  181487       15 logs/ac_20150218
  187665       65 logs/ac_20150219
  185631       31 logs/ac_20150220
  181995       29 logs/ac_20150221


The complete example's source code is available to you, commented and wrapped so that you can use it too for this common "parse all the records in all the files in a directory" requirement.

Of note to delegates / learners - interesting Python things:

• Use of a generator within a method
• A constructor that does more than just store incoming values
• A state holder (this.status_mode)
• Optional parameters
• Use of a dict to return a whole series of named status values
• Use of "and" and "or" as a lazy "if" and "else"
• Passing in multiple values to a format method using "*" to expand a list
• Exception handling to cheaply pick up lack of command line selectors
• Use of os.path.join to add in the appropriate file / folder separator character for the current OS
• Conditional use of from to load extra code only if running the test programs
• A method that returns multiple values (a tuple)
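
Several of the smaller idioms in that list can be sketched in a few lines. This is an illustration in Python 3 syntax, not an extract from dirStream.py; the function names search_term, describe and log_path are hypothetical, and pprint is just a convenient module to import conditionally.

```python
import os
import sys


def search_term(argv):
    # Cheaper than checking len(argv) first: just try, and fall back on IndexError.
    try:
        return argv[1]
    except IndexError:
        return None


def describe(term):
    # "and" / "or" as a lazy if / else. Caveat: if the middle value were falsy
    # the "or" branch would win, so this only works with a truthy middle value.
    return term and "yes" or "no"


def log_path(folder, name):
    # os.path.join inserts the separator appropriate to the current OS.
    return os.path.join(folder, name)


if __name__ == "__main__":
    # Conditional import: pprint is only loaded when the file is run as a program.
    from pprint import pprint

    term = search_term(sys.argv)
    pprint({"searching_for": term, "searching": describe(term)})
```

In current Python the conditional expression ("yes" if term else "no") does the same job as the and / or form without the falsy-value trap, but the older idiom is still worth recognising when reading existing code.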

I think I said at the start - I love programming in Python
(written 2015-02-22)

 
This is a page archived from The Horse's Mouth at http://www.wellho.net/horse/ - the diary and writings of Graham Ellis. Every attempt was made to provide current information at the time the page was written, but things do move forward in our business - new software releases, price changes, new techniques. Please check back via our main site for current courses, prices, versions, etc - any mention of a price in "The Horse's Mouth" cannot be taken as an offer to supply at that price.


© WELL HOUSE CONSULTANTS LTD., 2021: 48 Spa Road • Melksham, Wiltshire • United Kingdom • SN12 7NY
PH: 01144 1225 708225 • EMAIL: info@wellho.net • WEB: http://www.wellho.net • SKYPE: wellho

PAGE: http://www.wellho.net/mouth/4438_Lov ... U-how.html • PAGE BUILT: Sun Oct 11 16:07:41 2020 • BUILD SYSTEM: JelliaJamb