Home Accessibility Courses Twitter The Mouth Facebook Resources Site Map About Us Contact
For 2021 - online Python 3 training - see ((here)).

Our plans were to retire in summer 2020 and see the world, but Coronavirus has lead us into a lot of lockdown programming in Python 3 and PHP 7.
We can now offer tailored online training - small groups, real tutors - works really well for groups of 4 to 14 delegates. Anywhere in the world; course language English.

Please ask about private 'maintenance' training for Python 2, Tcl, Perl, PHP, Lua, etc.
Data extraction and reporting from multiple files
File and Directory Handling example from a Well House Consultants training course
More on File and Directory Handling [link]

This example is described in the following article(s):
   • Tcl - Some example of HOW TO in handling data files and formats - [link]

Source code: rail Module: T209

# Scenario. We have a number of (published, government) tables of the number of tickets sold for
# journeys to and from stations on the main railway network of Great Britain on an annual basis, but
# the format of the file for each year varies - fields have been added and removed, and any numbers
# over 999 have commas added to split up the numbers which is great for the human reader, but causes
# programs that want to further analyse the data some problems. We also have a file that include the
# location of stations in Ordnance Survey grid terms, postcode, latitude and longitude, etc.
# Our objective is to produce a single output file, one record per station, that includes fields for
# the station's details at the start, and then a total of the number of entrances and exits year by
# year.

# Define functions used in main program

# Add to a list, or add NULL if value does not exist

proc nappend {list value} {
        upvar $list target

        set there 0
        if {$value != ""} {
                set updo "info exists $value"
                catch {set there [uplevel $updo]} errst

        if {$there} {
                upvar $value vn
                lappend target $vn
        } else {
                lappend target NULL

# Sort by postocde, moving all unpostcoded records to the end

proc pcs {a b} {
        global postcode
        global tlctable

        set rv 0
        if {[catch {set at $postcode($tlctable($a))} errst]} {return 1}
        if {[catch {set bt $postcode($tlctable($b))} errst]} {return -1}

        return [string compare $at $bt]

# -------- Main Program

# Definition of fields of interest in varying different annual file formats

set filespec {
        {{station_usage_0910.csv 2009 23} {four 0} {tlc 1} {name 2} {thisyear 21}}
        {{station_usage_0809.csv 2008 23} {four 0} {tlc 1} {name 2} {thisyear 21}}
        {{station_usage_0708.csv 2007 23} {four 0} {tlc 1} {name 2} {thisyear 16}}
        {{station_usage_0607.csv 2006 23} {four 0} {tlc 1} {name 2} {thisyear 16}}
        {{station_usage_0506.csv 2005 16} {four 4} {tlc 1} {name 0} {thisyear 13}}
        {{station_usage_0405.csv 2004 16} {four 4} {tlc 1} {name 0} {thisyear 13}}

foreach year $filespec {

# read the format for the year

        set yeardata [lindex $year 0]
        set wanted [lrange $year 1 end]

        set filename [lindex $yeardata 0]
        set yearnumber [lindex $yeardata 1]
        set minfields [lindex $yeardata 2]

        set fh [open $filename r]

# read each record for the year

        while 1 {

                gets $fh record
                if {$record == ""} break
                set fields [split $record "\t"]

# Eliminate short (header and / or damaged) records

                if {[llength $fields] < $minfields} continue

# extract the fields that are wanted from the right columns for this year

                foreach pair $wanted {
                        set f_name [lindex $pair 0]
                        set f_col [lindex $pair 1]
                        set $f_name [lindex $fields $f_col]

# Substitute - decomma
                regsub -all , $thisyear "" thisyear
                regsub -all {"} $four "" four

# Extract - dequote
                regexp {"([[:alpha:]]{3})"} $tlc whole tlc

                regsub -all \" $name "" name
                if {[regexp ^....$ $four]} {
                        # if {$four == "3333"} {puts "$tlc $four $yearnumber $thisyear $name"}

# Save the passenger count for this station and year

                        set station_annual("$four/$yearnumber") $thisyear

# First time for this station - save name and three letter code
                        if {[info exists nametable($four)]} continue
                        set nametable($four) $name
                        set tlctable($four) $tlc

# --- At this point we have read in the data for each year
# Now read in the ancilliary data for each station

set fh [open stationtables.tsv r]
while {[gets $fh line] >= 0} {
        set fields [split $line "\t"]
        set tlc [lindex $fields 0]

        set osgrid($tlc) [lindex $fields end]
        set postcode($tlc) [lindex $fields 2]
        set lat($tlc) [lindex $fields 3]
        set long($tlc) [lindex $fields 4]

# Generate output file; get station order first

set fho [open railstats.txt w]

set tlcs [array names tlctable]
set tlcs [lsort -command pcs $tlcs]

foreach station $tlcs {
        set tlc $tlctable($station)

# Initial fields on each line

        set line ""
        nappend line station
        nappend line tlc
        nappend line postcode($tlc)
        nappend line osgrid($tlc)
        nappend line lat($tlc)
        nappend line long($tlc)
        nappend line nametable($station)

# Add annual data

        foreach yearnumber {2004 2005 2006 2007 2008 2009} {
                nappend line station_annual("$station/$yearnumber")

# Fields in output record:

# 1 - 4 digit station code
# 2 - Three letter code for station
# 3 - Station's postcode
# 4 - OS grid reference for station
# 5 - Latitude of station
# 6 - Longitude of station
# 7 - Station name
# 8 - Ticket sales - April 2004 - March 2005 for joureys to / from this station
# 9 - Ticker sales, April 2005 to March 2006
# 10 - Ticker sales, April 2006 to March 2007
# 11 - Ticker sales, April 2005 to March 2008
# 12 - Ticker sales, April 2008 to March 2009
# 13 - Ticker sales, April 2009 to March 2010
# 14 - April 2010 to March 2011 (added via Perl, 29.3.2012)

# Turn into tab delimited line

        set newline [join $line "\t"]

# Output the record provided that there are no problems with it

        if {[regexp "^...$" $tlc]} {
                puts $fho $newline
        } else {
                puts $newline


Learn about this subject
This module and example are covered on the following public courses:
 * Learning to program in Tcl
 * Tcl Programming
 * Tcl Programming
Also available on on site courses for larger groups

Books covering this topic
Yes. We have over 700 books in our library. Books covering Tcl, Tk and Expect are listed here and when you've selected a relevant book we'll link you on to Amazon to order.

Other Examples
This example comes from our "File and Directory Handling" training module. You'll find a description of the topic and some other closely related examples on the "File and Directory Handling" module index page.

Full description of the source code
You can learn more about this example on the training courses listed on this page, on which you'll be given a full set of training notes.

Many other training modules are available for download (for limited use) from our download centre under an Open Training Notes License.

Other resources
• Our Solutions centre provides a number of longer technical articles.
• Our Opentalk forum archive provides a question and answer centre.
The Horse's mouth provides a daily tip or thought.
• Further resources are available via the resources centre.
• All of these resources can be searched through through our search engine
• And there's a global index here.

Purpose of this website
This is a sample program, class demonstration or answer from a training course. It's main purpose is to provide an after-course service to customers who have attended our public private or on site courses, but the examples are made generally available under conditions described below.

Web site author
This web site is written and maintained by Well House Consultants.

Conditions of use
Past attendees on our training courses are welcome to use individual examples in the course of their programming, but must check the examples they use to ensure that they are suitable for their job. Remember that some of our examples show you how not to do things - check in your notes. Well House Consultants take no responsibility for the suitability of these example programs to customer's needs.

This program is copyright Well House Consultants Ltd. You are forbidden from using it for running your own training courses without our prior written permission. See our page on courseware provision for more details.

Any of our images within this code may NOT be reused on a public URL without our prior permission. For Bona Fide personal use, we will often grant you permission provided that you provide a link back. Commercial use on a website will incur a license fee for each image used - details on request.

You can Add a comment or ranking to this page

© WELL HOUSE CONSULTANTS LTD., 2022: 48 Spa Road • Melksham, Wiltshire • United Kingdom • SN12 7NY
PH: 01144 1225 708225 • EMAIL: info@wellho.net • WEB: http://www.wellho.net • SKYPE: wellho

PAGE: http://www.wellho.net/resources/ex.php4 • PAGE BUILT: Sun Oct 11 14:50:09 2020 • BUILD SYSTEM: JelliaJamb