For 2023 (and 2024 ...) - we are now fully retired from IT training.
We have made many, many friends over 25 years of teaching about Python, Tcl, Perl, PHP, Lua, Java, C and C++ - and MySQL, Linux and Solaris/SunOS too. Our training notes are now very much out of date, but due to upward compatibility most of our examples remain operational and even relevant, and you are welcome to make use of them "as seen" and at your own risk.

Lisa and I (Graham) now live in what was our training centre in Melksham - happy to meet with former delegates here - but do check ahead before coming round. We are far from inactive - rather, enjoying the times that we are retired but still healthy enough in mind and body to be active!

I am also active in many other areas and still look after a lot of web sites - you can find an index (here).

Storing your intermediate data - what format should you choose?

Many applications require data to be held at intermediate stages - stored. What format should be used? ... HUGE subject.

1. If there is already an industry standard / draft standard way of doing it, think very carefully before going for anything else. The standard will have been designed with ease of use for the particular applications in mind, and the designers will already have considered pitfalls. And if you use a standard format, you're also likely to be able to make use of a lot of utility programs that others have already written to handle that data. If that doesn't work for you ...

2. Do you need to edit it in situ and have lots of people potentially making changes to it at the same time? If so, some sort of database - SQL or NoSQL - would be worth looking at, especially if the data is heavily structured.

3. Is the data somewhat free format, in that you have various different fields in different records, without many records being complete? If this is the case and you want a readable, shareable file structure then you might want to look at XML or JSON or some other key / value type format.
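
As a minimal sketch of point 3, assuming Python and its standard json module (the record contents and field names here are invented purely for illustration), records that each carry a different subset of fields can be held as a list of key / value objects:

```python
import json

# Hypothetical records: each one carries only the fields it actually has.
records = [
    {"name": "Widget", "price": 4.50, "colour": "red"},
    {"name": "Gadget", "weight_kg": 1.2},
    {"name": "Gizmo"},
]

# Serialise to readable, shareable text ...
text = json.dumps(records, indent=2)

# ... and read it back; a missing field is simply absent from its record,
# so the reader supplies a default when it asks for one.
loaded = json.loads(text)
colours = [rec.get("colour", "unknown") for rec in loaded]
print(colours)
```

The reading code decides what a missing field means, which is what makes this sort of format comfortable for incomplete records.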

Data often takes the form of "records" which each have a number of fields in them ... each record of a similar format, and with a limited need to get back in and interactively edit the file. And reading sequentially from end to end may be fine. In which case:

4. Plain text file. Unless you're certain about the maximum size of every field on the line, I would suggest that these days you go for a file in which each field is separated by a "cardinal character" - in other words, a character which is special and cannot occur in the content of any field. Commonly used cardinal characters are space, tab and comma. I've also come across colon and semicolon.

If there's ANY chance of the cardinal character appearing in any field, you need to adjust the format. A typical "CSV" (Comma Separated Values) file allows for commas within a field, but fields that contain commas as data must then be surrounded by quotes, which in turn means that if you want fields to contain quotes, you need to do something about them. One common way is to make backslash special - with \" meaning "I really want a " " and also \\ meaning "I really want a \"; another, used by many spreadsheet programs, is to double up the quote so that "" means a single " within a quoted field. It's usually much easier to use a tab character as separator, especially if it's never going to be contained in the data. This does mean you have to be careful if manually editing the file ...
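
To illustrate the quoting problem, here is a short sketch using Python's standard csv module (the sample row is invented). Note that this module's default dialect escapes an embedded quote by doubling it rather than with a backslash. The tab separated version needs no quoting at all, on the assumption that no field ever contains a tab.

```python
import csv
import io

# Hypothetical row: one field contains a comma, another contains quotes.
row = ["Ellis, Graham", 'the "Horse\'s Mouth"', "Melksham"]

# Comma separated: the writer must quote (and escape) the awkward fields.
comma_buf = io.StringIO()
csv.writer(comma_buf).writerow(row)
comma_line = comma_buf.getvalue()

# Tab separated: no field contains a tab, so nothing needs quoting.
tab_line = "\t".join(row) + "\n"

# Both forms read back to the original fields.
from_comma = next(csv.reader(io.StringIO(comma_line)))
from_tab = tab_line.rstrip("\n").split("\t")
print(comma_line, tab_line, sep="")
```

The comma separated line has to be read with a parser that understands the quoting rules, whereas the tab separated line can be taken apart with a plain split.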

Further suggestions / notes:

a) If you possibly can, write the code that reads the file so that it ignores blank lines, and lines that start with # characters. In Perl - something like this:
  next if ($lyne =~ /^\s*#/ or $lyne =~ /^\s*$/);
in your reading loop. That way, you can edit your data, space out groups of records, and add in comments to describe the format and make other points to anyone who comes along to read the data later on.

b) If you're going to have lots of data files of the same type, if you're going to keep the files for a while, if you're going to pass the files on to others, it's a good idea to provide some sort of internal labelling about what the file is - don't rely on the name. This could be done simply by adding a comment line at the top of the file when you write it (see (a) just above) or you could add a separate header line or header block.

c) Where data integrity and completeness are of cardinal importance, you might want to consider "start of data" and "end of data" records to avoid any future problems with truncated data files.
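
Suggestions (a) to (c) can be combined in one short sketch. This is written in Python rather than Perl, and the marker strings, function names and sample data are all assumptions made up for the illustration, not part of any standard format:

```python
import io

COMMENT = "#"
START_MARK = "START_OF_DATA"   # hypothetical marker text
END_MARK = "END_OF_DATA"       # hypothetical marker text


def write_records(out, records, description):
    """Write tab separated records with a label and start / end markers."""
    out.write(f"{COMMENT} {description}\n")
    out.write(f"{START_MARK}\n")
    for fields in records:
        out.write("\t".join(fields) + "\n")
    out.write(f"{END_MARK}\n")


def read_records(source):
    """Return the records, skipping blank and comment lines.

    Raises ValueError if either marker is missing (e.g. a truncated file).
    """
    records = []
    started = ended = False
    for line in source:
        stripped = line.strip()
        if not stripped or stripped.startswith(COMMENT):
            continue  # same effect as the Perl test in (a)
        if stripped == START_MARK:
            started = True
        elif stripped == END_MARK:
            ended = True
            break
        elif started:
            records.append(line.rstrip("\n").split("\t"))
    if not (started and ended):
        raise ValueError("data file is incomplete: marker missing")
    return records


buf = io.StringIO()
write_records(buf, [["Melksham", "SN12"], ["Bath", "BA1"]], "town / postcode list")
complete = buf.getvalue()
print(read_records(io.StringIO(complete)))

# Simulate a file that was cut short before the end marker was written.
truncated = complete[: complete.index(END_MARK)]
try:
    read_records(io.StringIO(truncated))
except ValueError as err:
    print("rejected:", err)
```

The reader refuses a file with no end marker rather than quietly returning a partial set of records, which is the point of (c).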

5. Options (1) through (4) won't deal with every scenario. There may be times that you'll store XML in a database, that you'll go for fixed length records, binary encoding and all sorts of other things. You might want to write directly to spreadsheet files or produce .pdf documents or even graphics which contain your data within barcodes or QR codes ... you may decide on a folder/directory with a series of individual files therein, or you may package up lots of elemental files into a .zip / .jar file. Like I said at the start, huge subject, no single solution.

Very often on our programming courses, we'll look at a customer's individual data requirements and help guide that customer through the start of the process to work out their various formats - after all, this comes very much at the start of the UML design process - "who provides the data, what's done with it, what are the results for whom".
(written 2012-11-20, updated 2012-11-24)

 



This is a page archived from The Horse's Mouth at http://www.wellho.net/horse/ - the diary and writings of Graham Ellis. Every attempt was made to provide current information at the time the page was written, but things do move forward in our business - new software releases, price changes, new techniques. Please check back via our main site for current courses, prices, versions, etc - any mention of a price in "The Horse's Mouth" cannot be taken as an offer to supply at that price.


© WELL HOUSE CONSULTANTS LTD., 2024: 48 Spa Road • Melksham, Wiltshire • United Kingdom • SN12 7NY
PH: 01144 1225 708225 • EMAIL: info@wellho.net • WEB: http://www.wellho.net • SKYPE: wellho

PAGE: http://www.wellho.net/mouth/3928_Sto ... oose-.html • PAGE BUILT: Sun Oct 11 16:07:41 2020 • BUILD SYSTEM: JelliaJamb