Storing your intermediate data - what format should you choose?

Many applications need to hold data at intermediate stages - to store it. What format should be used? ... HUGE subject.

1. If there is already an industry standard / draft standard way of doing it, think very carefully before going for anything else. The standard will have been designed with ease of use for the particular applications in mind, and the designers will already have considered pitfalls. And if you use a standard format, you're also likely to be able to work on that data with a lot of utility programs that others have written already. If that doesn't work for you ...

2. Do you need to edit it in situ and have lots of people potentially making changes to it at the same time? If so, some sort of database - SQL or NoSQL - would be worth looking at, especially if the data is heavily structured.

3. Is the data somewhat free format, in that you have various different fields in different records, without many records being complete? If this is the case and you want a readable, shareable file structure, then you might want to look at XML or JSON or some other key / value type format.
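
As a minimal sketch of point 3, assuming Python and its standard json module: records that each carry a different set of fields can be held as a list of key / value dictionaries and written out as readable JSON. The record contents and field names here are invented purely for illustration.

```python
import json

# Hypothetical records: each one holds only the fields known for it
records = [
    {"name": "alpha", "town": "Melksham", "rooms": 5},
    {"name": "beta", "phone": "not a real number"},
    {"name": "gamma"},
]

# Serialise to a readable, shareable text form
text = json.dumps(records, indent=2, sort_keys=True)

# Reading it back gives the same structure; absent fields are simply missing
loaded = json.loads(text)
for record in loaded:
    # .get() supplies a default where a record lacks the field
    print(record["name"], "-", record.get("town", "(town not recorded)"))
```

Nothing in the file has to say which fields exist; the code that reads it decides what to do when a key is absent.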

Data often takes the form of "records" which each have a number of fields in them ... each record of a similar format, and with a limited need to get back in and interactively edit the file. And reading sequentially from end to end may be fine. In which case:

4. Plain text file. Unless you're certain about the maximum size of every field on the line, I would suggest that these days you go for a file in which each field is separated by a "cardinal character" - in other words, a character which is special and cannot occur in the content of any field. Commonly used cardinal characters are space, tab and comma. I've also come across colon and semicolon.

If there's ANY chance of the cardinal character appearing in any field, you need to adjust the format. A typical "CSV" (Comma Separated Values) file allows for commas within a field, but any field that contains commas as data must then be surrounded by quotes, which in turn means that if you want fields to contain quotes, you need to do something about them. One way is to make backslash special - with \" meaning "I really want a " " and also \\ meaning "I really want a \". Another, used by RFC 4180 and by most spreadsheet programs, is to double the quote, so that "" inside a quoted field means a single ". It's usually much easier to use a tab character as separator, especially if it's never going to be contained in the data. This does mean you have to be careful if manually editing the file ...
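
To illustrate the quoting problem, here is a small sketch using Python's standard csv module (Python rather than Perl is an assumption of this example, and the rows are made up). Note that by default this module escapes a quote inside a quoted field by doubling it, not with a backslash. The same rows are written once with commas and once with tabs so that the two forms can be compared.

```python
import csv
import io

# Hypothetical rows: one field contains a comma, another contains quotes
rows = [
    ["id", "title", "note"],
    ["1", "Perl, PHP and Python", 'the "cardinal" character'],
]

# Comma separated: the csv module quotes the awkward fields for us
comma_buffer = io.StringIO(newline="")
csv.writer(comma_buffer).writerows(rows)
comma_text = comma_buffer.getvalue()

# Tab separated: the comma is now ordinary data, so that field needs no quotes
# (the field containing quote characters is still quoted by the module)
tab_buffer = io.StringIO(newline="")
csv.writer(tab_buffer, delimiter="\t").writerows(rows)
tab_text = tab_buffer.getvalue()

# Both forms read back to exactly the original fields
comma_rows = list(csv.reader(io.StringIO(comma_text, newline="")))
tab_rows = list(csv.reader(io.StringIO(tab_text, newline=""), delimiter="\t"))

print(comma_text)
print(tab_text)
```

Letting a library do the quoting on both the write and the read side is usually safer than splitting lines on the separator by hand.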

Further suggestions / notes:

a) If you possibly can, write the code that reads the file so that it ignores blank lines and lines that start with # characters. In Perl - something like this:
  next if ($lyne =~ /^\s*#/ or $lyne =~ /^\s*$/);
in your reading loop. That way, you can edit your data, space out groups of records, and add in comments to describe the format and make other points to anyone who comes along to read the data later on.

b) If you're going to have lots of data files of the same type, if you're going to keep the files for a while, or if you're going to pass the files on to others, it's a good idea to provide some sort of internal labelling about what the file is - don't rely on the name. This could be done simply by adding a comment line at the top of the file when you write it (see (a) just above) or you could add a separate header line or header block.

c) Where data integrity and completeness are of cardinal importance, you might want to consider "start of data" and "end of data" records to avoid any future problems with truncated data files.
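
Suggestions (a) to (c) can be pulled together in one reader. What follows is only a sketch in Python, under assumptions of my own: the records are tab separated, the label is a comment line at the top, and the words START and END on lines of their own act as the "start of data" and "end of data" records. The function name read_records and the sample data are hypothetical.

```python
def read_records(text):
    """Return the tab separated records held between START and END lines."""
    records = []
    state = "before"    # becomes "inside" after START and "done" after END
    for line in text.splitlines():
        # (a) skip blank lines and comment lines, like the Perl one-liner
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        marker = line.strip()
        if marker == "START" and state == "before":
            state = "inside"
        elif marker == "END" and state == "inside":
            state = "done"
        elif state == "inside":
            records.append(line.split("\t"))
    # (c) without both markers the file may well have been truncated
    if state != "done":
        raise ValueError("data incomplete: START and END records not both found")
    return records


sample = (
    "# demo-stock file, format version 1 (b: a hypothetical label)\n"
    "# fields: item, quantity\n"
    "\n"
    "START\n"
    "widget\t12\n"
    "\n"
    "# second group\n"
    "gadget\t7\n"
    "END\n"
)

print(read_records(sample))
```

A file cut short before its END line raises ValueError here instead of quietly returning a partial set of records.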

5. Options (1) through (4) won't deal with every scenario. There may be times that you'll store XML in a database, or that you'll go for fixed length records, binary encoding and all sorts of other things. You might want to write directly to spreadsheet files or produce .pdf documents or even graphics which contain your data within barcodes or QR codes ... you may decide on a folder/directory with a series of individual files therein, or you may package up lots of elemental files into a .zip / .jar file. Like I said at the start, huge subject, no single solution.
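
As one example from that list, here is a hedged sketch of packaging several elemental files into a single .zip, using Python's standard zipfile module. The file names and contents are invented, and the archive is held in memory only to keep the example self-contained; in practice you would pass a file name instead.

```python
import io
import zipfile

# Hypothetical elemental files: name -> text content
parts = {
    "readme.txt": "demo package, format version 1\n",
    "records.tsv": "widget\t12\ngadget\t7\n",
}

# Write every part into one zip archive (in memory for this sketch)
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w") as package:
    for name, content in parts.items():
        package.writestr(name, content)

# Read the package back, part by part
with zipfile.ZipFile(archive) as package:
    restored = {
        name: package.read(name).decode("utf-8")
        for name in package.namelist()
    }

print(sorted(restored))
```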

Very often on our programming courses, we'll look at customers' individual data requirements and help guide each customer through the start of the process to work out their various formats - after all, this comes very much at the start of the UML design process - "who provides the data, what's done with it, what are the results for whom".
(written 2012-11-20, updated 2012-11-24)

 



This is a page archived from The Horse's Mouth at http://www.wellho.net/horse/ - the diary and writings of Graham Ellis. Every attempt was made to provide current information at the time the page was written, but things do move forward in our business - new software releases, price changes, new techniques. Please check back via our main site for current courses, prices, versions, etc - any mention of a price in "The Horse's Mouth" cannot be taken as an offer to supply at that price.


© WELL HOUSE CONSULTANTS LTD., 2018: 404 The Spa • Melksham, Wiltshire • United Kingdom • SN12 6QL
PH: 01144 1225 708225 • EMAIL: info@wellho.net • WEB: http://www.wellho.net • SKYPE: wellho
