Cross-referencing documents with uniqueness and inconsistency issues - PHP proof of concept demo

The cross-referencing of documents - where one document contains a human readable reference to another - is very common. And so is the requirement to convert those cross-references into appropriate links.

The Example

Let's say that I have chunks of data of up to 160k, relating to up to 4000 images, as output from some sort of search. Some of the data lines have reference codes ("see also" stuff) attached to them, and I want to come up with a table of the "see also"s. There can be several attached to a single line of data, and the formatting is inconsistent because it's manually updated. And a "see also" reference can occur a number of times. Sound like s**t loads of potential problems?

Here is some sample data:

train.jpg A train pulls in to Melksham Station
eiders.jpg Eider ducks (BIRD 15/025/35)
diffdir.jpg A view full of terns (BIRD 15/44.3/19)
pufffly.jpg Puffin in flight (BIRD 33/23/33)
puffnests.jpg Puffins nesting (BIRD 33/23/ 33)
gullery.jpg Gulls nesting at Vik (BIRD 44/32/123)
yern2.jpg A tern searches for food ( BIRD 15/44/19)
DSC08354a.jpg Santa at Melksham Station (PERS 43/22/12)
DSC08379.jpg Father Christmas at Melksham (PERS 43/022/12)
leveepics.jpg Lisa snaps up California (PERS 36/26/36)
avbr6.jpg Ascending a lower lock
newb_2.jpg Gypsy and Graham (ANIM 33/23/77) (PERS 12/73/17)
newb_3.jpg Gypsy (ANIM 33/23/77)
avbr4.jpg Lower down the locks at Caen Hill
avbr1.jpg Avebury Church


And I want to get a table of all the cross-referenced subjects (e.g. PERS 43/22/12 - Santa Claus) and all the data lines that each relates to.

The Solution

I have written a "proof of concept" program in PHP which analyses my data in the format above and produces a table of each reference, with a link to the image that it refers to.

The principle of how it works is as follows ...

The PHP program reads the data source record by record, and finds any references within each record. Each reference is canonicalised (reduced to a standard form for consistency) and stored into an associative array of results, so that we can spot non-unique references instantly and deal with them.
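A minimal sketch of that parsing stage might look like the following. It is not the source of the proof of concept program itself; the function names canonicalise_ref and collect_refs are hypothetical, and it assumes that the first field on each line is the image file name, that a reference is a bracketed category word followed by numbers separated by slashes, and that "standard form" means no stray spaces and no leading zeros. A value such as 44.3 is left alone, so it is treated as a different reference from 44.

```php
<?php
// Hypothetical helper: reduce one reference to a standard form,
// e.g. ("bird", "15/025/ 35") becomes "BIRD 15/25/35".
function canonicalise_ref(string $category, string $numbers): string
{
    $parts = [];
    foreach (explode('/', $numbers) as $part) {
        $part = ltrim(trim($part), '0');          // drop spaces, then leading zeros
        $parts[] = ($part === '') ? '0' : $part;  // a bare "0" stays as "0"
    }
    return strtoupper($category) . ' ' . implode('/', $parts);
}

// Hypothetical helper: build an associative array keyed on the canonical
// reference, each element holding the list of images that carry it.
function collect_refs(array $lines): array
{
    $refs = [];
    foreach ($lines as $line) {
        // Assumption: the first whitespace-delimited field is the image file.
        $fields = preg_split('/\s+/', trim($line), 2);
        if (count($fields) < 2) {
            continue;
        }
        [$image, $caption] = $fields;

        // "(", optional spaces, category letters, then digits, dots,
        // slashes and spaces up to the closing ")".
        $pattern = '/\(\s*([A-Za-z]+)\s+([\d.\/\s]+?)\s*\)/';
        if (!preg_match_all($pattern, $caption, $found, PREG_SET_ORDER)) {
            continue;                             // no references on this line
        }
        foreach ($found as $match) {
            $key = canonicalise_ref($match[1], $match[2]);
            $refs[$key][] = $image;               // same key means same reference
        }
    }
    return $refs;
}

$sample = [
    'pufffly.jpg Puffin in flight (BIRD 33/23/33)',
    'puffnests.jpg Puffins nesting (BIRD 33/23/ 33)',
    'DSC08354a.jpg Santa at Melksham Station (PERS 43/22/12)',
    'DSC08379.jpg Father Christmas at Melksham (PERS 43/022/12)',
    'newb_2.jpg Gypsy and Graham (ANIM 33/23/77) (PERS 12/73/17)',
    'avbr1.jpg Avebury Church',
];
$refs = collect_refs($sample);
print_r($refs);
```

Because the canonical string is used as the array key, the two puffin lines end up under a single key, and so do the two Santa lines, even though they were typed differently.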

Once we have parsed the whole data source, all we need to do is to sort the array of references, and loop through the output to display it.
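The sort and display step could be sketched as below. Again this is only an illustration under assumptions: render_ref_table is a hypothetical name, the small $refs array stands in for whatever the parsing stage produced, and the link target is assumed to be simply the image file name.

```php
<?php
// Hypothetical helper: turn reference => images into an HTML table,
// one row per reference, with a link to each image.
function render_ref_table(array $refs): string
{
    ksort($refs, SORT_STRING);                    // order rows by reference
    $html = "<table>\n";
    foreach ($refs as $ref => $images) {
        $links = [];
        foreach ($images as $image) {
            $safe = htmlspecialchars($image);     // escape before output
            $links[] = "<a href=\"$safe\">$safe</a>";
        }
        $html .= '<tr><td>' . htmlspecialchars($ref) . '</td><td>'
               . implode(', ', $links) . "</td></tr>\n";
    }
    return $html . "</table>\n";
}

// Stand-in for the array of references built while parsing.
$refs = [
    'PERS 43/22/12' => ['DSC08354a.jpg', 'DSC08379.jpg'],
    'BIRD 33/23/33' => ['pufffly.jpg', 'puffnests.jpg'],
    'ANIM 33/23/77' => ['newb_2.jpg', 'newb_3.jpg'],
];
echo render_ref_table($refs);
```

Note that ksort with SORT_STRING compares the keys as plain strings, so "BIRD 15/..." sorts before "BIRD 5/...". If numeric ordering within a category matters, a custom comparison passed to uksort would be needed instead.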

It sounds simple, and it IS simple ... but the first time you do something like this you have to be very careful to use a correct, efficient technique to deal with the uniqueness issues and the irregularity of manual input.

Full data file: here
Source code of example: here
Run the example: here

Come and learn about it on this course which runs here
(written 2009-05-10)

 




This is a page archived from The Horse's Mouth at http://www.wellho.net/horse/ - the diary and writings of Graham Ellis. Every attempt was made to provide current information at the time the page was written, but things do move forward in our business - new software releases, price changes, new techniques. Please check back via our main site for current courses, prices, versions, etc - any mention of a price in "The Horse's Mouth" cannot be taken as an offer to supply at that price.


© WELL HOUSE CONSULTANTS LTD., 2021: 48 Spa Road • Melksham, Wiltshire • United Kingdom • SN12 7NY
PH: 01144 1225 708225 • EMAIL: info@wellho.net • WEB: http://www.wellho.net • SKYPE: wellho

PAGE: http://www.wellho.net/mouth/2166_Cro ... -demo.html • PAGE BUILT: Sun Oct 11 16:07:41 2020 • BUILD SYSTEM: JelliaJamb