XML, HTML, XHTML and more

HTML is a language ... but XML is a metalanguage. In other words, you can write something in HTML and have it (quite) well defined, whereas anything you write in XML needs a further layer of definition to tell you what's valid and what isn't. XML is a set of over-arching rules within which you can define your own XML-compliant language ... or use one that someone else has already defined for you, such as RSS, SOAP, XSLT or XHTML.

It's been said - and it's usually the case - that if you define your data using HTML, then you're defining how it looks, whereas with XML you're defining what it is. For example:

HTML - says how it should appear:

<h1>Melksham Town Center</h1>
<ol><li>Woolworths, Boots, Peacocks and Iceland
<li>All the major banks
<li>Tourist Information Center and Post Office
<li>Bus to Bath, Devizes, Chippenham and Trowbridge
</ol>

XML - says what it is:

<place>Melksham Town Center
   <item>Boots</item> <item>Peacocks</item>
<banks><item>All the major banks</item></banks>
<general><item>Tourist Information Center</item>
   <item>Post Office</item></general>
<bus><item>Bath</item> <item>Devizes</item>
   <item>Chippenham</item> <item>Trowbridge</item></bus>
</place>

From that, you can tell how the HTML will be displayed, but you don't know how the XML will be used or displayed. There needs to be a tailored intermediate piece of software specifying that, and doing the work. You may come across:

SAX - The Simple API (Application Programming Interface) for XML (Extensible Markup Language)

Using SAX, a stream of XML data is passed through a process (scanned by a program) and the interesting bits that the program needs are collected as it goes. I've described this as pouring a lot of water through a sieve, and catching the bits that you want in the sieve. SAX is ideal for getting a few specific elements out of a very large flow of data, but it is exceedingly poor for reading XML in order to edit and re-save it.
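Here's a minimal sketch of the sieve idea in Python, using the standard library's xml.sax module. It assumes a well-formed version of the Melksham XML shown earlier, and the handler class name (BusDestinations) is just something I've made up for illustration. It catches only the bus destinations and lets everything else flow past.

```python
import xml.sax

# Well-formed version of the Melksham example from earlier in the article.
XML_DATA = """<place>Melksham Town Center
   <item>Boots</item> <item>Peacocks</item>
<banks><item>All the major banks</item></banks>
<general><item>Tourist Information Center</item>
   <item>Post Office</item></general>
<bus><item>Bath</item> <item>Devizes</item>
   <item>Chippenham</item> <item>Trowbridge</item></bus>
</place>"""


class BusDestinations(xml.sax.ContentHandler):
    """Hypothetical handler: keep the text of each <item> inside <bus>."""

    def __init__(self):
        super().__init__()
        self.in_bus = False
        self.in_item = False
        self.destinations = []

    def startElement(self, name, attrs):
        if name == "bus":
            self.in_bus = True
        elif name == "item" and self.in_bus:
            self.in_item = True
            self.destinations.append("")

    def characters(self, content):
        # Text may arrive in several chunks, so append rather than assign.
        if self.in_item:
            self.destinations[-1] += content

    def endElement(self, name):
        if name == "item":
            self.in_item = False
        elif name == "bus":
            self.in_bus = False


handler = BusDestinations()
xml.sax.parseString(XML_DATA.encode("utf-8"), handler)
print(handler.destinations)
```

Note that the handler never holds the whole document. It only remembers where it is (inside a bus element or not) and what it has caught so far, which is why this approach scales to very large inputs.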

DOM - The Document Object Model

In DOM, data is parsed into a structure in memory. An XML document is a series of tags ... so each of those becomes an array or list (depending on the programming language that you're using), and within each of those you have other tags which in turn become further arrays or lists within the first. Attributes - not touched on in this article - become hashes, dictionaries or associative arrays, and the text data is stored as strings in the arrays. So the document is translated from a file into something that can be held in memory and, with carefully written recursive code, manipulated very flexibly indeed. DOM is good for smaller data sets, and it's a great tool if you want to edit and save changes to your original XML. It's not going to work for you if you have an enormous XML file, because the whole document has to fit in memory.
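As a rough sketch of the DOM approach, here's the same data loaded with Python's standard xml.dom.minidom, walked recursively, edited and turned back into XML text. The extra destination ("Calne") and the helper name outline are my own inventions for the example, not part of the original data.

```python
from xml.dom import minidom

# Well-formed version of the Melksham example from earlier in the article.
XML_DATA = """<place>Melksham Town Center
   <item>Boots</item> <item>Peacocks</item>
<banks><item>All the major banks</item></banks>
<general><item>Tourist Information Center</item>
   <item>Post Office</item></general>
<bus><item>Bath</item> <item>Devizes</item>
   <item>Chippenham</item> <item>Trowbridge</item></bus>
</place>"""

# The whole document is parsed into a tree held in memory.
doc = minidom.parseString(XML_DATA)


def outline(node, depth=0):
    """Recursively list element names, indented by nesting depth."""
    lines = []
    if node.nodeType == node.ELEMENT_NODE:
        lines.append("  " * depth + node.tagName)
        for child in node.childNodes:
            lines.extend(outline(child, depth + 1))
    return lines


# Edit the tree: add a (hypothetical) extra bus destination.
bus = doc.getElementsByTagName("bus")[0]
new_item = doc.createElement("item")
new_item.appendChild(doc.createTextNode("Calne"))
bus.appendChild(new_item)

# Serialise the edited tree back to XML text, ready to be saved.
updated_xml = doc.documentElement.toxml()
print("\n".join(outline(doc.documentElement)))
print(updated_xml)
```

The edit-and-save step is only a couple of lines here, which is exactly what SAX makes awkward. The price is that the complete tree sits in memory at once.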

XSLT - Extensible Stylesheet Language Transformations

XSLT is a language which allows you to specify how your XML is to be transformed: you write templates that match elements of the source document, and an XSLT processor applies them to produce the output. You can write formatting information, tags, loops and all the other things you're used to in XSLT ... and the result of an XSLT transform is likely to be XHTML. Let's say you have 60 staff, with an XML file holding records for each of them, and you want to display the data in 3 different ways. Then you'll write 3 XSLT files to define the mappings, and the result will be that you can get any of your 60 x 3 (=180) possible displays. An XSLT stylesheet is itself a well-formed XML document.
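To illustrate that last point, here's a small sketch that treats a stylesheet as plain XML and reads it with Python's standard xml.etree.ElementTree. The stylesheet and its element names (staff, person, name) are hypothetical, made up for the staff-record example. Python's standard library has no XSLT processor, so this only inspects the stylesheet; actually applying it would need a separate processor such as xsltproc or the third-party lxml package.

```python
import xml.etree.ElementTree as ET

# The namespace that identifies XSLT elements.
XSLT_NS = "http://www.w3.org/1999/XSL/Transform"

# A hypothetical stylesheet: turn <staff><person><name>..</name></person></staff>
# into an XHTML bullet list of names.
STYLESHEET = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/staff">
    <ul>
      <xsl:for-each select="person">
        <li><xsl:value-of select="name"/></li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>"""

# Because the stylesheet is XML, any XML parser can read it.
root = ET.fromstring(STYLESHEET)

# List which parts of the source document each template applies to.
templates = root.findall(f"{{{XSLT_NS}}}template")
matches = [t.get("match") for t in templates]

print(root.tag)
print(matches)
```

For the three-displays example you'd have three such stylesheets, each with different templates, all run against the same staff XML file.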


Apache Cocoon is a system that allows you to take XML and transform it into different formats for different purposes - taking my "staff record" example again, I could set up Cocoon to give me PostScript files for printing, XHTML for display, selective XML for public release via a news feed, PDF for producing a flyer about the employee, and so on.

And how does XHTML fit into this?

XHTML is HTML with the additional rules of XML enforced - so that although you're laying out how things are to be displayed rather than what they are, you're also specifying that in a consistent form that's easy to edit with HTML editors and will cause fewer headaches as you view your page on different browsers - assuming you stick with standard tags!

Our example in XHTML:

<h1>Melksham Town Center</h1>
<ol><li>Woolworths, Boots, Peacocks and Iceland</li>
<li>All the major banks</li>
<li>Tourist Information Center and Post Office</li>
<li>Bus to Bath, Devizes, Chippenham and Trowbridge</li>
</ol>
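One practical consequence of XHTML following the XML rules is that any XML parser can check it for you. The sketch below, using Python's standard xml.etree.ElementTree, feeds a shortened copy of the list in both styles to the parser. The helper name is_well_formed is mine, and this checks well-formedness only, not validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET

# HTML style: list items are left unclosed.
HTML_STYLE = """<ol><li>Woolworths, Boots, Peacocks and Iceland
<li>All the major banks
</ol>"""

# XHTML style: every element that is opened is also closed.
XHTML_STYLE = """<ol><li>Woolworths, Boots, Peacocks and Iceland</li>
<li>All the major banks</li>
</ol>"""


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print("HTML style: ", is_well_formed(HTML_STYLE))
print("XHTML style:", is_well_formed(XHTML_STYLE))
```

The HTML-style fragment is rejected because the parser meets the closing ol tag while a list item is still open; the XHTML-style fragment parses cleanly.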

(written 2008-11-23)

This is a page archived from The Horse's Mouth at http://www.wellho.net/horse/ - the diary and writings of Graham Ellis. Every attempt was made to provide current information at the time the page was written, but things do move forward in our business - new software releases, price changes, new techniques. Please check back via our main site for current courses, prices, versions, etc - any mention of a price in "The Horse's Mouth" cannot be taken as an offer to supply at that price.

© WELL HOUSE CONSULTANTS LTD., 2021: 48 Spa Road • Melksham, Wiltshire • United Kingdom • SN12 7NY
PH: 01144 1225 708225 • EMAIL: info@wellho.net • WEB: http://www.wellho.net • SKYPE: wellho

PAGE: http://www.wellho.net/mouth/1901_XML ... -more.html • PAGE BUILT: Sun Oct 11 16:07:41 2020 • BUILD SYSTEM: JelliaJamb