The VERY basics of a web page ... and web site
"Go look after this site - it's quite a small one". But even a small web site can have a lot of complexity these days, and that can often hide the basics.
When a user calls up a web page, he's asking the browser on his computer (typically Firefox, Chrome or Internet Explorer; maybe Safari or Opera) to contact a server computer (i.e. a computer running server software) and collect a page of HTML (HyperText Markup Language) for display. This page of HTML - in its most straightforward form - is held on the server computer as a file of plain text, and the server simply returns it to the browser when asked.
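The request-and-return cycle described above can be tried out on a single machine. What follows is only a minimal sketch using Python's standard library, not a description of how any production server is set up: the names fetch_page and QuietHandler, and the file name hello.html, are made up for this illustration. It writes a plain text file of HTML into a temporary directory, starts a small server on that directory, and then asks for the page much as a browser would.

```python
import functools
import http.client
import http.server
import pathlib
import tempfile
import threading


class QuietHandler(http.server.SimpleHTTPRequestHandler):
    """Serves plain files from a directory, without logging each request."""

    def log_message(self, format, *args):
        pass


def fetch_page(site_files, request_path):
    """Serve site_files (file name -> text) from a temporary directory,
    request request_path over HTTP, and return (status code, body text)."""
    with tempfile.TemporaryDirectory() as site_dir:
        for name, text in site_files.items():
            pathlib.Path(site_dir, name).write_text(text, encoding="utf-8")

        handler = functools.partial(QuietHandler, directory=site_dir)
        # Port 0 asks the operating system for any free port.
        server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
        port = server.server_address[1]
        threading.Thread(target=server.serve_forever, daemon=True).start()
        try:
            connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
            connection.request("GET", request_path)
            response = connection.getresponse()
            body = response.read().decode("utf-8", errors="replace")
            connection.close()
            return response.status, body
        finally:
            server.shutdown()
            server.server_close()


if __name__ == "__main__":
    # Hypothetical one-file web site, used only for this demonstration.
    files = {"hello.html": "<p>Hello from a plain text file</p>"}
    print(fetch_page(files, "/hello.html"))
    print(fetch_page(files, "/missing.html")[0])
```

Asking for /hello.html gets the file's text back unchanged, along with a status code of 200. Asking for a name that isn't in the directory gets a 404 status instead.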
What's in a page of HTML?
An HTML page is split into two main sections - the head, which contains information that the server wants to tell the browser about the page and its content, and the body, which holds the content that's to be displayed.
Elements within the page are marked up (described) with tags, which provide information to the browser about how to handle the data. These tags start with a < (less than) symbol and end with a > (greater than) symbol. There's an opening and a closing tag for each markup element, with the closing tag having a / in front of the same keyword as the opening tag. Let's put that together so far:
<body><p>A very simple web page</p></body>
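To see the opening and closing tags the way a program reading the page sees them, here is a small sketch using Python's standard html.parser module. The class name TagLogger and the function list_events are invented for this illustration. It simply records each opening tag, closing tag and piece of text in the order it meets them.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Records opening tags, closing tags and text in document order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("open", tag))

    def handle_endtag(self, tag):
        self.events.append(("close", tag))

    def handle_data(self, data):
        # Ignore runs of white space between tags.
        if data.strip():
            self.events.append(("text", data))


def list_events(html_text):
    """Return the list of (kind, value) events found in html_text."""
    parser = TagLogger()
    parser.feed(html_text)
    parser.close()
    return parser.events


if __name__ == "__main__":
    for event in list_events("<body><p>A very simple web page</p></body>"):
        print(event)
```

For the one-line page above this reports body opening, p opening, the text, p closing and body closing, which shows how the tags nest around the content.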
Some extra things you'll want to know early on
a) There's a shorthand you can use for a tag which opens and immediately closes - a single tag with a trailing slash, such as the <input ... /> tags in the form example further down this page.
b) Tags will often require extra information ("attributes") to describe how they're to work, and these are written before the >, in the form name="value", where the names are from a predefined list which is different for each type of tag, and the value is in quote marks. The src and align settings in the <img> example further down this page are attributes.
c) Where there's a requirement within the text for a character that means something special to the tag interpreter (e.g. <), you can use an ampersand (&) followed by a short abbreviation of a few letters and a semicolon. This also works for some special characters not in the standard set. Examples: &lt; for <, &gt; for >, &amp; for & and &pound; for £.
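If the text for a page is being produced by a program, the conversion in point (c) doesn't have to be done by hand. As a brief sketch, Python's standard html module has escape and unescape functions for the purpose. The wrapper names to_html_text and from_html_text are made up for this example.

```python
import html


def to_html_text(plain_text):
    """Replace characters that are special to HTML with ampersand codes."""
    return html.escape(plain_text)


def from_html_text(html_text):
    """Turn ampersand codes back into the characters they stand for."""
    return html.unescape(html_text)


if __name__ == "__main__":
    print(to_html_text("1 < 2 & 3 > 2"))
    print(from_html_text("Entry costs &pound;5 &amp; children go free"))
```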
How do I do Images, Links, and Forms?
All of these are done using tags. In the case of images, it's an <img> tag with an attribute to give the URL of where the image is to be loaded from, and that loading is a separate request back to the server. Links to other pages are done with an <a> tag with an attribute to give the URL that's to be linked to when the user clicks on the link.
Forms are written between a <form> and </form> tag pair, with attributes giving the URL of the page that's to process the data when it's submitted (and how that data is to be submitted too). Other tags are then used between the open and close tags to provide the various form elements such as input boxes, check boxes, text areas, selectors, etc.
<img src="http://www.welho.net/pix/closebell.jpg" align="right" />
<a href="http://www.wwuu.co.uk/museum.html">our museum</a>
<form action="http://www.wellho.net/net/quote.html"><input name="where" size="6" /><input type="submit" /></form>
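The attributes in those three examples can be pulled out by a program, which is roughly what a browser does before it goes off to fetch the image or follow the link. This is a minimal sketch with Python's standard html.parser module. The names AttributeCollector and collect_attributes are invented here, and the sketch only lists the attributes rather than acting on them.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Collects (tag name, attribute dictionary) for each tag with attributes."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        # Tags written with a trailing slash are reported here too.
        if attrs:
            self.found.append((tag, dict(attrs)))


def collect_attributes(html_text):
    """Return a list of (tag, attributes) pairs found in html_text."""
    parser = AttributeCollector()
    parser.feed(html_text)
    parser.close()
    return parser.found


if __name__ == "__main__":
    sample = (
        '<img src="http://www.welho.net/pix/closebell.jpg" align="right" />'
        '<a href="http://www.wwuu.co.uk/museum.html">our museum</a>'
        '<form action="http://www.wellho.net/net/quote.html">'
        '<input name="where" size="6" /><input type="submit" /></form>'
    )
    for tag, attributes in collect_attributes(sample):
        print(tag, attributes)
```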
How do I do bold, italic, coloured text, and so on?
Tags such as <b> to </b> were traditionally used to indicate bold, and there are hosts of others too. But a more modern alternative is to define a whole set of text display settings for a particular use of text in one go, as a style. These styles are defined in style sheets, with overall styles and settings defined first and then more specialised styles set for more specific uses.
Styles and style sheets expand beyond the look and feel of characters and character groups to the whole page layout - for which you would traditionally have used <table> tags, and on some occasions still will. Each of these is very much a topic in its own right!
Do I really need to know about all of these tags as I code?
Not really, for many sites these days. Tools such as content management systems let most web site content providers enter data at a higher level, to go within a standard (templated) look and feel on the web site, with software such as WordPress, Joomla or Drupal (or one of dozens more) doing most of the hard work. But it is useful for users to have something of an idea of what goes on.
What else should I know about (or at least be aware of!)?
• On the simplest of setups, a request for a page by name results in a file of that same name within the web site directory on the web server being returned and displayed. However, routing instructions can be provided on the web server to divert requests elsewhere, including to programs which grab the actual HTML from databases, or make up pages from templates and database records dynamically.
• If a browser calls up a directory / folder rather than an individual file, the server will return the "home page" for that directory - usually a file called index.html. If that home page does not exist, an error may be generated, or a directory listing displayed, depending on how the web server is configured.
• A very high proportion of traffic to your web site will be from automated programs such as search engines, which visit pages all around the web to index them and then (later on) send people to you if you match their searches. Such automated programs will follow all of your links to find all of your pages. If you want to advise the spiders / robots where they should NOT go, you can do so via a file called robots.txt which well-mannered spiders will check from time to time.
• The little icon that goes in the bookmark and location bar comes from the favicon.ico file if you have one.
• If the server can't provide what the browser has asked for because there's something wrong with the request, an error page is generated which is sent out with a special code to tell the browser what the type of error is. The most common of these is 404 - "file not found" - and others are in the 400 series. If a request reveals an error within the server, it will return a 500 series error page.
• If the web server sees certain patterns in the URL, it may be configured to run the file that it points to as a program and return the result of running that program. You'll see words like "cgi" (Common Gateway Interface) and .php used for such programs.
• A single web server may serve many different web sites, each with its own set of files and directories. When that's happening, each of the sites hosted is known as a "virtual server".
• Browser requests for pages / resources within a page won't always result in a page being sent out - if the request duplicates one made a little earlier, the browser may simply use a stored (cached) page, or the server may just tell the browser that it's not changed. If you get "cache headers" wrong, this can result in the display of outdated content.
• Away from the web site pages themselves, you should be aware of the following if you're looking after a site:
- the log files, which may record each access to each URL for you
- you should have good backups of your website, and remember to keep them up to date if you have user contributed content being added
- you might want to keep track of the number and pattern of accesses, and how many visitors the site can handle at once
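Going back to the robots.txt point in the list above, here is a small sketch of how a well-mannered spider might check the file before fetching a page, using Python's standard urllib.robotparser module. The robots.txt content, the directory names /private/ and /cgi-bin/, the robot name ExampleBot and the www.example.com addresses are all made up for this illustration. Your own file would list whatever parts of your own site you'd rather weren't crawled.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, asking all robots to keep out of
# two directories. The directory names are assumptions for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /cgi-bin/
"""


def may_fetch(robots_txt, user_agent, url):
    """Report whether robots_txt allows user_agent to request url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.example.com/index.html",
        "http://www.example.com/private/accounts.html",
    ):
        print(url, may_fetch(ROBOTS_TXT, "ExampleBot", url))
```

The file is advisory only. It tells robots where you'd prefer them not to go, but it does nothing to stop a badly behaved one, so it's no substitute for proper access control.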
If you're looking for information as to the most popular browsers, servers and content management systems, here are some links:
Content Management Systems: http://www.makeuseof.com/tag/10-popular-content-management-systems-online/
Servers: http://w3techs.com/blog/entry/most_popular_web_servers_by_country (written 2013-03-09, updated 2013-03-30)