In a web scenario, client to server traffic is usually carried over an http (HyperText Transfer Protocol) transport. That applies both from the browser to the public facing server and, in many applications, to onward transfers from the public facing server to other servers which provide content or run business logic.
But you'll note that I said "usually" - there are other transports that are available and used. The first group are those which transport the same data as http - specifically https and ajp. Let's start off by describing what's in http.
What is http?
An http request comprises a series of lines of data, each terminated by a new line (a carriage return and line feed pair). The first of these lines comprises the request method (such as GET or POST) followed by the name of the resource required (such as /index.html) followed by a protocol version (such as HTTP/1.1). Subsequent lines carry headers such as the name of the host being contacted, the referrer, cookies, the type of the browser, the preferred language, and a whole host more details. In HTTP/1.1 only the name of the host being contacted is required in these subsequent lines - the rest are conditional or optional. The header lines are followed by a blank line which indicates that the header block is complete. In the case of the POST method, that blank line is followed by the data that's associated with the request.
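To make that layout concrete, here's a minimal sketch in Python that builds a request by hand and then splits it apart again. The function names build_request and parse_request, the host www.example.com and the resource /signup.php are all made up for illustration - they're not part of any library or real site.

```python
# A sketch only: build_request and parse_request are hypothetical helpers.
def build_request(method, resource, host, headers=None, body=b""):
    """Assemble an HTTP/1.1 request: request line, headers, blank line, data."""
    lines = [f"{method} {resource} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        lines.append(f"Content-Length: {len(body)}")
    # Every line ends with CRLF; an empty line closes the header block.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def parse_request(raw):
    """Split raw request bytes into (method, resource, version, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    request_line, *header_lines = head.decode("ascii").split("\r\n")
    method, resource, version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, resource, version, headers, body


# Example host, resource and form data are assumptions for the demonstration.
request = build_request("POST", "/signup.php", "www.example.com",
                        {"Accept-Language": "en-gb"}, b"name=graham")
print(request.decode("ascii"))
```

Note how the only header the sketch insists on is Host, matching the HTTP/1.1 rule above, and how the POST data sits after the blank line.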
A server processes an http request and sends out a response. The response comprises a header block, a blank line, and (in most cases) a data block. The first line of the header includes a response code which indicates the success or otherwise of the request - a 3 digit number in the following ranges:
200 and up - success; expect good data to follow
300 and up - good request but only headers (no data). e.g. page has moved
400 and up - error in request. e.g. request was for missing page (404)
500 and up - error in handling request. e.g. program on server has syntax error
This line of the header block is followed by other headers telling the receiving system the content type (MIME type), which allows that receiving system to know whether to handle the data as HTML, as a JPEG image, etc. Then there's a blank line and the actual data.
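The response layout can be sketched the same way. This is a minimal Python example, not a full client: parse_response and classify_status are hypothetical names, and the sample response is hand written rather than captured from a real server.

```python
# A sketch only: both functions are hypothetical helpers for illustration.
def classify_status(code):
    """Map a 3 digit response code onto the ranges listed above."""
    if 200 <= code < 300:
        return "success"
    if 300 <= code < 400:
        return "good request but only headers"
    if 400 <= code < 500:
        return "error in request"
    if 500 <= code < 600:
        return "error in handling request"
    return "other"


def parse_response(raw):
    """Split raw response bytes into (code, headers, data)."""
    head, _, data = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("ascii").split("\r\n")
    # The first line looks like: HTTP/1.1 404 Not Found
    code = int(status_line.split(" ")[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return code, headers, data


# Hand written sample response (an assumption, not real server output).
sample = (b"HTTP/1.1 404 Not Found\r\n"
          b"Content-Type: text/html\r\n"
          b"\r\n"
          b"<html><body>No such page</body></html>")
code, headers, data = parse_response(sample)
print(code, classify_status(code), headers["content-type"])
```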
As there are often multiple requests made from the same client to the same server in quick succession (for example a web page will call up images), the connection often stays alive for a few seconds under HTTP/1.1.
See http protocol specification for further details
So what is https?
The https protocol carries the same information as http, but adds to it a Secure Sockets Layer (SSL, or its successor TLS). In other words, the data is encrypted at the client and decrypted at the server, and then the same happens in reverse. The purpose of this encryption is to ensure that stray data packets that are viewed along the way are no use to the person who has them - they're uninterpretable binary data.
The https scheme is quite complicated - it starts off with the client having to establish that it's really talking to the correct server (and not some other machine pretending to be the correct server!) and then goes on to agree with that server just how things will be uniquely encoded. The same keys can't be used for multiple connections between different systems, or individual security would be compromised.
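From a program's point of view most of that complexity is hidden in a library. As a rough sketch, here is how the two steps (check the server, then talk over the encrypted connection) might look using Python's standard ssl module. The function name fetch_over_https is my own invention, port 443 is the usual https port, and the function is defined but not called here, since calling it needs a live network connection.

```python
import socket
import ssl

# The default context checks the server's certificate and its host name,
# which is the "am I really talking to the correct server?" step.
context = ssl.create_default_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)


def fetch_over_https(host, resource="/"):
    """Hypothetical helper: send a plain http request through an encrypted socket."""
    request = (f"GET {resource} HTTP/1.1\r\n"
               f"Host: {host}\r\n"
               f"Connection: close\r\n\r\n").encode("ascii")
    with socket.create_connection((host, 443), timeout=10) as sock:
        # wrap_socket performs the handshake and agrees the keys for this connection.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            tls.sendall(request)
            chunks = []
            while True:
                chunk = tls.recv(4096)
                if not chunk:
                    break
                chunks.append(chunk)
    return b"".join(chunks)
```

The request that goes down the wrapped socket is exactly the same text as in plain http - only the transport underneath has changed.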
See https protocol - detailed description
How about AJP then? How does that compare to HTTP?
The http protocol is quite expensive in terms of bandwidth - it's an ASCII text protocol with words like "POST" and phrases like "Content-type:" taking up more bandwidth than is really needed, and having to be interpreted at the destination too. So the ajp protocol (Apache JServ Protocol) was established to allow for much less expensive exchanges between upstream and downstream servers that are to be closely linked.
ajp carries the same information as http but in a binary format. The request method - GET or POST - is reduced to a single byte, and the name of each of the common additional headers is reduced to 2 bytes - typically, that's about a fifth of the size of the http packet.
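Here's a small Python sketch of that saving. It encodes only the method and individual headers, not a complete ajp packet, and the numeric codes are the ones I recall from the AJP13 documentation - treat them as assumptions and check them against the specification before relying on them. The function names are hypothetical.

```python
import struct

# Codes as recalled from the AJP13 documentation (assumption: verify before use).
METHOD_CODES = {"OPTIONS": 1, "GET": 2, "HEAD": 3, "POST": 4}
HEADER_CODES = {
    "accept": 0xA001, "accept-language": 0xA004, "content-type": 0xA007,
    "content-length": 0xA008, "cookie": 0xA009, "host": 0xA00B,
    "referer": 0xA00D, "user-agent": 0xA00E,
}


def ajp_string(text):
    """A string on the wire: 2 byte length, the bytes, then a zero byte."""
    data = text.encode("ascii")
    return struct.pack(">H", len(data)) + data + b"\x00"


def encode_method(method):
    """The request method becomes a single byte."""
    return struct.pack("B", METHOD_CODES[method])


def encode_header(name, value):
    """A common header name becomes a 2 byte code; the value stays a string."""
    code = HEADER_CODES.get(name.lower())
    name_bytes = struct.pack(">H", code) if code is not None else ajp_string(name)
    return name_bytes + ajp_string(value)


text_form = b"Content-Type: text/html\r\n"
binary_form = encode_header("Content-Type", "text/html")
print(len(text_form), "bytes as http text,", len(binary_form), "bytes in this encoding")
```

The saving comes from the names rather than the values, so headers with short values shrink the most.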
See ajp protocol specification for further internal details.
Should I use http, https or ajp?
For most browser to server traffic, use http. If there's a need for security in the data (or if you're in doubt / customers may question the security), use https.
Between servers, http actually works very well - if you have an Apache httpd fronting a number of other servers (be they Apache httpd or Apache Tomcat), then there's nothing wrong with using the protocol at that layer too. Httpd's mod_proxy and mod_rewrite both allow for forwarding, and server languages such as PHP and Perl can make outgoing requests from the top tier server to other servers using http.
If you're looking to share the load between a number of second level (application) servers from a top level httpd server, mod_proxy_balancer, introduced in Apache httpd 2.2, provides you with the tools that you'll need, and mod_rewrite can also do a good load distribution job (although the distribution algorithm is simple). For programs running on the server, outgoing requests can be distributed programmatically.
One of the big issues of forwarding to a series of machines to balance the load is making sure that a series of linked pages and data entries called up by the same user are properly co-ordinated ("session continuity" it is called), and both mod_proxy_balancer and mod_rewrite provide the facility to support this. In the case of mod_proxy_balancer, it's a core feature. With mod_rewrite, it takes a clever configuration.
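The idea of session continuity can be sketched in a few lines of Python. This is an illustration of the principle only, not how mod_proxy_balancer or mod_rewrite do it internally: the class name StickyBalancer, the back end names app1:8080 and app2:8080, and the visitor names are all invented for the example.

```python
import itertools


class StickyBalancer:
    """Hypothetical sketch: round robin for new visitors, same server for returning ones."""

    def __init__(self, backends):
        self._next_backend = itertools.cycle(backends)
        self._sessions = {}

    def route(self, session_id=None):
        # A request with no session just takes the next server in turn.
        if session_id is None:
            return next(self._next_backend)
        # The first request of a session picks a server; later ones reuse it.
        if session_id not in self._sessions:
            self._sessions[session_id] = next(self._next_backend)
        return self._sessions[session_id]


balancer = StickyBalancer(["app1:8080", "app2:8080"])
for visitor in ["alice", "bob", "alice", "carol", "bob"]:
    print(visitor, "->", balancer.route(visitor))
```

In a real deployment the session identifier would typically come from a cookie or from the URL, and the mapping would need to cope with a back end server going away.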
If you have intensive / busy servers with bandwidth issues between them, use ajp as your linking protocol. The now-excellent mod_jk (available for you to build from the Jakarta project) provides an excellent use of the protocol, mod_proxy_ajp has been standard with the httpd distribution since Apache 2.2, and support in Tomcat is strong. Many commercial systems are using ajp as their transport, and some recent benchmarks I did showed it to be 25% faster than http. You should, though, remember that the transport is only a tiny part of most applications and so the savings are likely to be minimal on a real live system.
See protocol documents if you want to read further into this.
This is quite a long story, isn't it? If you're setting up multiple servers and sharing resources, you may want to learn the deployment and configuration details. We run several courses that may help you, where you get a chance to set up and try out the various options - see Deploying Apache httpd and Tomcat if you're linking the two servers, or Linux / Unix Web Server if you're configuring / linking multiple copies of httpd. We can also arrange specific private courses for groups, and / or short consultancy sessions. Contact me - graham@wellho.net to talk about your particular needs.
Other Protocols
To help complete the picture - protocols such as ftp and rmi transport different types of content, and xml, soap and the like are different layers. Again - I can cover that for you if needed!
See also:
Load balancing with mod_jk
Choosing between mod_proxy and mod_rewrite
(written 2008-02-22 21:59:15)