Subject: Clustering, using Apache http server (version 2.2.14 in my example) with mod_proxy_balancer as the front load splitter and Apache Tomcat 6.0.20 as the replicated application engine. [[Tip should also work for other recent 2.2.x and 6.0.x versions]]
This is a follow-on article from "Load balancing with sticky sessions (httpd / Tomcat)", where I looked at sharing out the application work between a number of instances of Tomcat from an Apache http server (httpd) that did the bookkeeping. In a nutshell, the Apache http server sent new arrivals to a 'random' Tomcat, and then used sticky sessions so that - when a visitor came back for their subsequent visit in the same series of accesses - they would always talk to the same Tomcat and could continue their conversation with the server having full knowledge of the position to date.
The balancer alone is a good solution as far as it goes but:
• What happens if the Tomcat that has been stuck to goes out of service?
• What happens if you have such a lot of traffic that you need to replicate your httpd front end?
• What happens if your httpd fails?
• What if you don't actually want to use sticky sessions, but still need what appears to be a single Tomcat?
One option for addressing some of these is to use the clustering capability of Tomcat, which I'll describe below. But you should first consider whether you really need the extra step:
(a) can I accept that a session will be lost on the rare occasions that a Tomcat goes offline?
(b) is writing to a backend database going to preserve sufficient information anyway?
and if the answer to either is "yes", you probably do NOT need to cluster.
How does clustering work?
You run your web application on a series of identical (or rather "near identical" - the IP address will differ!) servers. With clustering turned on, the servers find one another via multicast, and each of them passes any changes made to its session data on to every other cluster member listening on that same multicast address. So when a visitor comes back for his / her next access, all the machines know what's been going on and can knowledgeably handle the request, even if the original machine isn't available.
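To make the idea concrete, here is a toy model of "every member keeps a copy". It is only a sketch: ToyClusterNode is a made-up class, the session id and attribute name are invented, and plain method calls stand in for the network traffic that a real Tomcat cluster uses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: ToyClusterNode is a made-up class, and plain method calls
// stand in for the network traffic a real Tomcat cluster would use.
class ToyClusterNode {
    final String name;
    boolean up = true;
    private final List<ToyClusterNode> peers = new ArrayList<ToyClusterNode>();
    private final Map<String, Map<String, Object>> sessions =
            new HashMap<String, Map<String, Object>>();

    ToyClusterNode(String name) {
        this.name = name;
    }

    // Makes two nodes members of the same cluster (each learns of the other).
    static void join(ToyClusterNode a, ToyClusterNode b) {
        a.peers.add(b);
        b.peers.add(a);
    }

    // Records a session change locally, then passes it to every peer that is up.
    void setAttribute(String sessionId, String key, Object value) {
        store(sessionId, key, value);
        for (ToyClusterNode peer : peers) {
            if (peer.up) {
                peer.store(sessionId, key, value);
            }
        }
    }

    // Reads from this node's own copy of the session; null if not known here.
    Object getAttribute(String sessionId, String key) {
        Map<String, Object> session = sessions.get(sessionId);
        return session == null ? null : session.get(key);
    }

    private void store(String sessionId, String key, Object value) {
        Map<String, Object> session = sessions.get(sessionId);
        if (session == null) {
            session = new HashMap<String, Object>();
            sessions.put(sessionId, session);
        }
        session.put(key, value);
    }

    public static void main(String[] args) {
        ToyClusterNode holt = new ToyClusterNode("holt");
        ToyClusterNode chippenham = new ToyClusterNode("chippenham");
        join(holt, chippenham);

        holt.setAttribute("session-1", "drinks", 3); // visitor talks to holt
        holt.up = false;                              // holt goes out of service

        // chippenham already holds a copy, so it can carry on the conversation
        System.out.println(chippenham.getAttribute("session-1", "drinks"));
    }
}
```

The point of the model is simply that the copy is made at the time of the change, not at the time of the failure, which is why the surviving member can pick up where the other left off.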
You can turn clustering on in Apache Tomcat 6.0.20 simply by uncommenting the line in the default server.xml file that relates to it:
<Cluster className = "org.apache.catalina.ha.tcp.SimpleTcpCluster"/>
and restarting your Tomcat. Older versions of Tomcat (such as 5.5) had a long configuration section listing the ports, replication time, IP addresses to use and trigger files, all of which are important but none of which actually needs to be changed from default in the current release that's the target of this article. For an application's sessions to be replicated, its web.xml also needs to include the <distributable/> element.
Once you have turned clustering on (yes, it's now that simple), your machines will be communicating ... it's rather like starting a rumor in an office - before you know it, EVERYONE who's around has heard the rumor.
Clustering with the balancer
If you have already implemented balancing with sticky sessions (as covered in the preceding article), turning on clustering will cause the data to be shared around. Most of the time the data passed around will not be used - it will ONLY form a backup of the session, to be used if the balancer is unable to reach the sticky machine because it has gone down or been taken out of service.
With sticky sessions activated, even a second front-end Apache http server won't cause a switch from one Tomcat to another unless a fail-over occurs, as the jvmRoute is a part of the session cookie so either (any) of the httpd front ends will correctly forward to the original Tomcat. And if you have an intelligent hardware load balancer, that too will be able to forward consistently and the clustering will remain merely as a backup.
If you disable sticky sessions on your balancer, the picture will change. Forwarding will now be shared between all of the Tomcats in the balanced group / cluster group (take care that all members of the balance group are included in the cluster!) and so the visitor may get to a different back end box each time. But that's now perfectly fine, as they're sharing the data between them so will all know about the visitor's session.
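The forwarding decision can be sketched in a few lines of Java. This is an illustration of the logic only, not mod_proxy_balancer's code: StickyRouter and its methods are made-up names, the route names are taken from our server names, the session ids are invented, and simple round-robin stands in for whatever balancing method you have configured. It relies on the route being the part of the session id after the last dot.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: StickyRouter is a made-up class, not mod_proxy_balancer's code.
class StickyRouter {
    private int next = 0; // round-robin position used when stickiness does not apply

    // Returns the route suffix of a session id such as "A1B2C3.holt",
    // or null when the id carries no suffix.
    static String routeOf(String sessionId) {
        if (sessionId == null) {
            return null;
        }
        int dot = sessionId.lastIndexOf('.');
        if (dot < 0 || dot == sessionId.length() - 1) {
            return null;
        }
        return sessionId.substring(dot + 1);
    }

    // Picks a backend from those currently in service.
    String choose(String sessionId, List<String> inService, boolean sticky) {
        if (inService.isEmpty()) {
            throw new IllegalStateException("no backend in service");
        }
        String route = routeOf(sessionId);
        if (sticky && route != null && inService.contains(route)) {
            return route; // the visitor's original Tomcat is still available
        }
        // New visitor, stickiness off, or the original Tomcat is out of service:
        // take the next backend in turn.
        String chosen = inService.get(next % inService.size());
        next = (next + 1) % inService.size();
        return chosen;
    }

    public static void main(String[] args) {
        StickyRouter router = new StickyRouter();
        List<String> all = Arrays.asList("holt", "chippenham");

        // Sticky and available: stays with chippenham
        System.out.println(router.choose("A1B2C3.chippenham", all, true));
        // Sticky but chippenham is out of service: fails over to holt
        System.out.println(router.choose("A1B2C3.chippenham", Arrays.asList("holt"), true));
        // Stickiness off: the route in the session id is ignored
        System.out.println(router.choose("A1B2C3.chippenham", all, false));
    }
}
```

Without clustering, the fail-over and non-sticky cases would land the visitor on a Tomcat that has never heard of the session; with clustering, that Tomcat already holds a copy.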
Testing if your cluster is working
Ironically, clustering and balancing are designed to be transparent, so how do you test whether they're working?
My first simple 'trick' is to change the background colour of the pages returned from each cluster member so that "if it's orange it must be Holt" and "if it's blue it must be Chippenham" (our servers are named after local towns and villages!). Going a little further, you can edit your servlet / JSP to return the name of the current host. In Java, the following line:
String myname = InetAddress.getLocalHost().getHostName();
will give you the local name of your computer, which you can then echo in the page.
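Note that getLocalHost() throws a checked UnknownHostException, so the line needs wrapping before it will compile inside a servlet or JSP. A minimal sketch follows; HostLabel and the "unknown-host" fallback text are my own invention for the example.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper: returns a label identifying the machine that
// handled the request, for echoing into the page while testing.
class HostLabel {

    // Returns the local host name, or "unknown-host" if it cannot be resolved.
    static String localHostName() {
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            // Thrown when the machine's own name has no address mapping.
            return "unknown-host";
        }
    }

    public static void main(String[] args) {
        System.out.println("Served by: " + localHostName());
    }
}
```

Falling back to a fixed label rather than failing the request keeps the test page usable on a machine whose name doesn't resolve.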
On last Tuesday's course, I took our sample "Barman" script that remembers how many drinks you've had in a session (visit counter!) and extended it into a "Pub Watch" script, where each of the barmen communicates with his colleagues in neighbouring pubs to keep track of who's out on the town, and how much they have had to drink in each establishment.
If you click on the links in the previous paragraph, you can download the source code for "Barman" and "PubWatch" and try the code out for yourself. Using the balancer manager that I introduced at the end of yesterday's article, you can open and close individual pubs and see how their customers go elsewhere for their next drink, and you can turn sticky sessions off in the balancer and see how faithful customers will then hit the road and go to a different pub each time for their next drink.
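If you write your own variation, bear in mind that Tomcat's session replication relies on Java serialization, so anything you store in the session needs to implement java.io.Serializable. The sketch below is NOT the downloadable PubWatch source; DrinkTally and its methods are hypothetical names, and the serialize / deserialize round trip merely stands in for the copy that would be passed to another cluster member.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical session attribute: drinks served to one visitor, pub by pub.
class DrinkTally implements Serializable {
    private static final long serialVersionUID = 1L;

    // pub name -> drinks served there in this session
    private final Map<String, Integer> drinksByPub = new LinkedHashMap<String, Integer>();

    // Records one more drink at the named pub.
    void serve(String pub) {
        Integer soFar = drinksByPub.get(pub);
        drinksByPub.put(pub, soFar == null ? 1 : soFar + 1);
    }

    int drinksAt(String pub) {
        Integer count = drinksByPub.get(pub);
        return count == null ? 0 : count;
    }

    int total() {
        int sum = 0;
        for (int count : drinksByPub.values()) {
            sum += count;
        }
        return sum;
    }

    // Serializes and deserializes the tally, standing in for the copy that
    // would be handed to another cluster member.
    static DrinkTally copyViaSerialization(DrinkTally original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(original);
            out.close();
            ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            DrinkTally copy = (DrinkTally) in.readObject();
            in.close();
            return copy;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        DrinkTally tally = new DrinkTally();
        tally.serve("Holt");
        tally.serve("Holt");
        tally.serve("Chippenham");

        DrinkTally atNextPub = copyViaSerialization(tally);
        System.out.println("Holt: " + atNextPub.drinksAt("Holt")
                + ", total: " + atNextPub.total());
    }
}
```

A round trip like this is also a handy unit test for your own session objects: if it throws a NotSerializableException on your desk, it would have failed in the cluster too.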
Some notes on clustering
1. The machines in the cluster communicate through multicast, so must be on the same subnet.
2. It's a good idea for the subnet you use to have plenty of capacity if your environment is busy, and for it to be firmly behind a strong firewall from your own company's general user traffic, let alone the Internet.
3. If you have multiple Tomcat clusters on the same subnet, you'll need to configure one of the clusters away from the default settings - otherwise they'll end up as being one big cluster (you'll find the word 'tribe' creeping in here!)
At present, we mention clustering on our public Deploying Apache httpd and Tomcat course. Only a small proportion of our delegates want to go 'that far', and for newcomers who hadn't done any web server work when they first came along a couple of days earlier, it would be just too much for the one session.
An extra day on the end of a Tomcat course, coverage in a private course, or a special session set up for the purpose ... all are possible to help you learn how clustering and balancing work. We'll have a network of computers set aside at our training centre for the purpose of setting up a test case, experimenting with configurations, seeing what happens when machines are switched on and off. Something you wouldn't dare do with your own production environment, and might be reluctant to do even on your development or test networks (that's even assuming that you do HAVE multiple machines at the development or test level). (written 2009-10-30, updated 2009-11-11)