Subject: Clustering, using Apache http server (version 2.2.14 in my example) with mod_proxy_balancer as the front load splitter and Apache Tomcat 6.0.20 as the replicated application engine. [[Tip should also work for other recent 2.2.x and 6.0.x versions]]
This is a follow-on article from Load balancing with sticky sessions (httpd / Tomcat), where I looked at sharing out the application work between a number of instances of Tomcat from an Apache http server (httpd) that did the bookkeeping. In a nutshell, the Apache http server sent new arrivals to a 'random' Tomcat, and then used sticky sessions so that - when a visitor came back for a subsequent visit in the same series of accesses - they would always talk to the same Tomcat and could continue their conversation with a server that had full knowledge of the position to date.
The balancer alone is a good solution as far as it goes but:
• What happens if the Tomcat that has been stuck to goes out of service?
• What happens if you have such a lot of traffic that you need to replicate your httpd front end?
• What happens if your httpd fails?
• What if you don't actually want to use sessions, but still need what appears to be a single Tomcat?
One option for addressing some of these is to use the clustering capability of Tomcat, which I'll describe below. But you should first consider whether you really need the extra step:
(a) can I accept that a session will be lost on the rare occasions that a Tomcat goes offline?
(b) is writing to a backend database going to preserve sufficient information anyway?
and if the answer to either is "yes", you probably do NOT need to cluster.
How does clustering work?
You run your web application on a series of identical (or rather "near identical" - the IP address will differ!) servers. With clustering turned on, the servers in the cluster find each other through a shared multicast address, and each one passes on any changes made to its sessions to every other cluster member listening on that address. So when a visitor comes back for his / her next access, all the machines know what's been going on and can knowledgeably handle the request, even if the original machine isn't available.
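To make the idea concrete, here is a toy model in plain Java of "every change goes to every member". It is only a sketch of the principle and not Tomcat's own code: the class ToyClusterNode and its methods are made up for this illustration, and the join() call stands in for the membership discovery that the real cluster does for itself.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of all-to-all session replication (hypothetical, not Tomcat's implementation). */
class ToyClusterNode {
    private final String name;
    private final List<ToyClusterNode> peers = new ArrayList<>();
    // session id -> (attribute name -> value)
    private final Map<String, Map<String, Object>> sessions = new HashMap<>();

    ToyClusterNode(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    /** Introduce two nodes to each other; stands in for membership discovery. */
    static void join(ToyClusterNode a, ToyClusterNode b) {
        a.peers.add(b);
        b.peers.add(a);
    }

    /** A request changes a session on this node; the change is copied to every peer. */
    void setAttribute(String sessionId, String key, Object value) {
        store(sessionId, key, value);
        for (ToyClusterNode peer : peers) {
            peer.store(sessionId, key, value);
        }
    }

    /** Any node can answer, because every node holds its own copy. */
    Object getAttribute(String sessionId, String key) {
        Map<String, Object> session = sessions.get(sessionId);
        return session == null ? null : session.get(key);
    }

    private void store(String sessionId, String key, Object value) {
        sessions.computeIfAbsent(sessionId, id -> new HashMap<>()).put(key, value);
    }

    public static void main(String[] args) {
        ToyClusterNode holt = new ToyClusterNode("holt");
        ToyClusterNode chippenham = new ToyClusterNode("chippenham");
        join(holt, chippenham);

        // First request is handled by one machine ...
        holt.setAttribute("S1", "drinks", 2);
        // ... and the next request can be answered by the other.
        System.out.println(chippenham.getName() + " sees drinks = "
                + chippenham.getAttribute("S1", "drinks"));
    }
}
```

The point to take away is that the second machine never saw the first request, yet it can answer the follow-up because it was handed a copy of the change.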
You can turn clustering on in Apache Tomcat 6.0.20 simply by uncommenting the line in the default server.xml file that relates to it:
<Cluster className = "org.apache.catalina.ha.tcp.SimpleTcpCluster"/>
and restarting your Tomcat. Older versions of Tomcat (such as 5.5) had a long configuration section listing the ports, replication time, IP addresses to use and trigger files, all of which are important but none of which actually needs to be changed from its default in the current release that's the target of this article.
Once you have turned clustering on (yes, it's now that simple), your machines will be communicating ... it's rather like starting a rumor in an office - before you know it, EVERYONE who's around has heard the rumor.
Clustering with the balancer
If you have already implemented balancing with sticky sessions (as covered in the preceding article), turning on clustering will cause the data to be shared around. Most of the time the data passed around will not be used - it will ONLY form a backup of the session, to be used if the balancer is unable to reach the sticky machine because it has gone down or been taken out of service.
With sticky sessions activated, even a second front-end Apache http server won't cause a switch from one Tomcat to another unless a fail-over occurs, as the jvmRoute is a part of the session id held in the cookie, so either (any) of the httpd front ends will correctly forward to the original Tomcat. And if you have an intelligent hardware load balancer, that too will be able to forward consistently and the clustering will remain merely as a backup.
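As an illustration of why any front end can find its way back to the right Tomcat, here is a minimal sketch of reading the route from a session id. It assumes the usual Tomcat form of the id followed by a dot and the jvmRoute; the class name SessionRoute, the method routeOf() and the sample ids are made up for the example, and this is not the balancer's own code.

```java
/** Sketch: pull the jvmRoute suffix off a Tomcat-style session id (hypothetical helper). */
class SessionRoute {

    /** Returns the text after the first '.', or null when the id carries no route. */
    static String routeOf(String sessionId) {
        if (sessionId == null) {
            return null;
        }
        int dot = sessionId.indexOf('.');
        if (dot < 0 || dot == sessionId.length() - 1) {
            return null; // no dot, or nothing after it
        }
        return sessionId.substring(dot + 1);
    }

    public static void main(String[] args) {
        // Sample ids are invented; "holt" plays the part of a jvmRoute.
        System.out.println(routeOf("0A1B2C3D4E5F.holt"));
        System.out.println(routeOf("0A1B2C3D4E5F"));
    }
}
```

Because the route travels with the visitor in the cookie, nothing needs to be remembered on the front end itself, which is why a second httpd makes the same choice as the first.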
If you disable sticky sessions on your balancer, the behaviour will change. Forwarding will now be shared across all of the Tomcats in the balanced group / cluster group (take care that all members of the balance group are included in the cluster!) and so the visitor may well get to a different back end box each time. But that's now perfectly fine, as they're sharing the data between them so will all know about the visitor.
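The difference between the two modes can be sketched with a toy balancer in Java. This is a simplification under stated assumptions: the class ToyBalancer is invented for the illustration, and it hands out members strictly in turn, whereas the real mod_proxy_balancer has its own scheduling methods and weightings.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy balancer (hypothetical): sticky when it can be, otherwise the next member in turn. */
class ToyBalancer {
    private final List<String> members;
    private final Set<String> outOfService = new HashSet<>();
    private final boolean sticky;
    private int next = 0;

    ToyBalancer(List<String> members, boolean sticky) {
        this.members = members;
        this.sticky = sticky;
    }

    void takeOutOfService(String member) {
        outOfService.add(member);
    }

    /** route is the jvmRoute taken from the visitor's session id, or null for a new arrival. */
    String choose(String route) {
        // Sticky mode: go back to the same Tomcat if it is still in service.
        if (sticky && route != null && members.contains(route) && !outOfService.contains(route)) {
            return route;
        }
        // Otherwise (new arrival, sticky off, or fail-over) take the next available member.
        for (int tried = 0; tried < members.size(); tried++) {
            String candidate = members.get(next);
            next = (next + 1) % members.size();
            if (!outOfService.contains(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("no Tomcat available");
    }

    public static void main(String[] args) {
        List<String> tomcats = Arrays.asList("holt", "chippenham");

        ToyBalancer stickyBalancer = new ToyBalancer(tomcats, true);
        System.out.println("sticky:      " + stickyBalancer.choose("chippenham"));
        stickyBalancer.takeOutOfService("chippenham");
        System.out.println("fail-over:   " + stickyBalancer.choose("chippenham"));

        ToyBalancer freeBalancer = new ToyBalancer(tomcats, false);
        System.out.println("sticky off:  " + freeBalancer.choose("holt")
                + " then " + freeBalancer.choose("holt"));
    }
}
```

In the fail-over and sticky-off cases the visitor lands on a machine that did not handle the earlier requests, which is exactly where the clustered copy of the session earns its keep.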
Testing if your cluster is working
Ironically, clustering and balancing are designed to be transparent, so how do you test whether they're working?
My first simple 'trick' is to change the background colour of the pages returned from each cluster member so that "if it's orange it must be Holt" and "if it's blue it must be Chippenham" (our servers are named after local towns and villages!). Going a little further, you can edit your servlet / JSP to return the name of the current host. In Java, the following line:
String myname = InetAddress.getLocalHost().getHostName();
will give you the local name of your computer, which you can then echo in the page.
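Note that getLocalHost() can throw an UnknownHostException, so the line needs a little wrapping before it will compile. A minimal sketch, in which the class name HostLabel and the fallback text "unknown-host" are my own choices for the example:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Sketch: a label for "which machine answered?" (hypothetical helper class). */
class HostLabel {

    /** Name of the machine this JVM runs on, or a placeholder if it cannot be resolved. */
    static String localHostName() {
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            return "unknown-host"; // fallback text chosen for this example
        }
    }

    public static void main(String[] args) {
        System.out.println("Served by: " + localHostName());
    }
}
```

In a servlet or JSP you would write the returned string into the response rather than printing it to the console.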
On last Tuesday's course, I took our sample "Barman" script that remembers how many drinks you've had in a session (visit counter!) and extended it into a "Pub Watch" script, where each of the barmen communicates with his colleagues in neighboring pubs to keep track of who's out on the town, and how much they have had to drink in each establishment.
If you click on the links in the previous paragraph, you can download the source code for "Barman" and "PubWatch" and try the code out for yourself. Using the balancer manager that I introduced at the end of yesterday's article, you can open and close individual pubs and see how their customers go elsewhere for their next drink, and you can turn sticky sessions off in the balancer and see how faithful customers will then hit the road and go to a different pub each time for their next drink.
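If you write your own test application along these lines, bear in mind that Tomcat's clustering documentation asks for session attributes to implement java.io.Serializable, and for the web application's web.xml to contain a <distributable/> element, before sessions are replicated. The sketch below is NOT the downloadable PubWatch source; PubTally and its methods are invented here simply to show a per-pub drinks count that survives being serialized and read back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical session attribute: how many drinks a visitor has had in each pub. */
class PubTally implements Serializable {
    private static final long serialVersionUID = 1L;

    private final Map<String, Integer> drinksByPub = new LinkedHashMap<>();

    /** Record one more drink served at the named pub. */
    void serve(String pub) {
        drinksByPub.merge(pub, 1, Integer::sum);
    }

    int drinksAt(String pub) {
        return drinksByPub.getOrDefault(pub, 0);
    }

    int total() {
        int sum = 0;
        for (int n : drinksByPub.values()) {
            sum += n;
        }
        return sum;
    }

    /** Serialize and read back: a quick check that the object could be copied to another JVM. */
    static PubTally roundTrip(PubTally tally) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(tally);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (PubTally) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("tally could not be serialized", e);
        }
    }

    public static void main(String[] args) {
        PubTally tally = new PubTally();
        tally.serve("holt");
        tally.serve("holt");
        tally.serve("chippenham");

        PubTally copy = roundTrip(tally);
        System.out.println("holt: " + copy.drinksAt("holt") + ", total: " + copy.total());
    }
}
```

A round trip like this is a cheap check to run on your own session objects before you start opening and closing pubs in the balancer manager.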
Some notes on clustering
1. The machines in the cluster communicate through multicast, so must be on the same subnet.
2. It's a good idea for the subnet you use to have plenty of capacity if your environment is busy, and for it to be firmly behind a strong firewall from your own company's general user traffic, let alone the Internet.
3. If you have multiple Tomcat clusters on the same subnet, you'll need to configure one of the clusters away from the default settings - otherwise they'll end up as being one big cluster (you'll find the word 'tribe' creeping in here!)
At present, we mention clustering on our public deploying Apache httpd and Tomcat course. Only a small proportion of our delegates want to go 'that far', and for newcomers who hadn't done any web server work when they first came along a couple of days earlier, it would be just too much for the one session.
An extra day on the end of a Tomcat course, coverage in a private course, or a special session set up for the purpose ... all are possible to help you learn how clustering and balancing work. We'll have a network of computers set aside at our training centre for the purpose of setting up a test case, experimenting with configurations, seeing what happens when machines are switched on and off. Something you wouldn't dare do with your own production environment, and might be reluctant to do even on your development or test networks (that's even assuming that you do HAVE multiple machines at the development or test level). (written 2009-10-30, updated 2009-11-11)