The technical "Buck" with our web sites stops with me - and any little admin issues that come up while I'm away on holiday need to be sorted - and of course Murphy's law states that a problem will always happen at the worst possible time. And so it's been with a couple of never-before-seen issues that have arisen during the past week - while we've been on the "Quantum of the Seas" travelling from Bayonne (New Jersey) across the Atlantic - as I write, we're approaching the Straits of Gibraltar.
I use our server log graphs as one of our tools to keep an eye on things, together with having each of our servers visiting each other from time to time, and regular (crontab) backup jobs which email me to confirm correct operation (or otherwise). Net result is that my email usually flags up problems within an hour, and I can then see what's up.
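The "no news is bad news" side of that monitoring can be sketched in a few lines of Python. This is an illustration only, not the scripts we actually run: the job names, the one hour interval and the 15 minute grace period are all assumptions, and the timestamps would in practice come from wherever the confirmation emails are recorded.

```python
from datetime import datetime, timedelta


def overdue_jobs(last_seen, now,
                 interval=timedelta(hours=1),
                 grace=timedelta(minutes=15)):
    """Return the names of jobs whose last confirmation is too old.

    last_seen maps a job name to the time of its most recent
    confirmation. A job is overdue once more than interval + grace
    has passed since that confirmation.
    """
    limit = interval + grace
    return sorted(name for name, seen in last_seen.items()
                  if now - seen > limit)


if __name__ == "__main__":
    # Hypothetical job names and times, for illustration only
    now = datetime(2015, 5, 9, 12, 0)
    last_seen = {
        "coffeeshop-db-backup": datetime(2015, 5, 9, 9, 5),
        "server-a-visits-b": datetime(2015, 5, 9, 11, 30),
    }
    for name in overdue_jobs(last_seen, now):
        print("No confirmation received recently from", name)
```

Run from cron on a machine that does not depend on the thing being watched, a check like this turns "the emails stopped arriving" into an alert in its own right.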
First NEW problem was the MySQL server quitting unexpectedly - see [here] for a full description of how that manifested itself. But that story has moved forward, with the problem recurring twice and the initial fix not working the second time - the error was solid. Taking a hunch and a hint from the "google hell" that this problem leads to, I wondered if we might have had some sort of memory leak and rebooted the server - and (touch wood) two days later that appears to have fixed the problem.
Second problem - yesterday morning - the regular hourly emails telling me of successful MySQL database backup for the fast-changing First Great Western Coffee Shop stopped arriving. A quick look at the site, and all seemed OK. Similarly a quick look at the web server, and all seemed to be in order. A report from one of my team that he'd had trouble getting images was followed up by another saying the problem had gone away after half an hour or so.
Our email uses a shared hosting service, where the administrators do a very good continual job of keeping spamassassin databases up to date - a look at our main server then showed spamassassin and procmail going through incoming emails up to a certain time, but then external emails drying up, with the only emails being received coming from the system itself. Odd. And I was unable to reach their admin server.
Modern technology is wonderful - in the middle of the Atlantic, a phone call to Hurricane Electric in California and - as ever - the phone is answered, without queuing, by a technical person who knows what he's about. And we look at our shared system, the files there and the logs - and he goes (brief hold) for a chat with his admins and tracks some of our emails at their gateways too (telling me all about the emails I should have been getting). I love this company - not the cheapest, but I have yet to find better customer support anywhere. And good that they knew what the problem was within 5 minutes (so the phone call was $$ per minute, I expect, but the overall cost to resolve low) ... problem turns out to be that we hadn't renewed the domain name, and our name registrar (THAT company will remain nameless) appears to have switched "send reminders" off on our account. Oops.
Long story cut short; I haven't a clue as to the password for our name registration account, so I resort once again to the phone, and an answer to a secret question. They have me by the "short and curlies" of course ... but at least the registration was back in place within one to two hours, and percolated through DNS caching subsequently. Lesson learned, diary note for the start of May 2024 that we need to renew again.
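A diary note is one safeguard; a small scheduled check that doesn't rely on the registrar's reminders is another. Here's a minimal sketch, assuming the expiry date is typed in by hand from the registrar's records (the dates below are purely illustrative, and the 30 day warning window is my own assumption):

```python
from datetime import date


def days_until_expiry(expiry, today):
    """Number of days from today until the expiry date (negative if past)."""
    return (expiry - today).days


def renewal_status(expiry, today, warn_days=30):
    """Classify a registration as 'expired', 'renew soon' or 'ok'."""
    remaining = days_until_expiry(expiry, today)
    if remaining < 0:
        return "expired"
    if remaining <= warn_days:
        return "renew soon"
    return "ok"


if __name__ == "__main__":
    # Illustrative dates only - not taken from any real registration
    expiry = date(2024, 5, 1)
    today = date(2024, 4, 15)
    print(days_until_expiry(expiry, today), renewal_status(expiry, today))
```

Hooked into the same cron and email route as the backup confirmations, the "renew soon" state would start nagging a month ahead rather than after the domain has stopped resolving.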
Now that I know what the problems were, and now that the system has bounced back, I can look at the log clues and learn from them ...
• The big spikes during the night are scheduled backups
• The drops to near-zero on 6th and 7th May were when the MySQL server was stopped - and that really shows just how much our site is database driven!
• The drop on the orange (9th) line relates to reduced traffic while images and other pages weren't being requested because the domain name wasn't being resolved for potential visitors.
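Spotting those drops by eye works once you know where to look; the same idea can be roughed out in code. This is a sketch only and not how our graphs are produced: it assumes you already have a list of hourly hit counts, and the "less than a quarter of the median" threshold is an arbitrary choice. Using the median rather than the mean stops the night-time backup spikes from dragging the threshold upwards.

```python
from statistics import median


def find_drops(hourly_hits, fraction=0.25):
    """Return the indexes of hours whose hit count is suspiciously low.

    An hour is flagged when its count is below fraction * median of
    all the counts supplied.
    """
    if not hourly_hits:
        return []
    threshold = median(hourly_hits) * fraction
    return [i for i, hits in enumerate(hourly_hits) if hits < threshold]


if __name__ == "__main__":
    # Made-up counts: two quiet hours and one backup spike
    hits = [400, 420, 390, 5, 8, 410, 1500, 405]
    print(find_drops(hits))
```

With the made-up figures above, hours 3 and 4 are flagged and the spike at hour 6 is left alone.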
Good to see the hard black line for today running very much as I would wish. Take a look at how we generate those graphs [here]
Looking back to our old (and very crude) log file size graph this morning (see [here]) I note a drop in log file size yesterday ... only to be expected:
Finally, looking at our Google Analytics for the past 10 days - firstly the Wellho.net site with the DNS lost, then the one that makes massive database use:
To a very great extent, these reports are shutting the door after the horse has bolted - they show what damage was done (or rather they give it scale) and they confirm at the next increment that the problem appears to be fixed - not only to us, but also to our worldwide web site visitors, and that's important. (written 2015-05-10)