Off to a Good Start!
Entry 1614, on 2014-01-03 at 16:22:12 (Rating 1, Computers)
Well, 2014 is off to a good start for me... not! My main web server has become rather unstable and crashes randomly. This has resulted in a corrupted hard disk, which in turn resulted in a corrupted clone. Although the services are now running from the clone on a second machine, I do have some limitations and things are running at a slower speed.
If you read this blog on my main server (ojb.co.nz) you might have noticed some periods of down time and slow access recently. This might continue for another day or two.
I do have another backup (notice how IT pros always have more than one backup) which keeps multiple copies of files, including versions from before the corruption, so it is fully intact, but I would need to reconstruct the server from this by moving files, setting privileges, etc, which is a fairly significant job.
You might think this contradicts my common assertion that Macs are really reliable. Well, it does to some extent, I guess, but I do have to say that the last time this server was rebooted (for an update, not a problem) was 6 months ago, and it has performed perfectly for years. Also, it is an old machine with just 2.5 GB of RAM which I scavenged from the recycling for nothing, so I think it has done fairly well.
Unfortunately, machines of this vintage have a common fault where the main board becomes unreliable due to oxidation of solder joints, and I guess that's what has happened to my server. So to be safe I'll just put a new server in, but swapping everything over with minimum down-time is the difficult part of the process.
Still, I was wondering what I would do this weekend. Now I guess I know!
Comment 1 (3814) by OJB on 2014-01-04 at 21:04:26:
You are reading this comment from the new and improved machine. I managed to resurrect my server with minimum downtime, and this site (as well as the other sites and services on the server) is now running from a newer, faster machine.
Here's how I did it...
The old server had a second hard disk in it which was a clone of the main drive. It was cloned every night using a program called "Carbon Copy Cloner" (highly recommended). When the machine developed a fault and scrambled the main drive I pulled the clone out and booted my spare computer (a similar but older and slower machine) using it, so the web sites were then accessible again in a minute or two.
I then found another old server (similar to the one which failed, but faster than the spare), tested it, and installed a couple of fairly decent hard disks in it. I started this machine in target disk mode and connected it to the temporary server using FireWire. Next I cloned the working disk (which was itself a clone) from the temporary server (still running) onto the new machine's disk, using CCC again.
I then booted the new machine off-line and verified all of the databases. Then I added any data which had been lost in the brief interval after the clone (almost nothing). Finally I unplugged the network from the temporary server and plugged in the new one. Total down time: a few minutes until I discovered the original problem, and the second it took to unplug the temporary machine and plug in the new one.
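For anyone who wants an independent check that a clone really matches its source, here is a minimal Python sketch. To be clear, this is not how Carbon Copy Cloner works and it is not the verification described above; it is just a generic comparison of two folder trees using SHA-256 checksums. The function names are made up for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of one file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root: Path) -> dict[str, str]:
    """Map each regular file's path (relative to root) to its digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file() and not p.is_symlink()
    }


def compare_trees(source: Path, clone: Path) -> dict[str, list[str]]:
    """Report files missing from the clone, extra in it, or different."""
    src = tree_digests(source)
    dst = tree_digests(clone)
    return {
        "missing": sorted(set(src) - set(dst)),
        "extra": sorted(set(dst) - set(src)),
        "changed": sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k]),
    }
```

You would call it with the two folders to compare, for example compare_trees(Path("/Volumes/Temp/Sites"), Path("/Volumes/New/Sites")), where both paths are hypothetical. Three empty lists mean the two trees hold the same files with the same contents. Note that files which change while the server is live (logs, databases) will naturally show up as "changed".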
It was unfortunate that the original machine failed but I did recover from the problem fairly well. Computers: I both love them and hate them!