Server Maintenance Part 1

Disclaimer: When it comes to my interest in *nix-based systems, I have (thanks to Gary S., our resident guru) become a BSD guy. I started off with Linux (Ubuntu, Arch, Fedora, SUSE, Gentoo, etc.; I tried them all), and when I started working here, he was managing our infrastructure and had it all on FreeBSD. Because I had no choice, I learned it and slowly learned to love it. Over the past year we’ve moved a lot of our company infrastructure systems over to CentOS for compatibility purposes (and that’s why I’m writing this), but a few other systems we use for various purposes (backups, development, my personal stuff) still run on FreeBSD. Now to the real post!

Since we’re now, like a lot of you, running CentOS on quite a few of our servers, I figured it’d be worthwhile to point out some software which can really ease your mind and keep your systems up to date and online. For this post I’ll focus on two services which have helped me sleep a lot better at night: Ksplice uptrack and Pingdom uptime monitoring.

Ksplice uptrack (http://www.ksplice.com/) is a service which allows you to keep your kernel 100% up to date on the latest patches without having to reboot the system. Typically on any server (Windows, Linux, or FreeBSD alike), you have to reboot in order to apply kernel updates. This affects your overall uptime and will cost you more money. Some of you are asking, “Why would it cost me money to reboot my server?” Answer: we sysadmins do not like burning the midnight oil for free. Reboots mean scheduled maintenance, which 9 times out of 10 means you’re doing it at a time which is convenient for your users and 100% inconvenient for your sysadmins. Get ready to pay them for their extra hours.

In a short, non-technical sense, their developers rewrite the updates so they can be applied on the fly. They have extensive documentation on how this works and it’s outside the scope of this post, but if you’re bored one day and feel like reading a long wall of text, enjoy: http://www.ksplice.com/doc/ksplice.pdf

So what’s the end result? A fully up to date system, patched against the latest known vulnerabilities, and the peace of mind of knowing you’ve got a team of full-time developers looking out for you. At this point I’m sure some of you are interested but worried that “well this software is really useful and complex, it must be hard to use!”, right? Wrong. You create an account, install a few programs (via your distro’s package manager), add your access key (provided by them) to your Ksplice configuration file, run two commands, and you’re done. You can then log in to your Ksplice uptrack account and view all your configured machines, which ones are up to date, which ones need to apply updates, a history of applied updates, and much more.
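If you’re rolling this out to more than a couple of machines, the “add your access key” step is easy to script. Here is a minimal Python sketch, assuming the configuration file is INI-style and stores the key as `accesskey` under an `[Auth]` section. The section name, the key name, and the path in the usage note are assumptions, so check them against the file your package actually installs. `set_access_key` is a hypothetical helper, not something Ksplice ships.

```python
from pathlib import Path


def set_access_key(conf_path, access_key):
    """Set 'accesskey' under [Auth] in an INI-style file.

    All other lines (comments, other sections) are written back unchanged.
    The [Auth] section and 'accesskey' name are assumptions about the file.
    """
    path = Path(conf_path)
    lines = path.read_text().splitlines() if path.exists() else []

    out = []
    in_auth = False  # True while we are inside the [Auth] section
    done = False     # True once the key has been written

    for line in lines:
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Leaving [Auth] without having seen the key: add it first.
            if in_auth and not done:
                out.append(f"accesskey = {access_key}")
                done = True
            in_auth = stripped.lower() == "[auth]"
        elif in_auth and stripped.split("=")[0].strip().lower() == "accesskey":
            # Replace the existing (placeholder or old) key in place.
            line = f"accesskey = {access_key}"
            done = True
        out.append(line)

    if not done:
        # Either [Auth] was the last section, or it did not exist at all.
        if not in_auth:
            out.append("[Auth]")
        out.append(f"accesskey = {access_key}")

    path.write_text("\n".join(out) + "\n")
```

On a real server you would call something like `set_access_key("/etc/uptrack/uptrack.conf", "YOUR_ACCESS_KEY")` as root (again, the path is an assumption). The file is edited line by line rather than through `configparser` so that any comments in it survive the rewrite.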

So what does this all cost? About 2 gallons of gas, or a sandwich at your local deli, or anything that costs close to $4. Ksplice is $3.95 per machine per month, and after 20 machines the pricing gets even better. Even the most financially strapped company should be able to find that in their IT budget 🙂
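If you want a number to hand to whoever signs off on the budget, the arithmetic is simple. This is a rough sketch using only the $3.95 per machine per month figure from above. The discounted rate past 20 machines isn’t stated here, so the hypothetical `monthly_cost` helper refuses to guess at it.

```python
from decimal import Decimal

PRICE_PER_MACHINE = Decimal("3.95")  # USD per machine per month, from the post
FLAT_RATE_LIMIT = 20                 # the post says pricing improves after 20 machines


def monthly_cost(machines):
    """Return the monthly cost in USD for up to FLAT_RATE_LIMIT machines."""
    if machines < 0:
        raise ValueError("machine count cannot be negative")
    if machines > FLAT_RATE_LIMIT:
        # The discounted tier is not given in the post, so do not invent one.
        raise ValueError("volume pricing above 20 machines is not stated here")
    return PRICE_PER_MACHINE * machines


if __name__ == "__main__":
    for count in (1, 5, 20):
        print(f"{count:>2} machines: ${monthly_cost(count)} per month")
```

`Decimal` is used instead of floats so the totals come out as exact dollars and cents.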

Secondly, there’s Pingdom uptime monitoring (http://pingdom.com/). Pingdom is a worldwide monitoring service which allows you to “check” your server for uptime at regular intervals from multiple locations all over the globe. If any of the locations sees the machine as down, Pingdom will automatically notify you (via email, SMS, Twitter, or iPhone).

They offer three plans: free, basic, and business. You can check them all out on their website, but the pricing is very reasonable for what you get. Once you’ve signed in to your account, you create your contacts, which are the people you want notified when a check is seen as down.

Once your contacts are configured, you create a check, which is essentially a device, a website, etc. that you want monitored. You’re given multiple check types to choose from: HTTP, HTTP Custom, TCP, DNS, Ping, UDP, SMTP, POP3, and IMAP. You then input the IP, URL, or service you wish to monitor, decide who should receive the notification, and choose how frequently you want the check to run. There is an additional options field which provides a more custom approach as well.
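To make the idea of a “check” concrete, here is a minimal Python sketch of what a single TCP check boils down to: try to open a connection within a timeout and report up or down. This is not Pingdom’s code or API. The `tcp_check` and `run_checks` names and the `example.com` targets are hypothetical, and a real service adds the multiple locations, scheduling, and notifications on top.

```python
import socket


def tcp_check(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


def run_checks(targets, timeout=5.0):
    """Check each (host, port) pair and return the ones that look down."""
    return [(host, port) for host, port in targets
            if not tcp_check(host, port, timeout)]


if __name__ == "__main__":
    # Placeholder targets; swap in the hosts and ports you care about.
    targets = [("example.com", 80), ("example.com", 443)]
    for host, port in run_checks(targets):
        print(f"DOWN: {host}:{port}")
```

Running something like this from cron on one box is better than nothing, but it only tells you what that one box can see. The multiple outside locations are what you’re really paying a monitoring service for.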

Any sysadmin knows that getting the dreaded notification that a server has failed brings a gut-wrenching feeling to the pit of your stomach. It’s even worse when it comes at 6 PM, right after you got home from an already exhausting day. Worse still is when it comes at 3 AM and wakes you up from your nice dreams of beaches in Belize. You know what the absolute worst is, though? When you don’t get notified at all. Pingdom ensures your sleep will be disturbed if one of your servers decides it wants some attention.

We’ve implemented both of these technologies in the past year, and I must say it’s been a great experience. We’re utilizing both to their full extent and they’ve really made life a whole lot easier for me. I’ve been woken up a few times now in the wee hours of the morning when we’ve had a system fail, and each time I feel good knowing everything is working as expected (once I get over the immediate frustration of something breaking).

Both Pingdom and Ksplice offer 30-day free trials, so give them a shot. You can stop both at any time, but I have a feeling that once you implement them, you’ll never look back. I’ll write a new post in a few weeks with some other software and habits to get into to keep your servers online and running in optimal shape!

– Brian