Monday, January 25, 2010

Backup! Backup! Backup! And then backup your backups.

Eating right. Flossing. Exercising regularly. Vitamins. Saving for retirement. Drinking 8 glasses of water a day. Regular backups.

Pick up on a theme yet?

I currently use Hyper-V, and very much appreciate the flexibility and efficiency it provides, allowing me to host over 25 virtual servers for various clients and projects on just a few physical servers.

But, now that I have a veritable virtual server farm, I have the corresponding challenge of backing up all of those virtual servers.

This weekend a client had an issue with a custom GL Clearing Entry Integration that I developed for them. To test the issue, I tried closing the fiscal year in GP on my server to confirm how the clearing entries behave against a closed year. Well, let's just say that I didn't have a backup of the test database on my virtual server before I closed 2009. Oops. Classic mistake. (And a good time to plug Atul Gawande's new book: The Checklist Manifesto)

Here is some background on my backup regimen. I suspect it sounds neurotic, but if you've lost an entire hard drive or had that sinking feeling in your gut when you knew you lost critical client files or data, you'll understand.

First, I back up my project files to a zip file every day that I make changes. So SQL scripts, package files, VBA, Visual Studio projects, source files, and any project documents that I create on a virtual server get backed up to a date-time stamped zip file on that virtual server. But wait, there's more!
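As a rough illustration of the date-time stamped zip step, here is a minimal Python sketch. It is not the tool actually used for this regimen; the function name `backup_project`, the `name_YYYYMMDD_HHMMSS.zip` naming pattern, and the folder layout are all assumptions made for the example.

```python
import zipfile
from datetime import datetime
from pathlib import Path


def backup_project(project_dir, backup_dir, now=None):
    """Zip every file under project_dir into a date-time stamped archive.

    Hypothetical helper: keep backup_dir outside project_dir so the
    archive does not try to include itself.
    """
    project_dir = Path(project_dir).resolve()
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    # Assumed naming pattern: <folder name>_<YYYYMMDD>_<HHMMSS>.zip
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    archive = backup_dir / f"{project_dir.name}_{stamp}.zip"

    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(project_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the project folder.
                zf.write(path, path.relative_to(project_dir).as_posix())
    return archive
```

Because the time stamp is part of the file name, a second run on the same day produces a second archive instead of overwriting the first.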

That zip file then gets copied to my main workstation. This triggers an automatic backup to "the cloud" via both SugarSync and Carbonite. SugarSync, in addition to maintaining the last 5 versions of all of my files, also automatically synchronizes the files over to my laptop. But that's not all! Don't touch that dial!

I then have a scheduled batch file that uses RoboCopy to back up my workstation files to my 4TB TeraStation on a weekly basis. I stopped trying to count how many backups that makes.

That sounds fine, right? Very much a "belt and suspenders" approach ready for most natural disasters.

But that's the easy part. Those data files constitute less than 10GB of data.

So, what about all of those massive VHD files on my Hyper-V servers? If I forget to back up some SQL script or table or other database object, the VHD is my only backup.

For backing up VHDs, I currently lack an elegant solution. (Although I'm working on one.)

How in the world do you make regular backups of multiple VHD files that are anywhere from 15GB to 50GB each? The files do compress pretty well using WinRAR, down to 5GB to 10GB each, but it can take several hours to compress and copy each one.

I'm not willing to spend thousands of dollars on backup software or dedicated backup hardware. Those bring entirely new complexity and headaches to the party.

So, at the moment, my routine is to WinRAR a VHD file or two to my TeraStation in the evening. It takes anywhere from 1 to 4 hours, so it works when I remember to do it. But naturally I don't remember to do this every evening, and I don't currently track which VHDs I've backed up and when, or which ones need a fresh backup vs. those that have not changed in a few months.

Fortunately, for the Clearing Entry issue I mentioned earlier, I did have a backup of the VHD from December, which I intentionally made before we started the Clearing Entry import project. So I believe I'll be able to restore my GP databases from that older VHD.

But that VHD is over a month old, so clearly I got pretty lucky.

My idea at this point is to use the command line version of WinRAR (just called RAR) to create scheduled jobs that will back up the VHDs to my TeraStation each evening.
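To make the idea concrete, here is a small Python sketch that builds (but does not run) a RAR command line for one VHD. It assumes `rar` is on the PATH, and the function name `rar_command`, the paths, and the date-stamped archive name are hypothetical. The switches used are `a` (add to archive), `-m<n>` (compression level, 0 to 5), and `-ep` (store the file without its folder path).

```python
from datetime import date
from pathlib import PureWindowsPath


def rar_command(vhd_path, dest_dir, level=3, today=None):
    """Return the argument list for archiving one VHD with command line RAR.

    Hypothetical helper: the archive is named <vhd name>_<YYYYMMDD>.rar.
    """
    vhd = PureWindowsPath(vhd_path)
    stamp = (today or date.today()).strftime("%Y%m%d")
    archive = PureWindowsPath(dest_dir) / f"{vhd.stem}_{stamp}.rar"
    return ["rar", "a", f"-m{level}", "-ep", str(archive), str(vhd)]
```

A scheduled job could pass the returned list to `subprocess.run(cmd, check=True)`. A lower `level` trades a larger archive for a shorter run, which may matter when each file takes hours.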

A few tedious elements to this. Two that come to mind are:

1) I'll want to skip the backup if the VHD hasn't been changed in the last X days

2) I'll want to shut down the virtual server if it is running when the backup starts
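The first item can be sketched in a few lines of Python. This is only an illustration that uses the file's last-modified time as the "has it changed" signal, which is an assumption; the function name `changed_within` is made up for the example, and it does not address shutting down a running virtual server.

```python
import os
import time


def changed_within(path, days, now=None):
    """Return True if the file at path was modified in the last `days` days.

    Assumption: the file system's last-modified time is a good enough
    indicator that the VHD needs a fresh backup.
    """
    now = time.time() if now is None else now
    age_seconds = now - os.path.getmtime(path)
    return age_seconds <= days * 86400
```

A nightly job would then back up a VHD only when `changed_within(vhd, X)` is true, and skip it otherwise.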

I started on a basic command line script to automate the WinRAR process. Next I'll need to refresh my knowledge of the scripting required to detect the virtual server state and shut it down if it is running. I last did this with Virtual Server 2005 a few years ago, so I'll need to learn about any differences for Hyper-V. And then I'll need to figure out a schedule to coordinate the backup of my different VHDs so that only a few are done at a time, as the TeraStation is pretty slow for large file copies (don't get me started).
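One simple way to keep the nightly load small is a rotation: split the VHDs into small batches and work through one batch per night. The Python sketch below is just one possible approach, with hypothetical function names and a default of two VHDs per night chosen for illustration.

```python
def plan_nights(vhd_names, per_night=2):
    """Split the VHD names into batches of at most per_night each."""
    ordered = sorted(vhd_names)
    return [ordered[i:i + per_night] for i in range(0, len(ordered), per_night)]


def batch_for(day_number, batches):
    """Pick the batch for a given day, cycling through the batches in order."""
    if not batches:
        return []
    return batches[day_number % len(batches)]
```

Passing `datetime.date.today().toordinal()` as `day_number` advances the rotation by one batch each night, so every VHD comes up once per cycle.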

I know there are probably dozens of possible approaches, but I'll start with this to work out the kinks and learn. And ultimately I'll need to figure out a process to purge the VHD backups, since I already have 800GB of compressed VHD files on my TeraStation.
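For the purge step, one option is to keep only the newest few archives per VHD. The Python sketch below only lists the candidates for deletion and removes nothing. It assumes archives are named `<vhd name>_<YYYYMMDD>.rar`, which is a naming convention invented for this example, as is the function name `archives_to_purge`.

```python
from collections import defaultdict
from pathlib import Path


def archives_to_purge(backup_dir, keep=3):
    """List archives beyond the newest `keep` for each VHD name.

    Assumed file naming: <vhd name>_<YYYYMMDD>.rar. Files that do not
    match are ignored. Nothing is deleted here.
    """
    groups = defaultdict(list)
    for path in Path(backup_dir).glob("*.rar"):
        name, _, stamp = path.stem.rpartition("_")
        if name and stamp.isdigit():
            groups[name].append((stamp, path))

    doomed = []
    for items in groups.values():
        items.sort(reverse=True)  # newest date stamp first
        doomed.extend(path for _, path in items[keep:])
    return sorted(doomed)
```

Reviewing the returned list before deleting anything seems prudent, given that these archives may be the only copy of an older server state.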

Anyway, this is yet another lesson, and reminder, that you need to make backups, and then, ideally, make backups of your backups.

And with that, I leave you with a favorite computer geek joke.

1 comment:

Bruce Nelson said...

Good article. This certainly addresses the critical nature of backing up data, files, systems etc. As a Dynamics GP partner with a substantial install base of payroll users we also had to address the issue of redundancy. It's one thing to have a solid backup; it's another to be able to restore the entire system on short notice, like Payroll Day.

We offer our clients an on-site hardware solution that can be virtualized to run as the server should the equipment actually fail. It addresses backups, offsite storage, and disaster recovery without human intervention. It's unfortunate that some organizations don't understand the importance of backup and disaster recovery solutions until it's too late.

Here is a link with a little more detail on the solution.

Thanks for keeping the topic in front of your readers.

Bruce Nelson, President
Vertical Solutions