Saturday, September 3, 2016

Backing up terabytes of data off site

By Steve Endow

A few years ago I was backing up my files, photos, music, code, and massive Hyper-V VHDs to an old Windows file server.  That server finally started dying, so I bought a Synology DS1813+ with 12TB of usable storage (RAID 10) and transitioned all of my photos, music, video, system backups, and VHD backups to the new NAS.

TONS of space.  That'll take me a long time to fill up.  Famous last words.

In about a year, I had filled 80%.  Admittedly, I lack a coherent backup and archive retention strategy, so I could probably have freed up some space, but in using the Synology, I realized two things.

1. I have a pretty massive amount of data stored centrally on my NAS
2. I don't have a backup for my NAS

When I first set up the Synology, I tried backing it up to two external USB drives and rotating those weekly with two other external drives.  But swapping out drives every week gets really old.  And those 2TB drives filled up quickly, requiring yet another layer of storage to manage.

I wanted a better "archive" solution for the data on my NAS.

The obvious solution was... to buy another bigger NAS!

What better way to back up a NAS than to back it up to another NAS?  The logic is genius.

But there were a few benefits.  Not long after I purchased the DS1813+, Synology released the DS1815+.  One improvement was a significant increase in CPU performance.  While you don't typically shop for a NAS based on CPU performance, it really does matter.  RAID array builds, consistency checks, heavy traffic, multiple simultaneous tasks, or any backup file encryption can peg the CPU, so MOAR CPU power is a good thing.

As an aside, here is the resource utilization of both while the 1815 is backing up data to the 1813.



So I got the DS1815+ set up with 21TB of usable storage, which is still very spacious, even after transferring all of my data off the 1813+.  I then wiped my old DS1813+ and repartitioned it to provide 18TB of usable storage.  I installed the Synology HyperBackup and HyperBackup Vault on the two units and set up a job to back up data from the 1815 to the 1813.  That all went pretty smoothly.

The plan was then to set up the DS1813+ off site with a friend or family member at least 20 miles away, and configure the backup job to run over the internet.  The reason I went with this approach is that cloud backup solutions at that time were still moderately expensive for large amounts of storage.  I priced out the cloud backup providers that were supported by the Synology backup software, and if you needed to back up terabytes of data, the fees easily justified getting a second NAS.

Then summer came around and I was on vacation for a month, so things were put on hold.

Not too long after I got back from vacation, Synology released a new version of DSM that added support for Backblaze B2.  B2 is an online data storage service that is incredibly inexpensive--basically $5 per month per TB.  So I could selectively back up 5TB of data for just $25 a month.  And that data would be stored in a Backblaze data center, not at a friend's house or business.  Yes, there are some storage options that can compete with B2 on price, like Amazon Glacier, but I didn't want to deal with the caveats that come with that type of storage--even if they might be more economical long term.
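The pricing math above is simple enough to sketch in a few lines of Python.  This is only a rough estimate, assuming a flat rate of about $5 per TB per month as quoted above and ignoring any download or transaction fees; the function name and the sample sizes are just for illustration.

```python
# Assumption: a flat storage rate of about $5 per TB per month, as quoted
# in the post. Download and transaction fees are not modeled here.
RATE_PER_TB_MONTH = 5.00


def monthly_storage_cost(terabytes, rate=RATE_PER_TB_MONTH):
    """Return the estimated monthly storage charge in dollars."""
    if terabytes < 0:
        raise ValueError("terabytes must be non-negative")
    return terabytes * rate


# Illustrative sizes only: 1TB, the 5TB example, and a full 12TB NAS.
for tb in (1, 5, 12):
    print(f"{tb:>2} TB -> ${monthly_storage_cost(tb):.2f} per month")
```

Under that assumption, the 5TB selective backup works out to the $25 a month mentioned above.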

For some reason Synology only supports Backblaze through its Cloud Sync app, not through HyperBackup.

So I signed up with Backblaze B2, configured the Synology Cloud Sync app, and slowly added additional directories to the B2 backup job.  At the moment, I have about 1TB backed up to Backblaze, with all files encrypted by the Synology prior to upload.

I received an email from Backblaze for my monthly charges of a whopping $3.85.  Next month, if my usage is still 1TB, it should be about $5.
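Why $3.85 and not $5?  A likely explanation is that I added directories gradually, so I wasn't storing a full 1TB for the whole month.  Here is a quick back-of-the-envelope sketch, assuming the charge is proportional to the average amount stored over the billing period (that is my assumption about how the bill is computed, not something taken from the invoice).

```python
# Assumption: the monthly charge scales linearly with the average amount
# stored during the billing period, at about $5 per TB per month.
RATE_PER_TB_MONTH = 5.00


def implied_average_tb(charge, rate=RATE_PER_TB_MONTH):
    """Back out the average TB stored that would produce a given charge."""
    return charge / rate


avg_tb = implied_average_tb(3.85)
print(f"A $3.85 charge implies roughly {avg_tb:.2f} TB stored on average")
```

If that assumption holds, the first bill corresponds to a bit over three quarters of a terabyte on average, which fits with slowly ramping up to 1TB.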

I still have the DS1813+, so I may go ahead and set it up off site and share it with friends, allowing them to do off site backup, but for the moment, it looks like backing up to Backblaze should work for me as an inexpensive off site solution for several TB of data.

