Recently, I’ve blogged about full-disk image backup of Windows using Clonezilla. I’ve also blogged about the excellent “trust no one” (thanks, Steve Gibson, for this term) CloudBerry file-oriented backup utility.
These things help you create a backup. But where should you store the backup? How secure is that backup if you store it in the cloud? And what does it cost to keep the backup in the cloud for the long term?
Costs can grow rapidly because stuff just piles up. I mean, who ever goes back and prunes backups to remove dross they don’t need or want? So if you are paying for storage by the gigabyte, eventually you start spending real money.
“All you can store” services like Carbonite have limitations that make them unattractive for very long-term storage. First, consumer versions of most backup services limit you to a single machine and don’t support network drives. Worse, they don’t allow you to do what Gibson calls “PIE” or pre-Internet encryption. PIE means that only you have a key that can decrypt the files stored in the cloud. If a service, like Carbonite or SkyDrive, offers web-based access to your data, that means they have the key — and it can be given to a government agency on demand or stolen by hackers.
Amazon’s S3 allowed PIE via front-end utilities like CloudBerry and became more affordable over time, especially since Amazon stopped charging for upload bandwidth. But S3 could get expensive if you kept every file on your system for the long term, especially if you had multiple machines and network drives to back up in the cloud.
Now, Amazon Web Services is offering an excellent long-term storage solution at an attractive price. Amazon Glacier is dirt-cheap, offline archival storage. In the lineup of storage classes, you have online storage (your hard disk), near-online (like S3 or SkyDrive) and offline (like tape used to be). The trade-off of each type is that as costs go down, access time goes up. So, for offline storage on AWS we’d expect slow retrieval times in exchange for very low long-term cost.
And that’s just what Glacier does. 100GB of storage should cost about $1 a month. The downsides? It can take up to four hours to retrieve data. Another issue is that Glacier charges quite a lot for what it calls “early deletes,” meaning deletion of data that has been stored for less than three months. Both trade-offs are absolutely fine with me. If I need a file that’s not in a near-online cloud provider because I last accessed it, say, three years ago, I can certainly wait a few hours to retrieve it. And at $0.01/GB/month, I can afford backups of everything. For example, I can afford to keep every version of a photo I create along with the original camera raw files. Or, if I rip a CD to FLAC and then compress to MP3 for an iPhone, I can keep both the MP3s and the FLAC versions. And, to avoid early deletes, I just wait 91 days after a file was written before deleting it if I don’t need it anymore.
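To make the arithmetic concrete, here is a rough sketch in Python. It assumes only the two figures quoted above, $0.01 per GB per month and a 91-day wait before deleting; the function names are made up for illustration, and it ignores any retrieval or request charges.

```python
from datetime import date, timedelta

# Assumed figures, taken from the post rather than from a price list:
# $0.01 per GB per month, and a 91-day wait to avoid early-delete charges.
PRICE_PER_GB_MONTH = 0.01
MIN_STORAGE_DAYS = 91


def monthly_cost(size_gb, price_per_gb=PRICE_PER_GB_MONTH):
    """Estimate the monthly storage cost in dollars (storage only)."""
    return round(size_gb * price_per_gb, 2)


def earliest_safe_delete(written_on, min_days=MIN_STORAGE_DAYS):
    """Return the first date on which a file is past the early-delete window."""
    return written_on + timedelta(days=min_days)


if __name__ == "__main__":
    print(monthly_cost(100))                        # 100GB archive
    print(earliest_safe_delete(date(2012, 9, 1)))   # file written on 1 Sep 2012
```

With these assumptions, 100GB comes out at 1.0 (a dollar a month), and a file written on 1 September 2012 could be deleted from 1 December 2012 onward.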
Geek fans of AWS know that most of its services are UI-less. However, within days of a new AWS service like Glacier appearing, existing tools are updated to take advantage of the new capabilities and new tools appear. I am very pleased that switching from S3 to Glacier in CloudBerry Backup was a simple matter of downloading the new client and changing the target storage. And I found another tool, FastGlacier, which gives the user a UI for accessing some of the Glacier functionality that’s not in CloudBerry, like requesting an inventory or comparing a vault to a local filesystem. FastGlacier is free for non-commercial use.
So, it’s now possible to get the kind of cold storage for files that really does allow one to archive everything more or less permanently (at least as long as AWS offers the service) at a very attractive price and without having to be selective in what is archived.