Backups and archives

Anyone who uses a computer needs to understand whether and how their computer is backed up. I like to say that people's default expectation is perfection, and especially at work people often assume their computers and network drives are magically backed up to the latest second, even through a catastrophe.

Backups vs. archives

I have found it extremely helpful to distinguish between backups and archives.

Backups
These are for restoring files in the case of a disaster, such as a coffee spill.
Archives
These restore you to a certain point in time, such as February 1.

Google Docs is a great example of a tool with archives built-in: you can easily see earlier versions of a file.

I used to be jazzed about Apple's Time Machine; theoretically it should work as well as Google Docs' archiving. However, then I tried to restore. Major props to James Pond's Time Machine Troubleshooting web site, which I attempted to use to diagnose many critical errors with Time Machine. My experience (which was a few years ago) was that Time Machine's backup would get corrupted and then you silently wouldn't be backed up for months at a time.

More generally, if you only have a backup and not an archive, and you realize that you accidentally altered or destroyed a file yesterday, you may not be able to recover that file, because the backup may already have been overwritten with the damaged version. Archives also help you restore from ransomware or other damage that has slowly occurred to your system.
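To make the distinction concrete, here is a minimal sketch of the two ideas. The function names `backup` and `archive` and the `stamp` parameter are made up for illustration; real tools are far more careful than this.

```python
import shutil
import time
from pathlib import Path


def backup(src: Path, dest_dir: Path) -> Path:
    """Keep only the latest copy: each run overwrites the previous one."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / src.name
    shutil.copy2(src, target)
    return target


def archive(src: Path, dest_dir: Path, stamp: str = "") -> Path:
    """Keep every version: each run adds a new, separately named copy."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    if not stamp:
        # Hypothetical naming scheme: append the time of the copy.
        stamp = time.strftime("%Y%m%dT%H%M%S")
    target = dest_dir / f"{src.name}.{stamp}"
    shutil.copy2(src, target)
    return target
```

If you damage a file and then run both, the backup folder holds only the damaged version, while the archive folder still holds the earlier good one.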

Levels of backup

Ideally backups can protect you from

single component failure
one thing (e.g. a hard drive) fails; and ideally also
single machine failure
an entire machine (perhaps with several hard drives) fails; and ideally also
single site failure
an entire location fails (e.g. your house); and ideally also
single region failure
a big disaster such as a hurricane, where potentially you can't rely on any technology being salvageable.

Simple backups

Backups and archives are incredibly complicated to set up. Even "simple" systems such as dragging and dropping your hard drive to an external hard drive are prone to failure.

Golden rule of backups: you don't have a backup unless you have tested it–that is, you've successfully restored from a backup.

For example, if you drag and drop your hard drive you may be copying only a link to the hard drive, or the copy may skip any "busy" files such as your email, or there could be a thousand other complications.
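One rough way to catch some of these problems is to walk the original folder and check that everything also exists in the copy. This is only a sketch: `find_copy_problems` is a name I made up, and it compares file sizes rather than contents, so it will not catch every kind of damage.

```python
import os
from pathlib import Path


def find_copy_problems(source: Path, copy: Path) -> list[str]:
    """List entries in `source` that look wrong or absent in `copy`."""
    problems = []
    # os.walk does not descend into symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(source):
        for name in dirnames + filenames:
            src = Path(dirpath) / name
            rel = src.relative_to(source)
            dst = copy / rel
            if dst.is_symlink():
                # The copy holds a link, not necessarily the data itself.
                problems.append(f"copy is only a link: {rel}")
            elif not dst.exists():
                problems.append(f"missing from copy: {rel}")
            elif src.is_file() and src.stat().st_size != dst.stat().st_size:
                problems.append(f"size differs: {rel}")
    return problems
```

An empty list does not prove the copy is good; it only means this particular check found nothing.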

I recommend Arq

I am a really big fan of Arq backup. I am not sponsored or paid to say this. Arq is an excellent tool. It is relatively simple to set up. You pay a one-time software license and then you're good to go.

Arq's job is to back up the folders that you tell it to back up, every hour (or whatever time frame you specify), to a cloud storage system of your choice. It can back up to, for example, Amazon S3 (complicated to set up), Amazon Drive (easy to set up), Google Drive, Dropbox, or many other cloud services. Arq does not provide the backup storage–Arq is a program that compares what's changed, compresses the changes, encrypts the changes, and sends them to or retrieves them from a cloud system.
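To give a feel for the "compare, then compress" idea, here is a toy sketch. This is not how Arq is actually implemented (I don't know its internals); the function names are invented, and encryption is left out because Python's standard library doesn't include it.

```python
import hashlib
import zlib
from pathlib import Path


def snapshot(folder: Path) -> dict[str, str]:
    """Map each file's relative path to the SHA-256 of its contents."""
    return {
        str(p.relative_to(folder)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def changed_files(previous: dict[str, str], current: dict[str, str]) -> list[str]:
    """Paths that are new or whose contents differ from the previous snapshot."""
    return sorted(
        path for path, digest in current.items() if previous.get(path) != digest
    )


def pack_changes(folder: Path, paths: list[str]) -> dict[str, bytes]:
    """Compress each changed file; a real tool would also encrypt before upload."""
    return {path: zlib.compress((folder / path).read_bytes()) for path in paths}
```

The point is that only the files whose hashes changed since the last snapshot need to be compressed and sent.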

Arq gives you both backups and archives. It immediately protects you from single component, single machine, and single site failures. Depending on the cloud service you use, it potentially protects you from single region failures as well.

Arq can purge old archives based on settings you control, e.g. based on how much you are willing to pay for your S3 bucket.
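As an illustration of what a budget-based purge could look like, here is a sketch that drops the oldest archives until the total fits a size limit. The function `purge_to_budget` and its rule are my own assumptions, not Arq's actual policy or settings.

```python
def purge_to_budget(
    archives: list[tuple[str, int]], budget_bytes: int
) -> list[tuple[str, int]]:
    """Drop the oldest archives until the total size fits the budget.

    `archives` holds (ISO date, size in bytes) pairs. The newest archive
    is always kept, even if it alone exceeds the budget.
    """
    kept = sorted(archives)  # ISO dates sort chronologically, oldest first
    total = sum(size for _, size in kept)
    while len(kept) > 1 and total > budget_bytes:
        _, size = kept.pop(0)
        total -= size
    return kept
```

The trade-off is the usual one: a smaller budget costs less but limits how far back in time you can restore.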

I have successfully restored my computer before with Arq. Unfortunately, I was using Amazon Glacier, and I didn't read the caveats in Amazon Glacier that you can only restore like 10% of your storage a day–the restore took two weeks or so. But, it happened, and Arq facilitated the restore. I have also used Arq to restore a Minecraft file to the previous hour when I accidentally summoned the Ender Dragon and destroyed our family's world.

See also tarsnap

For UNIX-based systems, I have heard great things about tarsnap. I've only barely used it myself, though, so I can't vouch for it.

Make sure you have (at least) a backup

If you don't have a backup tool, please get an Arq trial and subscribe to a cloud service such as Amazon Drive (currently $60/yr for 1TB). You can find a better option later, but you really need at least a backup.

If you have a backup system, please check it now. Can you restore a file that you've modified within the last week? Do not trust any of your backups if you haven't restored from them. I just tested a restore in Arq from a file modified Feb 9, and it worked OK.
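If you want something more rigorous than eyeballing the restored file, one option is to compare checksums. This is a small sketch with made-up function names; it assumes you restore to a separate location and that the original file hasn't changed since the backup ran.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files don't need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original: Path, restored: Path) -> bool:
    """True if the restored file has exactly the same contents as the original."""
    return sha256_of(original) == sha256_of(restored)
```

For example, restore a recently modified file to a scratch folder and call `restore_matches` on the live file and the restored copy.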