Once more on the importance of backups

As mentioned before, the best defence against data loss from viruses or hardware damage is to make comprehensive, frequent backups. As such, I propose the following rule of thumb:

If a piece of data is worth more than the drive space it occupies, a second copy should exist somewhere else.

Nowadays, you can easily pick up hard drives for less than $1 per gigabyte. At those prices, it probably isn’t just personal photos and messages that are worth saving, but any bulk data (movies, songs, etc) that would take more than $1 per gigabyte in effort to find and download again.
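
As a rough sketch, the rule of thumb can be written as a one-line comparison in shell. The function name `worth_backing_up` and its three arguments are invented for illustration, and the figures in the usage line are placeholders rather than real prices.

```shell
# worth_backing_up SIZE_GB PRICE_PER_GB EFFORT_COST
# Prints "yes" when finding and downloading the data again would cost
# more than the drive space for a second copy, and "no" otherwise.
# All three arguments are hypothetical inputs in the same currency.
worth_backing_up() {
    awk -v size="$1" -v price="$2" -v effort="$3" \
        'BEGIN { if (effort + 0 > size * price) print "yes"; else print "no" }'
}

# Example: 200 GB of files, drives at $1 per gigabyte, and an estimated
# $500 of effort to gather everything again.
#   worth_backing_up 200 1 500    # prints "yes"
```

The effort figure is the hard part to estimate; the comparison itself is trivial once you have it.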

Mac users should consider downloading Carbon Copy Cloner. It produces bootable byte-for-byte copies of entire drives. That means that even if the hard drive in your computer dies completely and irreplaceably, you can actually run your system off an external hard drive, with all the data and functionality it possessed when you made the most recent copy.

One nice perk of having one or more such copies is that they let you undo mistakes. If you accidentally erased or corrupted an important file, you can go back and grab it. Likewise, if you installed a software update that proved problematic, you can shift your entire system back to an earlier state.

[Update: 22 January 2010] Since I wrote this article, Apple released new versions of OS X with their excellent Time Machine backup software built-in. I strongly encourage all Mac users to take advantage of it.

Author: Milan

In the spring of 2005, I graduated from the University of British Columbia with a degree in International Relations and a general focus in the area of environmental politics. In the fall of 2005, I began reading for an M.Phil in IR at Wadham College, Oxford. Outside school, I am very interested in photography, writing, and the outdoors. I am writing this blog to keep in touch with friends and family around the world, provide a more personal view of graduate student life in Oxford, and pass on some lessons I've learned here.

14 thoughts on “Once more on the importance of backups”

  1. Speaking of problematic updates…

    What’s Really Broken with Windows Update – Trust
    By CmdrTaco on yer-kidding-me

    Be Cool writes “According to ZDNet, Microsoft has steered itself into a real trust tarpit with Windows Update: ‘See, here’s the problem. To feel comfortable with having an open channel that allows your OS to be updated at the whim of a third party (even/especially* Microsoft … * delete as applicable) requires that the user trusts the third party not to screw around with the system in question. This means no fiddling on the sly, being clear about what the updates do and trying not to release updates that hose systems. While any and all updates have the potential to hose a system, there’s no excuse for hiding the true nature of updates and absolutely no excuse for pushing sneaky updates down the tubes. Over the months vigilant Windows users have caught Microsoft betraying user trust on several separate occasions and this behavior is eroding customer confidence in the entire update mechanism.'”

  2. I’m surprised you didn’t mention Time Machine in Leopard. A superb feature, since it solves the number 1 problem with backups: users are unlikely to do them.

  3. Paul,

    I haven’t tried Leopard yet, but will surely write something about it once application support forces me to upgrade.

  4. Option 1: Learn not to care about your data. Don’t save any old email, use a film camera, and only listen to physical CDs and not MP3s. If you have no possessions, you have nothing to lose.

    Option 2 goes like this:

    You have a computer. It came with a hard drive in it. Go buy two more drives of the same size or larger. If the drive in your computer is SATA2, get SATA2. If it’s a 2.5″ laptop drive, get two of those. Brand doesn’t matter, but physical measurements and connectors should match.

    Get external enclosures for both of them. The enclosures are under $30.

    Put one of these drives in its enclosure on your desk. Name it something clever like “Backup”. If you are using a Mac, the command you use to back up is this:

    sudo rsync -vaxE --delete --ignore-errors / /Volumes/Backup/

    If you’re using Linux, it’s something a lot like that. If you’re using Windows, go fuck yourself.

    If you have a desktop computer, have this happen every morning at 5AM by creating a temporary text file containing this line:

    0 5 * * * rsync -vaxE --delete --ignore-errors / /Volumes/Backup/

    and then doing sudo crontab -u root that-file
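
    The two steps above might be scripted roughly as follows. This is only a sketch: the function name `write_backup_crontab` is made up, the long options are spelled with two plain hyphens as rsync expects, and `Backup` is the example volume name from this comment. Note that installing a file with `crontab` replaces whatever root's crontab already contained, so the install step is left as a comment.

```shell
# write_backup_crontab FILE
# Writes the 5 AM backup entry into FILE (hypothetical helper).
# The rsync flags and the /Volumes/Backup/ path are taken from the comment.
write_backup_crontab() {
    printf '%s\n' '0 5 * * * rsync -vaxE --delete --ignore-errors / /Volumes/Backup/' > "$1"
}

# Typical use. Installing REPLACES root's existing crontab, so list it first:
#   tmp=$(mktemp)
#   write_backup_crontab "$tmp"
#   sudo crontab -u root -l
#   sudo crontab -u root "$tmp"
#   rm -f "$tmp"
```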

    If you have a laptop, do that before you go to bed. Really. Every night when you plug your laptop in to charge.

    If you’re on a Mac, that backup drive will be bootable. That means that when (WHEN) your internal drive scorches itself, you can just take your backup drive and put it in your computer and go. This is nice.

    When (WHEN) your backup drive goes bad, which you will notice because your last backup failed, replace it immediately. This is your number one priority. Don’t wait until the weekend when you have time, do it now, before you so much as touch your computer again. Do it before goddamned breakfast. The universe tends toward maximum irony. Don’t push it.
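
    Noticing that the last backup failed is easier if something writes it down. A minimal sketch, assuming you wrap the backup command in a small function; the name `run_backup`, the log format, and the log path in the usage line are all invented here:

```shell
# run_backup LOGFILE COMMAND [ARGS...]
# Runs the given command, appends a dated OK or FAILED line to LOGFILE,
# and returns the command's own exit status (hypothetical helper).
run_backup() {
    log=$1
    shift
    if "$@"; then
        echo "$(date '+%Y-%m-%d %H:%M') backup OK" >> "$log"
    else
        rc=$?
        echo "$(date '+%Y-%m-%d %H:%M') backup FAILED (exit $rc)" >> "$log"
        return "$rc"
    fi
}

# Example (log path is a placeholder):
#   run_backup "$HOME/backup.log" sudo rsync -vaxE --delete --ignore-errors / /Volumes/Backup/
```

    A FAILED line at the end of the log is the cue to go and replace the drive.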

    That third drive? Do a backup onto it the same way, then take that to your office and lock it in a desk. Every few months, bring it home, do a backup, and immediately take it away again. This is your “my house burned down” backup.

    “OMG, three drives is so expensive! That sounds like a hassle!” Shut up. I know things. You will listen to me. Do it anyway.

    Update: Mac users: for the backup drive to be bootable, you need to do two things:

    When you partition the drive, use GUID, not Apple Partition Map;

    Get Info on the drive and un-check “Ignore ownership on this drive” under “Ownership and permissions.”

    You can test whether it’s bootable by holding down Option while booting and selecting the external drive.

  5. We must lay out the kinds of failures and goals of a backup to determine how best to back up.

    1. We would like to protect against mechanical drive failure. This can be done with a RAID.

    1.5. We may also want to protect against the failure of other components of the computer. I recently had a computer die because its motherboard died, and it took about two weeks to get a new computer, and the new computer was a significant upgrade so it had SATA instead of IDE. In the meantime, I needed my data on other systems, and when the new computer came, I needed to borrow a USB-IDE bridge to recover some stuff that I wasn’t backing up.

    2. We would like to protect against accidental deletion of files, file corruption, or edits to a file that we have now reconsidered. This can be done with snapshotting. In source code, reconsidering an edit to a file is fairly common, which is why most programming projects use revision control systems. Other options like nilfs or ZFS snapshots can also fulfil this goal. This goal is accomplished more easily if the backups are automatic and the backup device is live on the system.

    Depending on your needs, this goal may be counterbalanced by a need to not retain the history of files for legal or other reasons, and this should inform your choice of backup strategy.

    3. We would like to protect against filesystem corruption, whether by an OS bug, or by accidentally doing cat /dev/random > /dev/hda. This can be done by having an extra drive of some sort that isn’t normally hooked up to the computer. Tape drives, CDs, and DVDs have traditionally fulfilled this purpose, and this is where the use of additional hard drives is being suggested. Remote backups via rsync can also accomplish this. For this I use git.

    4. We would like to protect against natural disasters. For someone living in New Orleans, it would be nice to have a backup somewhere outside the path of Hurricane Katrina. Remote backups may be pretty much the only way to accomplish this, unless you’re a frequent traveler and can hand-deliver backup media to remote locations.

    5. In addition to any of the above, the code you use to create said backup may be buggy, or may become buggy or misconfigured over time. Checking the integrity and restorability of your backups after creating them, and keeping several (independent) previous versions of a backup may help here.

    You may not be concerned with the various modes of failure described here occurring simultaneously. For example, it may be unlikely that you need to deal with file system corruption at the same time that you regret one of the edits you made on your file. In that case, your offline backup device doesn’t need to hold all of your snapshots.
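
    For the integrity check mentioned in item 5, one simple approach is to compare the backup against the source right after it is made. The sketch below assumes a `diff` that supports `-r` and `-q` (GNU and BSD versions do); the function name `verify_backup` and the paths in the usage line are placeholders. It compares file contents only, not ownership or permissions, and it says nothing about whether a restored system would boot.

```shell
# verify_backup SOURCE_DIR BACKUP_DIR
# Succeeds (exit 0) only when both trees contain the same files with
# identical contents; any missing or differing file makes it fail.
verify_backup() {
    diff -rq "$1" "$2" > /dev/null
}

# Example (placeholder paths):
#   verify_backup "$HOME/Documents" /Volumes/Backup/Users/me/Documents \
#       || echo "backup does not match source"
```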

  6. Worst of all, every stupid cliche about backup that currently makes you roll your eyes in exasperation will be visited upon you tenfold if you’re not using some flavor of the anal-retentive system nerds like John and I live by. Because, unfortunately, most people you know (including me) have already repeatedly been struck by backup’s biggest and most profound cliche:

    Perform automated, redundant, and rotated backups as often as you can afford to lose every single bit of information that’s been changed or added since your last backup. Because it’s going to go away.

    The Holy Trinity

    Seriously:

    * If it’s not automated, it’s not a real backup.
    * If it’s not redundant, it’s not a real backup.
    * If it’s not regularly rotated off-site, it’s not a real backup.

  7. Backup your shit!
    By The Devil Tesla on windows

    “Every hard drive in the world will eventually fail. Assume that yours are all on the cusp of failure at all times.” An Ode to DiskWarrior, SuperDuper, and Dropbox: John Gruber talks about his Mac’s hard drive failing and how he was able to recover all of his data using DiskWarrior, a file recovery utility, SuperDuper!, a backup utility that creates a fully bootable backup, and the file syncing system Dropbox. While his advice is Mac specific, you can get a similar system going on Windows with Acronis for backups and one of many free file recovery programs such as TestDisk (which also has a Mac version).
    If you are not willing to spend the money on SuperDuper or Acronis, check out the free backup utilities Carbon Copy Cloner for Mac and Macrium Reflect Free for Windows.

    More and more computer users are not storing files on their home computers and instead using web applications that keep a user’s data safe for them (usually). Web applications are becoming popular enough that Google is creating an OS where only system files are stored on a user’s computer.

  8. Data crisis counsellor

    Data recovery engineer Chris Bross agrees and says if individuals backed up their digital lives “they wouldn’t need us when a failure occurs, and they wouldn’t be in crisis”.

    As digital possessions shrink the need for physical property, data recovery companies like Drive Savers, DTI Recovery and Eco Data Recovery may become the emergency response teams of the future.

    Mr Bross, a Drive Savers employee, believes as individuals grow increasingly dependent on “digital storage technology for holding all these assets that they used to hold more tangibly”, data recovery services will become rather like the firefighters of the 21st Century – responders who save your valuables.

    And like a house fire that rips through a family’s prized possessions, when someone loses their digital goods to a computer crash, they can be devastated.

    Kelly Chessen, a 36-year-old former suicide hotline counsellor with a soothing voice and reassuring personality, is Drive Savers’ official “data crisis counsellor”.

    Part-psychiatrist and part-tech enthusiast, Ms Chessen’s role is to try to calm people down when they lose their digital possessions to failed drives.

    Ms Chessen says some people have gone as far as to threaten suicide over their lost digital possessions and data.

  9. “It’s rapidly becoming inexcusable for the storage systems we entrust with some of our most precious possessions—something we’re actively encouraged to do by Apple itself—to take such a cavalier approach to data integrity. The worst part is that there’s little a user can do to make up for this technological gap; backups only serve to silently spread data corruption.”
