PSA: I had a hard drive die on me. I recovered all my data, but it took about 25 hours of work in total for two people working together.
Looking back on it, I doubt many things could have convinced me to improve my backup systems. Short of working in the cloud, even my best possible backups would still have lost at least the last two weeks of work.
I am taking suggestions for best practice, but this is also a shout-out to backups: given it’s now a new year, you might want to back up everything from before 2016 right now. Then work on a solid backup system.
(Either that, or always keep 25 hours on hand to manually run ddrescue over separate sectors of the drive, unplugging and replugging it between reads until you get as much data out as possible, staying up until 5am for a few nights trying to scrape the entropy back out of the bits...) I firmly believe that the right automated system would take far less than 25 hours of effort to maintain.
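For anyone curious, the loop amounted to roughly the sketch below. Device names, sizes, and chunk size are made up here (the real thing was messier); it just runs ddrescue over one chunk at a time against a shared map file, so it can resume after each power-cycle.

```python
#!/usr/bin/env python3
"""Rough sketch of the chunk-by-chunk ddrescue process described above.

Assumptions to adjust: the dying drive is /dev/sdb, the image and map file
live on a healthy disk, and you power-cycle the drive by hand when it locks
up. Run as root; this is illustrative, not a substitute for the ddrescue manual.
"""
import subprocess

SOURCE = "/dev/sdb"             # the failing drive (assumption)
IMAGE = "/mnt/rescue/sdb.img"   # image file on a known-good disk (assumption)
MAPFILE = "/mnt/rescue/sdb.map"
CHUNK = 10 * 1024**3            # work through the drive 10 GiB at a time
TOTAL = 1000 * 1024**3          # approximate drive size (assumption)

offset = 0
while offset < TOTAL:
    # -d: direct disc access, -r1: one retry pass, -i/-s: restrict to this chunk.
    # The shared map file lets ddrescue resume and skip already-recovered areas.
    cmd = ["ddrescue", "-d", "-r1",
           "-i", str(offset), "-s", str(CHUNK),
           SOURCE, IMAGE, MAPFILE]
    result = subprocess.run(cmd)
    if result.returncode != 0:
        input("ddrescue gave up on this chunk; power-cycle the drive, "
              "then press Enter to continue...")
    offset += CHUNK
```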
Bonus question: what would convince you to make a backup of your data?
Use a backup system that automatically backs up your data, and then nags at you if the backup fails. Test to make sure that it works.
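Something as simple as the sketch below, run from cron, covers the “automatic plus nagging” part. The rsync command and the notify-send call are stand-ins for whatever backup tool and alerting will actually reach you.

```python
#!/usr/bin/env python3
"""Minimal 'back up, then nag me if it failed' wrapper, meant for cron.

The backup command and the notification method are placeholders; swap in
whatever tool and alerting you actually use.
"""
import subprocess
import time
from pathlib import Path

BACKUP_CMD = ["rsync", "-a", "--delete", "/home/me/", "backuphost:/backups/me/"]
STAMP = Path.home() / ".last_good_backup"   # timestamp of the last success
MAX_AGE = 2 * 24 * 3600                     # nag if no success in 2 days

def nag(message: str) -> None:
    # Placeholder: use notify-send, email, or anything you won't ignore.
    subprocess.run(["notify-send", "Backup problem", message])

result = subprocess.run(BACKUP_CMD)
if result.returncode == 0:
    STAMP.write_text(str(time.time()))
else:
    nag("Tonight's backup command exited with an error.")

# Also nag if the last success is too old (covers runs that silently never happened).
if not STAMP.exists() or time.time() - float(STAMP.read_text()) > MAX_AGE:
    nag("No successful backup in the last two days.")
```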
For people who don’t want to / can’t run their own, I’ve found that Crashplan is a decent one. It’s free if you only back up to other computers you own (or other people’s computers); in my case I’ve got one server in Norway and one in Ireland. There have, however, been some doubts about Crashplan’s correctness in the past.
There are also about half a dozen other good ones.
Links? I use Crashplan and would be interested in learning about its bugs.
Google for ‘crashplan data loss’, and you’ll find a few anecdotes. The plural of which isn’t “data”, but it’s enough to ensure that I wouldn’t use it for my own important data if I wasn’t running two backup servers of my own for it. Even then, I’m also replicating with Unison to a ZFS filesystem that has auto-snapshots enabled. In fact, my Crashplan backups are on the same ZFS setup (two machines, two different countries), so I should be covered against corruption there as well.
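The Unison leg of that is basically a one-liner run from cron; paths and the host name below are made up, and the history comes from the ZFS box’s own auto-snapshots rather than from Unison itself.

```python
#!/usr/bin/env python3
"""Sketch of the Unison replication leg: mirror a working directory to a
dataset on the ZFS box, and let that box's auto-snapshots provide history.

Paths and host name are assumptions; -batch/-auto keep it non-interactive
so it can run unattended from cron.
"""
import subprocess

LOCAL = "/home/me/work"
REMOTE = "ssh://zfsbox//tank/replicas/work"   # dataset with auto-snapshots enabled

subprocess.run(["unison", LOCAL, REMOTE, "-batch", "-auto"], check=True)
```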
Suffice to say, I’ve been burnt in the past. That seems to be the only way that anyone ever starts spending this much (that is, ‘sufficient’) effort on backups.
E.g. http://jeffreydonenfeld.com/blog/2011/12/crashplan-online-backup-lost-my-entire-backup-archive/
All of that said?
I’m paranoid. I wouldn’t trust a single backup service, even if it had never had any problems; I’d be wondering what they were covering up, or, if they really were that small, whether they’d soon go away.
Crashplan is probably fine. Probably.
I’m using Crashplan as the offsite backup; I have another backup in-house. The few anecdotes seem to be from Crashplan’s early days.
But yeah, maybe I should do a complete dump to an external hard drive once in a while and just keep it offline somewhere...
Use RAID on ZFS. RAID is not a backup solution, but a proper RAID-Z2 configuration (double parity, the ZFS analogue of RAID 6) will protect you against common hard drive failure scenarios. Put all your files on ZFS. I use a dedicated FreeNAS file server for my home storage. Once everything you have is on ZFS, turn on snapshotting. I have my NAS configured to take a snapshot every hour during the day (set to expire in a week), plus one snapshot every Monday that is kept for 18 months. The short-lived snapshots let me quickly recover from brain snafus like overwriting a file.
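FreeNAS has periodic snapshot tasks built in, so I don’t script this myself, but done by hand the schedule amounts to something like the sketch below (dataset name is an assumption; run it hourly from cron).

```python
#!/usr/bin/env python3
"""Rough equivalent of that snapshot schedule done by hand with cron + `zfs`.

FreeNAS's periodic snapshot tasks do this for you; this only illustrates what
they do. Dataset name is an assumption. Run hourly from cron; on Mondays at
midnight it also takes a long-lived weekly snapshot.
"""
import subprocess
from datetime import datetime

DATASET = "tank/home"   # assumption: your main dataset
now = datetime.now()

def zfs(*args: str) -> str:
    return subprocess.run(["zfs", *args], check=True,
                          capture_output=True, text=True).stdout

# Hourly snapshot, named so its age is easy to read off.
zfs("snapshot", f"{DATASET}@hourly-{now:%Y%m%d-%H%M}")

# Weekly long-lived snapshot, taken once on Mondays.
if now.weekday() == 0 and now.hour == 0:
    zfs("snapshot", f"{DATASET}@weekly-{now:%Y%m%d}")

# Expire hourly snapshots older than a week (weeklies are pruned separately).
for name in zfs("list", "-t", "snapshot", "-H", "-o", "name").splitlines():
    if name.startswith(f"{DATASET}@hourly-"):
        stamp = datetime.strptime(name.split("@hourly-")[1], "%Y%m%d-%H%M")
        if (now - stamp).days >= 7:
            zfs("destroy", name)
```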
Long-lived snapshotting is amazing. Once you have filesystem snapshots, incremental backups become trivial. I have two portable hard drives, one onsite and one offsite. I plug in a drive, issue one command, and a few minutes later I’ve copied the incremental snapshots to my offline drive. My backup hard drives become append-only logs of my state. ZFS also lets you configure a dataset so that it stores every block twice (the copies=2 property), so I have that turned on just to protect against the remote chance of random bit flips on the drive.
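The “one command” step, spelled out, is sketched below: import the external pool, send everything since the last snapshot it already has, and export it again. Pool, dataset, and snapshot names are made up, and in practice you would read the previous snapshot name off the backup pool rather than hard-code it.

```python
#!/usr/bin/env python3
"""Sketch of shipping an incremental snapshot to the external drive's pool.

Assumes the backup pool already holds PREV from the previous run; all names
below are placeholders.
"""
import subprocess

SRC = "tank/home"
DST = "backup/home"                 # dataset on the external drive's pool
PREV = f"{SRC}@weekly-20151228"     # last snapshot already on the backup drive
LATEST = f"{SRC}@weekly-20160104"   # newest snapshot to ship

subprocess.run(["zpool", "import", "backup"], check=True)

# copies=2 keeps two copies of every block on the single backup disk,
# which is the bit-flip protection mentioned above.
subprocess.run(["zfs", "set", "copies=2", DST], check=True)

# Incremental send: only the blocks changed between PREV and LATEST move.
send = subprocess.Popen(["zfs", "send", "-i", PREV, LATEST],
                        stdout=subprocess.PIPE)
subprocess.run(["zfs", "receive", DST], stdin=send.stdout, check=True)
send.stdout.close()
if send.wait() != 0:
    raise RuntimeError("zfs send failed")

subprocess.run(["zpool", "export", "backup"], check=True)
```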
I do this monthly, and it only burns about 10 minutes a month. However, this isn’t automated. If you’re willing to trust the cloud, you could improve this and make it entirely automated with something like rsync.net’s ZFS snapshot support. I think other cloud providers also offer snapshotting now, too.
I feel that this is too complicated a solution for most people to follow. And it’s not a very secure backup system anyway.
You can just get an external hard drive and use any of the commonly-available full-drive backup software. Duplicity is a free one and it has GUI frontends that are basically just click-to-backup. You can also set them up to give you weekly reminders, etc.
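For the command-line inclined, the duplicity version of “external drive plus reminders” looks roughly like the sketch below. Paths and the passphrase are placeholders; the GUI front-ends (Déjà Dup, for instance) drive duplicity in much the same way underneath.

```python
#!/usr/bin/env python3
"""Sketch of the external-drive approach using duplicity's command line.

Paths and passphrase are assumptions; this only shows the shape of a
backup run and a restore.
"""
import os
import subprocess

SOURCE = "/home/me"
TARGET = "file:///media/me/backupdrive/duplicity"  # the external drive
env = dict(os.environ, PASSPHRASE="correct horse battery staple")  # encrypts the archive

# Incremental backup (duplicity does a full one the first time), forcing a
# fresh full backup once a month so the incremental chain doesn't grow forever.
subprocess.run(["duplicity", "--full-if-older-than", "1M", SOURCE, TARGET],
               env=env, check=True)

# Restoring looks like this (to a scratch directory, not over the original):
# subprocess.run(["duplicity", "restore", TARGET, "/tmp/restored"], env=env, check=True)
```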
Generally speaking, the best practice is to have two separate backups, one of them offsite.
First, you might want to run some kind of a RAID setup so that a single disk failure doesn’t affect you much. RAID is not backup, but it’s useful.
Second, you might want to set up some automated backup/copy of your data to a different machine or to the cloud. The advantage is that it’s set-up-and-forget. The disadvantage is that if you get data corruption, malware, etc., the corrupted data could overwrite your clean backup before you notice something is wrong. Because of that, it’s not a bad idea to occasionally make known-clean copies of your data (say, after a disk check and a malware check) on some offline media like a flash drive or an external hard drive.
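A minimal sketch of that automated copy, assuming rsync over SSH to another machine (host and paths are made up); the dated, hard-linked copies are there precisely because of the overwrite problem mentioned above.

```python
#!/usr/bin/env python3
"""Sketch of the 'set it up and forget it' copy to another machine, for cron.

Host and paths are assumptions. A plain mirror will happily propagate
corruption or malware-mangled files, hence the dated copies alongside it.
"""
import subprocess
from datetime import date

SOURCE = "/home/me/"
DEST = "otherbox:/srv/backups/me"

# Keep an up-to-date mirror; --delete makes it an exact copy of the source.
subprocess.run(["rsync", "-a", "--delete", SOURCE, f"{DEST}/current/"],
               check=True)

# Also keep a dated copy that hard-links unchanged files against the mirror,
# so a file overwritten with garbage today doesn't erase yesterday's good one.
subprocess.run(["rsync", "-a", "--link-dest=../current",
                SOURCE, f"{DEST}/{date.today().isoformat()}/"], check=True)
```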
Disk space is really, REALLY cheap. It’s not rational to skimp on it. :-/
I consider either Tresorit or MEGA to be the best way to back up data automatically at the moment; both provide client-side encryption. The free tier of MEGA allows for 50GB.