"While I know of some really good cloud providers, such as rsync.net and Tarsnap, I recommend that you never trust cloud providers blindly."
This is very good advice.
That being said, humans need heuristics and shortcuts to aid in decision-making. I hope the fact that rsync.net has been doing this work since 2001 is helpful in that regard.
...
"There exist some really cool open source backup solutions such as Borg, Restic and duplicity, but you should never rely solely on these "complex" solutions. These tools work really great, until they don't! In the past I have lost data to duplicity and other tools."
I think this is very good advice as well - and that is coming from someone who has whole-heartedly endorsed 'borg' as a backup tool and regularly recommends it. It is the "holy grail of backups"[1] after all ...
It's also a magic black box for most users, and its failures would be difficult to diagnose.
The safest and (in my opinion) most useful workflow is to back up your data locally to some kind of NAS or fileserver using plain old rsync and then back up that fileserver to rsync.net (or whomever) using the fancy borg tool.
Now you have quick and simple local restores but still have a backup in the cloud that requires zero trust in the provider.
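The two-stage workflow above could be scripted roughly like this. It is a minimal sketch, not anyone's actual setup: the paths, the host `user@backup.example.com` and the repository name are hypothetical placeholders, it assumes `rsync` and `borg` are installed and the borg repository is already initialised, and for brevity it runs both stages from one machine rather than running borg on the fileserver itself. It only builds and prints the commands; running them is left to the reader.

```python
import shlex

# Hypothetical placeholders; substitute your own paths and host.
SOURCE = "/home/me/"
NAS_COPY = "/mnt/nas/home-me/"
BORG_REPO = "user@backup.example.com:backups/nas"


def build_commands(source: str, nas_copy: str, borg_repo: str) -> list[list[str]]:
    """Return the two stages: plain rsync to the NAS, then borg to the remote."""
    # Stage 1: a plain file copy, so local restores need no special tool.
    rsync_cmd = ["rsync", "-a", source, nas_copy]
    # Stage 2: an encrypted, deduplicated archive of the NAS copy.
    # "{now}" is a borg placeholder that expands to the current timestamp.
    borg_cmd = ["borg", "create", "--stats", f"{borg_repo}::{{now}}", nas_copy]
    return [rsync_cmd, borg_cmd]


# Print the commands instead of executing them, so the sketch is safe to run.
for cmd in build_commands(SOURCE, NAS_COPY, BORG_REPO):
    print(shlex.join(cmd))
```

To actually execute the stages, each list could be passed to `subprocess.run(cmd, check=True)` so the script stops at the first failure.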
> That being said, humans need heuristics and shortcuts to aid in decision-making. I hope the fact that rsync.net has been doing this work since 2001 is helpful in that regard.
This is implied by the 'blindly' part. Searching "cloud storage provider", seeing rsync.net listed and picking it with a thrown dart would be blind. A quick search to see that it's been around for a while and doesn't have any crazy horror stories attached is part of becoming informed.
In case you didn't notice, GP is the founder of rsync.net (I'm not saying this to assign any ulterior motive to that comment).
> Everything can look really nice "on paper" but you don't know what goes on behind the scenes. I have worked with a lot of different people and I have seen too much crazy shit to fully trust anyone with my important data. A cloud provider may have the best of intentions, but sometimes all it takes is a single grumpy employee or even a minor mistake to do a lot of damage.
OneDrive and Google Drive are both pretty cheap. Is there anything wrong with keeping a backup of your important data in one of them? At a certain point you have to live your life and take a chance. "Sure, I never made it to Italy, but I had a 100% safe backup system for my files," said no one ever.
> Free Git hosting such as GitHub, GitLab and others can also be utilized for data that you don't mind storing in public. GitLab and other providers does provide free private repositories, just don't rely fully on that.
At this point, it's clear the author is looking for arguments to make. Of course you're not going to dump all your stuff into a GitLab repo in the cloud. You're going to clone it on multiple machines! My important work stuff is under version control and cloned on multiple machines in multiple locations. If that's not good enough, I'll live with the consequences.
From time to time I read about people whose accounts are randomly banned or locked by Google. I had 1.5TB of precious memories of my daughter there, going all the way back to when she was born.
But I keep a local backup on my old Mac, basically a bunch of external hard drives, and it's a pain in the ass to manage these.
The odds of you losing the master data at the same time as being locked out of your account are astronomically low, and if the two don't coincide, you're not in any trouble: if you get locked out of your account, you immediately make another backup (say, on a USB drive), and if something happens to your main copy, you immediately restore.
I definitely agree that getting locked out of your Google account is a massive risk in general (as well as morally bankrupt on their part), but I don't think it's a problem for the specific case of backups.
I trust major cloud providers like Google or Microsoft to protect my data far more than I'd ever trust a bunch of retail hardware I plugged together and configured myself.
They have entire gigantic teams of employees dedicated to security and privacy and protection from threats. I couldn't replicate that even if I wanted to.
If someone wants to steal and leak your sensitive data, they'll have a much easier time getting into your home hardware (whether over a network or physically or both) than they will getting it out of Google Drive, provided you have 2FA and good passwords you keep memorized.
How common an occurrence is that? How often is an unimportant, middle-class person's data at risk, really? Enough that you'd want to spin up your ZFS storage?
Hypothetically, let's say I had my entire life on Google. I have a unique password for it, backed up by 2FA, without the SMS/Authenticator fallback. What's the long term consequence? Google knows everything about me? They already do anyway. Someone can steal my printout of the backup codes?
I don't ask this to stir shit. I genuinely have these sorts of discussions with friends and family when I try to tell them that privacy is important, and I fail absolutely at convincing them of it.
Data stored in Google Drive etc should be encrypted and split into 50 MB chunks or something like that to hide metadata and mitigate the risk of leaks. Better backup tools have been offering this for a long time.
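As a rough illustration of the chunking half of that idea, here is a minimal sketch. It assumes the data has already been encrypted by some other tool (no encryption happens here), and the function names are made up for this example. Chunks are named by their SHA-256 hash, so neither file names nor individual file sizes are visible to the storage provider; the last chunk is not padded, so the total size still leaks.

```python
import hashlib


def split_into_chunks(blob: bytes, chunk_size: int) -> tuple[list[str], dict[str, bytes]]:
    """Split an (already encrypted) blob into fixed-size chunks.

    Returns a manifest (chunk names in order) and a name -> chunk mapping.
    The manifest is needed for reassembly and should itself be kept private.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    manifest: list[str] = []
    chunks: dict[str, bytes] = {}
    for offset in range(0, len(blob), chunk_size):
        piece = blob[offset:offset + chunk_size]
        # Name each chunk by the hash of its content.
        name = hashlib.sha256(piece).hexdigest()
        manifest.append(name)
        chunks[name] = piece
    return manifest, chunks


def join_chunks(manifest: list[str], chunks: dict[str, bytes]) -> bytes:
    """Reassemble the original blob from the manifest order."""
    return b"".join(chunks[name] for name in manifest)
```

For the 50 MB figure mentioned above, `chunk_size` would be `50 * 1024 * 1024`. Existing backup tools handle this (and the encryption) for you, so this is only meant to show what is going on.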
Darn, I was hoping this would be an article about organizing one's files. I was really in the mood for reading about that, then spending the rest of the morning reworking my own system.
Same. I get lost in researching everyone else's file organization and other workflow methods. I always feel like mine is a dog's breakfast, scattered across many incongruent, un-coordinated drives, clouds, shares, etc.
Something tricky is how to bootstrap restoration of backup. If you have lost "everything", how do you get it back?
For example, if you use borg to backup remotely via ssh, you will need ssh keys as well as a passphrase for the encrypted backup. Where do you store those to make sure you have them if your computer is gone? What I did was create a self-extracting restoration script, which embedded everything needed. This is also encrypted, and synced to many places. The idea is, as long as I have the passphrase for that, it takes care of the rest.
I keep the most important pieces on printout in a firesafe lockbox. I don’t have any eventuality for if the firesafe goes at the same time as online failsafes, but I feel like the best approach for those kinds of situations is nihilism.
It's a good discussion. I've taken a different path but with the same ideals.
1. All my live, working data is in Dropbox. I've been using my (paid) Dropbox account for over a decade. This step isn't really intended to be a backup per se, but you get the backup behavior "for free" because this is how I keep my secondary machine in sync with my primary. My Dropbox folder is also replicated to a 3rd computer in my home as another level of redundancy.
2. Everything in my home dir, especially photography, is ALSO covered by Backblaze. Having some backups elsewhere is mandatory; not enough people really understand how important this is. Should my house burn down, I still have data.
3. My primary system is a Mac, so I use Time Machine. TM is the only backup on this list I've ever actually used as a backup. When our home was robbed a number of years ago in a quickie smash-window-and-grab affair, they got my laptop. I went out and bought a new computer, plugged it into the TM drive, and in an hour or so I was right back where I left off. Hard to beat that. Even my app windows were in the same place.
4. Periodically, I take a full clone of my main machine's drive using a drive imaging tool (Mac specific; I use SuperDuper) and **store that drive at a friend's house**. This probably only happens a few times a year at this point. I should do it more often.
That tertiary computer I mentioned in step 1 is also the home NAS server / home media server. It holds the photo archive in a large outboard disk. Backblaze covers that disk, and Time Machine on that computer keeps the outboard disk backed up as well. This data is mostly static, so the images I've taken of it and stored elsewhere don't need to be updated all that often (ie, just when I migrate prior year photo data to that drive).
I put everything in a Resilio Sync folder, and keep a full sync on at least two devices (a home NAS and a cloud seedbox). Resilio Sync handles pretty much everything. You instantly get hot backup, have files immediately available on every device you have, and if you have a phone you can download any file on demand, etc. Unlike other file synchronization methods such as `rsync --delete`, it keeps a version for every file modification and moves the file to Archive when it is deleted, so you can't lose data. Also, you get encryption without the headache by using the "encrypted folder key". This syncs an encrypted copy so other devices can sync from that device.
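To make the contrast with `rsync --delete` concrete, here is a minimal sketch of "archive instead of delete" in a one-way sync. This is not how Resilio Sync is implemented; the function name and directory layout are invented for illustration, and it only covers deletions (it does not keep old versions of modified files). The archive directory is assumed to live outside the replica.

```python
import shutil
from pathlib import Path


def sync_with_archive(source: Path, replica: Path, archive: Path) -> None:
    """One-way sync that moves vanished files to an archive instead of deleting them."""
    replica.mkdir(parents=True, exist_ok=True)
    archive.mkdir(parents=True, exist_ok=True)
    source_files = {p.relative_to(source) for p in source.rglob("*") if p.is_file()}
    replica_files = {p.relative_to(replica) for p in replica.rglob("*") if p.is_file()}

    # Files that disappeared from the source are moved aside, not destroyed.
    for rel in replica_files - source_files:
        target = archive / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(replica / rel), str(target))

    # Copy every source file; a real tool would skip unchanged ones.
    for rel in source_files:
        target = replica / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source / rel, target)
```

With this behavior, an accidental deletion on the source side shows up as a file in the archive directory after the next sync, rather than as lost data.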
I use resilio sync together with a NAS (which is still large enough to hold all my files, I wonder how long that'll stay given how much 6k BRAW footage I'm shooting), and I also sync to a Google Cloud Storage bucket from my NAS (moving items to Archive storage class if they haven't been touched in a while, significantly saving cost).
> Not only does encryption during data recovery make everything much more difficult, but should you pass away, your family members might not have the skills required to access the data.
Terrible advice regarding encryption. Really if you have important data that your family needs at the time of your death you should have a plan for that as well. Not avoid encryption because it’s “too hard”.
Exactly. Encrypt with a simple utility, write down the password, store it in a safe place, and let someone know. In my case, I have a friend who is very technical and whom I can trust to assist my spouse (who is not technical) with some items if need be. Part of my hand-written instructions are his phone number/email, and to call him if there are issues decrypting or restoring data.
> While you might consider doing a full encryption for both your personal laptop and/or desktop, in case one of these gets stolen, you should avoid encryption on backup and storage when it really isn't needed because encryption adds yet another layer of complexity.
Wrong. Not encrypting your backups when your daily use systems are encrypted makes no sense. Seriously. Why go through the hassle of figuring out a password, when one can steal the unencrypted off site backup?
You want simple encryption? Me too! Use dm-crypt/LUKS. It's been in every modern Linux distro for at least 10 years. If you can plug your encrypted external drive into a live/freshly installed Linux desktop and an encryption prompt comes up, you're in luck!
I agree with your general sentiment, but I'll provide one redeeming counterpoint anyway: your laptop (that you carry around to places) is at higher risk of theft than an external hard drive that you keep in a storage locker.
[1] https://www.stavros.io/posts/holy-grail-backups/
https://johnnydecimal.com/
Down the rabbit hole I go...
Haven't needed to use them yet, but testing it was great.