BorgBackup user here and really happy. It was set-and-forget for me, and after 7 years the deduplicated backup is still running flawlessly each week. I recommend pairing it with borgmatic [1], which helps design away some of the complexity of the underlying borg backup.
My problem is that I learn some tool like this, set it, and then indeed forget it. Then I avoid testing my backups because of the work it takes to un-forget it. Because of this, I'm leaning more and more towards rsync or tools with GUI frontends.
Rather than avoid tools that work well, I would encourage you to adopt solutions that solve your use cases. For instance, if you aren't getting notifications that a backup is running, completing, or failing, then all you've set up is a backup job; you haven't built a BDR (backup and disaster recovery) process. If you're looking for a tool to solve your entire BDR plan, then you're looking at a commercial solution that bakes in automated restore testing, and so on.
Not considering all the aspects of a BDR process is what leads to this problem. Not the tool.
At a minimum you need backup, regular restore tests, and alerts when backups stop or restore tests fail.
Personally I automate restore testing with cron. I have a script that picks two random files from the filesystem: an old one (which should be in long term storage) and a new one (should be in the most recent backup run, more or less), and tries restoring them both and comparing md5sums to the live file. I like this for two reasons: 1. it's easy to alert when a cronjob fails, and 2. I always have a handy working snippet for restoring from backups when I inevitably forget how to use the tooling.
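For a borg repo, a minimal sketch of that kind of check might look like the following (the repo path is hypothetical, and it's simplified to a single file from the latest archive; it assumes BORG_PASSPHRASE etc. are already exported):

    #!/bin/sh
    # Restore-test sketch: pull one random path out of the latest archive
    # and compare it against the live copy. Repo path is hypothetical.
    set -eu
    REPO=/backups/borg-repo
    ARCHIVE=$(borg list --last 1 --format '{archive}' "$REPO")
    # Pick a random path from the archive (simplified: assumes a regular file).
    FILE=$(borg list "$REPO::$ARCHIVE" --format '{path}{NL}' | shuf -n 1)
    WORK=$(mktemp -d)
    (cd "$WORK" && borg extract "$REPO::$ARCHIVE" "$FILE")
    # Exit nonzero on mismatch so cron's failure alerting kicks in.
    cmp -s "$WORK/$FILE" "/$FILE" || { echo "restore test FAILED for $FILE" >&2; exit 1; }
    rm -rf "$WORK"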
IMO alerting is the trickiest part of the whole setup. I've never really gotten that down on my own.
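One low-effort pattern for that (just a sketch, with a hypothetical script name and check UUID) is a dead-man's switch: the cron job pings a service like healthchecks.io only on success, and the service alerts you when pings stop arriving, so a silently dead cron job still gets noticed.

    # Ping only when the backup-and-verify run succeeds; the service
    # alerts when pings stop. Script name and UUID are hypothetical.
    /usr/local/bin/backup-and-verify.sh && \
        curl -fsS --retry 3 https://hc-ping.com/your-check-uuid > /dev/null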
Borgmatic runs consistency checks [1] once a month on all repositories and archives, and I occasionally retrieve older versions of selected files (archives with --verify-data only once a year, or whenever I feel the need - there's 9TB of data in the borg repo, which takes a while to scan). Note though that borg is not my main backup; it is the fallback "3" in the 3-2-1 principle, where my primary data is on a ZFS raidz2 and my primary backup is an offsite ZFS raidz2 in pull mode. I added borg because I did not want to rely on a single piece of software (ZFS), although this fear has proven unfounded so far.
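For anyone doing this by hand instead of via borgmatic, the equivalent commands are roughly (repo path hypothetical):

    # Quick consistency check of repository and archive metadata:
    borg check /backups/borg-repo
    # Full check that also re-reads and verifies every data chunk
    # (slow - this is the once-a-year one on a 9TB repo):
    borg check --verify-data /backups/borg-repo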
This always gets repeated, and it sounds good and makes sense in theory, but in reality there's no good way to do it by hand - it should be the job of a computer.
Restoring one file from the backup works, but what if something else is corrupted?
Restoring the system from the image works, but what if some directory is not in the backup and you don't notice that while testing?
Currently I'm just using bare rclone to back up to my own remote machines, but obviously this isn't a very professional solution. I was thinking of adding Backblaze B2 as a remote, but I guess using rclone wouldn't be a state-of-the-art solution here. After all, it isn't really a backup tool, is it? It has some built-in encryption, but it's a bit clunky, and I'd think a proper backup tool should automatically divide data into blocks of suitable size (instead of just uploading file-per-file - to be S3/B2 API-friendly), encode whole directories as tar (if needed to preserve links, for example), do deduplication, and whatever other best practices I have no idea about, but which backup-proficient people probably invented a long time ago.
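(For reference, the bare-rclone job is essentially a one-liner like this, with hypothetical remote and bucket names - file-per-file, no chunking or dedup:)

    # Mirror local data to B2; changed/deleted files are kept in a
    # dated directory rather than lost (remote/bucket names hypothetical).
    rclone sync /home/me/data b2:my-bucket/current \
        --backup-dir b2:my-bucket/old/$(date +%F)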
Does anybody have a recommendation?
I briefly looked at restic and duplicati, but surprisingly neither is as simple to use as I'd expect a dedicated backup tool to be (I don't need, and kinda don't want, a GUI; I'd like all configuration to be stored in a single config file I can just back up to a different location like everything else, and re-create on any new machine). More than that, I've read some scary stories about these tools fucking up their indexes so that data turns out to be non-restorable, which sounds insane, since this is something you must be absolutely sure your backup tool would never do, no matter what - because what's even the point of making backups then?
>I'd like all configuration to be stored in a single config-file I can just back-up to a different location like everything else, and re-create on any new machine
You might want to look into kopia. It accomplishes the same task as restic, but handles configs in a way you might find more appealing. Further reading: https://news.ycombinator.com/item?id=34154052
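For a taste (paths hypothetical): kopia keeps the repository connection in a local config file that you can point at explicitly and carry to a new machine.

    # Create a repository, then connect with an explicit config file
    # that can be backed up and reused elsewhere (paths hypothetical).
    kopia repository create filesystem --path /backups/kopia-repo
    kopia --config-file ~/kopia.config repository connect filesystem --path /backups/kopia-repo
    kopia snapshot create ~/data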
Don't even bother with duplicati. I've tried to make it work so many times, but it's just a buggy mess that always fails. It's a shame too, because I really like the interface.
I've been using bupstash since trying to do backups on an rpi and finding Borg too slow to be usable. Since then I've upgraded to a proper server at home, but kept bupstash as I found it to just work better for the most part.
Keep in mind there's not been much progress since the last release two years ago, and it's still tagged as beta by the author. To be fair, I think he has a higher quality standard than other projects that are not tagged as such.
Whether something is simple or not depends on the use case, I'd say. But I found borg to be great. I'd recommend you check it out and go through the quickstart guide in the documentation. It does de-duplication and encryption; it does a lot more, but you don't have to use those features if you don't need them. I couple it with borgmatic to implement a backup and disaster recovery procedure meant to decrease the risk of data loss. I also use borgbase, and they have a good service, but using something like B2 with this rclone support would be a cheaper alternative if you don't need the extras that borgbase provides.
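To give a flavour, the quickstart boils down to something like this (paths and retention numbers are just examples):

    # One-time: create an encrypted repo.
    borg init --encryption=repokey /backups/borg-repo
    # Each run: create a compressed, deduplicated archive.
    borg create --stats --compression zstd /backups/borg-repo::'{hostname}-{now}' ~/data
    # Thin out old archives to a retention schedule.
    borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 6 /backups/borg-repo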
I've been using it for quite a while now both for my personal projects and paid work and have had a good experience with it.
Borg 2 is still in beta, and Kopia is also out there. Kopia is newer, so I'm testing it for another, redundant backup on the same machine. I have the space, so why not?
Every once in a while I run an integrity check (with data verification) so I can trust that both metadata and data are fine.
I'm very happy with Restic backing up to Backblaze B2.
I have a "config file", which is really just a shell script to set up the environment (repository location, etc), run the desired backups, and execute the appropriate prune command to implement my desired retention schedule.
I've been using this setup for years with great success. I've never had to do a full restore, but my experience restoring individual files and directories has been fine.
Do you have any links related to the index corruption issue? I've never encountered it, but obviously a sample size of one isn't very useful.
Writing an rclone backend for borg is something I have wanted to do for a long time.
However, I found that the backends weren't abstracted well enough in v1 to make that easy.
For v2, though, Thomas Waldmann has made a nicely abstracted interface, and the rclone code ended up being only <300 lines of Python, which took only an afternoon or two to write.
Oh, very interesting. This has been a requested feature for a while, especially with the rise in popularity and decreased cost of object storage.
Borg working with object storage was not supported, though some people did use it that way. From my understanding, most would duplicate a repo and upload the copy instead of having borg directly write to or manipulate it. This could be problematic: if the original repo was corrupt, the corruption would be duplicated too. So this will make things much easier and allow for a more streamlined workflow. Having the tool support rclone instead of specific services seems like a wise and more future-proof choice to me.
Does anyone have an up-to-date comparison of Borg vs Restic? Or a compelling reason to switch from Restic to Borg?
I've previously used Borg, but the inability to use anything other than local files or ssh as a backend became a problem for me. I switched to Restic around the time it gained compression support. So for my use-case of backing up various servers to an S3-compatible storage provider, Restic and Borg now seem to be equivalent.
Obviously I don't want to fix what isn't broken, but I'd also like to know what I'm missing out on by using Restic instead of Borg.
I prefer restic simply because I find it easier to understand and use. This means backups actually happen. It also feels less like it is constantly changing; a constant stream of new features isn't something I've ever desired in a backup solution.
Comparisons might be interesting, but one needs to be aware that they would be a bit apples-to-oranges:
- unreleased code that is still in heavy development (borg2, especially the new repository code inside borg2), versus
- released code (restic) whose "cloud support" has been proven in practice for quite a while.
borg2 is using rclone for the cloud backend, so that part is at least quite proven, but the layers above it in borg2 are all quite fresh and not much optimized/debugged yet.
If you're looking for cheap online storage for your backups, know this: a Microsoft 365 Single subscription comes with 1 TB of OneDrive space (Family subscriptions come with 1 TB per person).
I've been using it with restic + rclone successfully for years. It's not very fast, but it works.
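The wiring is just restic's rclone backend pointed at a OneDrive remote (remote and path names hypothetical, with the remote set up beforehand via `rclone config`):

    # 'onedrive' is an rclone remote configured earlier with `rclone config`.
    restic -r rclone:onedrive:backup init
    restic -r rclone:onedrive:backup backup ~/data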
> It is possible to get interactive SSH access, but this access is limited. It is not possible to have interactive access via port 22, but it is possible via port 23. There is no full shell. For example, it is not possible to use pipes or redirects. It is also not possible to execute uploaded scripts.
For personal use, at what point would one recommend using Borg over plain rsync?
I currently use rsync to back up a set of directories on a drive to another drive and to a remote service (rsync.net). It's been working great, but I'm not sure if my use case is just simple enough that this is a good solution, or if I'm missing a big benefit of Borg. I do envy Borg's encryption, but the complexity of a new tool, paired with the paranoia of me maybe screwing up all my data, has had me on edge a bit about making the leap. I don't have a ton of data to back up - say about 5TB at the moment.
For me, the deduping and compression save a lot of storage. My mail backup (17 backups covering the last 6 months) is 837GB originally, compressed to 312GB, and deduped to 19GB. Same with Postgres: 25GB to 7GB to 900MB.
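Those numbers come straight from borg itself; `borg info` reports the original, compressed, and deduplicated sizes (paths/names hypothetical):

    # Totals for the whole repository:
    borg info /backups/borg-repo
    # Or for a single archive:
    borg info /backups/borg-repo::mail-2024-06-01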
You could probably use rsync's hard linking to save space on the mail backup but I'm not sure you'd get it as small without faffing about.
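The hard-link trick is rsync's --link-dest: unchanged files in the new snapshot become hard links into the previous snapshot, so each snapshot looks complete but only changed files consume space (paths hypothetical):

    # New dated snapshot; unchanged files are hard-linked from yesterday's.
    rsync -a --delete --link-dest=/backups/2024-06-01/ /data/ /backups/2024-06-02/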
The usual problem: if you delete or corrupt a file and find out two days later, your daily backup is not going to help you. Having more than one snapshot is very valuable.
Rsync backups can be set up to deal with this. I have rsync set up with daily incremental backups: the main sync goes to a 'current' folder, and the old versions of changed files stay in a weekday-named folder (e.g. Monday). So I have a rotating 7-day window to recover files. On top of that, I keep a monthly long-term backup of the last old version of each month, which provides an arbitrarily long monthly window to recover from. Rsync is very versatile.
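That scheme maps onto rsync's --backup/--backup-dir options, roughly (paths hypothetical):

    # Sync into 'current'; previous versions of changed or deleted files
    # are moved into a weekday-named folder, giving a rolling 7-day window.
    rsync -a --delete --backup --backup-dir="/backups/$(date +%A)" /data/ /backups/current/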
With rsync, you’re replicating only the last state.
With borg, you can see all the backups that have been made and roll back to any previous snapshot. This is true of a lot of backup solutions, btw.
Concretely, if you inadvertently delete a file and the deletion gets rsynced, you cannot use the backup to restore that file. With borg you can.
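With borg, the recovery is just a matter of picking an archive from before the deletion (repo, archive, and file names hypothetical):

    # List the snapshots, then pull the file out of an older archive.
    borg list /backups/borg-repo
    borg extract /backups/borg-repo::host-2024-06-01 home/me/important.txt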
[1]: https://github.com/borgmatic-collective/borgmatic
https://vorta.borgbase.com/
Then everything gets backed up to my local server, which in turn syncs out to remote storage. It's great.
Can't wait for Borg 2 to hit stable. The transfer command solves so many problems.
Please tell me you verify your backups now and then?
[1]: https://borgbackup.readthedocs.io/en/stable/usage/check.html
Then one can't call it "set and forget", right?
Maybe try SeedVault?
Useful backup tool comparison: https://github.com/deajan/backup-bench
https://github.com/borgbackup/borgstore/blob/master/src/borg...
Borg 2.0 beta (deduplicating backup program with compression and encryption) - https://news.ycombinator.com/item?id=40990425 - July 2024 (1 comment)
Borgctl – borgbackup without bash scripts - https://news.ycombinator.com/item?id=39289656 - Feb 2024 (1 comment)
BorgBackup: Deduplicating archiver with compression and encryption - https://news.ycombinator.com/item?id=34152369 - Dec 2022 (177 comments)
Emborg – Front-End to Borg Backup - https://news.ycombinator.com/item?id=30035308 - Jan 2022 (2 comments)
Deduplicating Archiver with Compression and Encryption - https://news.ycombinator.com/item?id=27939412 - July 2021 (71 comments)
BorgBackup: Deduplicating Archiver - https://news.ycombinator.com/item?id=21642364 - Nov 2019 (103 comments)
Borg – Deduplicated backup with compression and authenticated encryption - https://news.ycombinator.com/item?id=13149759 - Dec 2016 (1 comment)
BorgBackup (short: Borg) is a deduplicating backup program - https://news.ycombinator.com/item?id=11192209 - Feb 2016 (1 comment)
https://docs.hetzner.com/storage/storage-box/access/access-s...
http://www.taobackup.com/ etc
Rsync is also very slow with lots of files, and doesn't deal with renamed files (it will transfer them again).