I also use restic and do backups to append-only rest-servers in multiple locations.
I also back up multiple hosts to the same repository, which actually results in insane storage space savings. One thing I'm missing, though, is being able to specify multiple repositories for one snapshot so that I have consistency across the multiple backup locations. For now the snapshots just get different IDs.
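In the meantime the multi-repo part can be scripted. A minimal sketch (repo URLs and paths here are made up; restic has no built-in shared-snapshot support, so each repo still gets its own snapshot ID):

```python
import subprocess

# Hypothetical repositories and paths -- replace with your own.
REPOS = [
    "rest:https://backup1.example/repo",
    "rest:https://backup2.example/repo",
]
PATHS = ["/home", "/etc"]

def backup_commands(repos, paths):
    """Build one `restic backup` invocation per repository."""
    return [["restic", "-r", repo, "backup", *paths] for repo in repos]

def run_backups(repos=REPOS, paths=PATHS, dry_run=True):
    """Run the same backup against each repository in sequence.

    dry_run=True just prints the commands instead of executing them,
    so the sketch is safe to run without restic installed."""
    cmds = backup_commands(repos, paths)
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            # each repo needs its own RESTIC_PASSWORD / password file set up
            subprocess.run(cmd, check=True)
    return cmds

if __name__ == "__main__":
    run_backups()
```

The snapshots will still be taken one after another, so they won't be byte-identical in time; true cross-repo consistency would need a filesystem snapshot as the common source.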
I haven't tried that recently (~3 years). Does that work with concurrency, or do you need to ensure only one backup runs at a time? Back when I tried it I got the sense that it wasn't really meant for many machines accessing the repo at once, and decided it was probably worth wasting some space in exchange for potentially more robust backups. Especially for my home use case, where I'm only backing up a couple of machines. But it'd be pretty cool if I could replace my main backup servers (using rsync --inplace and zfs snapshots) with restic and get deduplication.
I once met the Borg author at a conference, pretty chill guy. He said that when people file bugs because of data corruption, it's because his tool found the underlying disk to be broken. Sounds quite reliable although I'm mostly fine with tar...
I used CrashPlan in 2014. Back then, their implementation of Windows's Volume Shadow Copy Service (VSS) was buggy, and I lost data because of that. I doubt my underlying disk was broken.
While saying "hardware issue, not my fault, not my problem" is a valid stance, I'm thinking that if you hear it again and again from your users, maybe you should consider whether you can do more. Verifying the file was written correctly is low-hanging fruit. Other possibilities are running a S.M.A.R.T. check and showing a warning, or adding redundancy to recover from partial failures.
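A rough sketch of that low-hanging fruit: write the file, force it to disk, then read it back and compare digests. Caveat: the read-back may be served from the page cache, so this mostly catches immediate write failures, not later bit rot.

```python
import hashlib
import os

def write_and_verify(path, data):
    """Write data to path, fsync it, then read it back and compare
    SHA-256 digests. Raises OSError if the bytes don't match.

    Illustrative only: a backup tool would also want to drop caches
    or use O_DIRECT to verify the actual on-disk copy."""
    expected = hashlib.sha256(data).hexdigest()
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # push the write through OS buffers
    with open(path, "rb") as f:
        actual = hashlib.sha256(f.read()).hexdigest()
    if actual != expected:
        raise OSError(f"verification failed for {path}")
    return actual
```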
I think the failure mode users/devs are hitting here is bit rot. It's not that the device won't report back the same bytes even if you disable whatever caching is happening; it's that after some amount of time it will report the wrong bytes. Some file systems have "scrubs" they run to automatically find these and sometimes attempt to repair them (ZFS can do this).
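The idea behind a scrub can be sketched in a few lines: record a digest per file once, then periodically re-hash everything and flag mismatches. (ZFS does this at the block level with its own checksums and can repair from redundancy; this is just an illustration of the detection side.)

```python
import hashlib
import os

def build_manifest(root):
    """Record a SHA-256 digest for every file under root."""
    manifest = {}
    for dirpath, _, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                manifest[path] = hashlib.sha256(f.read()).hexdigest()
    return manifest

def scrub(manifest):
    """Re-hash every file in the manifest and return the paths whose
    bytes changed (bit rot, truncation) or that disappeared entirely."""
    bad = []
    for path, digest in manifest.items():
        try:
            with open(path, "rb") as f:
                if hashlib.sha256(f.read()).hexdigest() != digest:
                    bad.append(path)
        except FileNotFoundError:
            bad.append(path)
    return bad
```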
Restic is far better both in terms of usability and packaging (borgmatic is pretty much a requirement for borg's usability). I've used both extensively; you can argue that borg can just be scripted instead and is a lot more versatile, but I had a much better experience with restic in terms of set-it-up-and-forget-it. I'm not scared that restic will break; with borg I was.
Also not sure why this was posted, did a new version release or something?
And that's what I did myself. Organically it grew to ~200 lines, but it sits in the background (created a systemd unit for it, too) and does its job. I also use rclone to store the encrypted backups in an AWS S3 bucket.
I so much forget about it that sometimes I have to remind myself to test it out if it still works (it does).
                Original size   Compressed size   Deduplicated size
All archives:         2.20 TB           1.49 TB            52.97 GB
Depends on what you consider large; I looked at one of the machines (at random), and it backs up about two terabytes of data spread across about a million files. Most of them aren't changing day to day. I ran another backup, and restic rescanned them & created a snapshot in exactly 35 seconds, using ~800 MiB of RAM at peak and about 600 on average.
The files are on HDD, and the machine doesn't have a lot of RAM, looking at high I/O wait times and low CPU load overall, I'm pretty sure the bottleneck is in loading filesystem metadata off disk.
I wouldn't back up billions of files or petabytes of data with either restic or borg; stick to ZFS at that scale.
I don't remember what the initial scan time was (it was many years ago), but it wasn't unreasonable — pretty sure the bottleneck also was in disk I/O.
I've been using the Vorta GUI [0] and Hetzner's Storage Box service for ages and it works great. Has saved me from some headaches.
I switched over from Duplicati a long while back when my laptop's sole HDD failed and Duplicati was giving me 143 year estimates for the restore to complete. This was true whether I aimed to restore the whole drive or just a single file.
Last time I checked the deduplication only works per host when backups are encrypted, which makes sense. Anyway, borg is one of the three backup systems I use, it's alright.
Last time I used restic, a few years ago, it choked on a not-so-large data set with high memory usage. I read Borg doesn't choke like that.
https://vorta.borgbase.com/
Cheap, reliable, and almost trouble-free.
Not affiliated, just a happy user.