I have only two issues with the `muxfs` implementation as it stands:
(1) The largest problem is that it requires stable inodes in order to tie the checksums to the actual files. This means (as the article already states) that you can't copy / move / overwrite any of the underlying files without losing the checksums. (It also rules out accessing one of the mirrors via NFS, FUSE, or anything else that doesn't have stable inodes.)
(2) Based on my reading of the article, it doesn't seem to keep a "log" or "sequence" to identify which of the two mirrors is ahead, or whether they are in sync. After a disconnect / reconnect you need to tell `muxfs` manually which one is "newer" (by running a `sync` before you can mount it).
(I haven't tested it, though, as I'm running Linux, but I'm quite interested, because just last week I thought "why doesn't someone implement a FUSE file system to add checksums and thus prevent bitrot?". `muxfs` also adds mirroring.)
I am also using checksums to detect bit-rot, but to tie them to the files I store them in the files' extended attributes. Thus they do not depend on the inode numbers.
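A minimal sketch of that approach, assuming Linux and a file system that supports user extended attributes. The attribute name `user.checksum.sha256` and the helper names are made up for illustration; they are not taken from muxfs or from any existing tool.

```python
import hashlib
import os

XATTR_NAME = "user.checksum.sha256"  # hypothetical attribute name


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def store_checksum(path):
    """Record the current digest in an extended attribute (Linux only)."""
    os.setxattr(path, XATTR_NAME, file_digest(path).encode("ascii"))


def verify_checksum(path):
    """Return True/False for match/mismatch, or None if no checksum is readable."""
    try:
        stored = os.getxattr(path, XATTR_NAME).decode("ascii")
    except OSError:
        # Attribute missing, or xattrs unsupported on this file system.
        return None
    return stored == file_digest(path)
```

Because the checksum is attached to the file itself, it survives a rename and any copy that preserves extended attributes (for example GNU `cp --preserve=xattr`), regardless of what inode number the file ends up with.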
OpenBSD also supports extended file attributes, so using them should be possible.
Using extended attributes on Linux or FreeBSD requires a few precautions. Various copying/moving/archiving CLI commands and GUI applications still ignore extended attributes, and some file systems do not implement them: for example tmpfs on Linux (which supports only certain kinds of system extended attributes, not user-defined ones) and all but the newest versions of NFS (only NFSv4 in Linux 5.9 or newer supports xattrs). Samba, by contrast, has supported them for decades and maps them correctly between different file systems, e.g. XFS on Linux to UFS on FreeBSD. Copying a file via a file system without xattr support silently loses the file's extended attributes.
Extended file attributes were introduced in 1989, in HPFS for OS/2 version 1.2, and were brought to UNIX by XFS in 1993.
30 years later, it is annoying to see that there are still programs which claim to make file copies or file archives but can lose the extended file attributes without any warning or error.
(1) I believe that stable inode numbers are possible with FUSE but they must be implemented by the FUSE driver. Mirroring over a network is not the intended use case. The mirrors exist to provide data redundancy for use by the muxfs driver. If you need to copy the data to a new location there is the muxfs sync command.
(2) muxfs uses sequence numbers to count the write operations performed on each mirror. Upon failure to mount due to the mirrors being out of sync a report is printed comparing the first mirror with the first non-matching mirror, and this includes their sequence numbers.
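As a rough illustration of how per-mirror sequence numbers can drive such a report, here is a hedged sketch. The `Mirror` type and `compare_mirrors` function are hypothetical; this is not muxfs's on-disk format or its actual mount logic.

```python
from dataclasses import dataclass


@dataclass
class Mirror:
    name: str
    seq: int  # number of write operations applied to this mirror so far


def compare_mirrors(mirrors):
    """Compare the first mirror with the first non-matching one.

    Returns (in_sync, report).
    """
    first = mirrors[0]
    for other in mirrors[1:]:
        if other.seq != first.seq:
            ahead = first if first.seq > other.seq else other
            report = (
                f"{first.name} seq={first.seq}, "
                f"{other.name} seq={other.seq}; {ahead.name} is ahead"
            )
            return False, report
    return True, "all mirrors in sync"
```

Equal sequence numbers alone do not prove that two mirrors hold identical data, so a real implementation would presumably compare checksums as well.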
Nice concept. Having skimmed it, it's worth noting:
> muxfs needs you!
> No filesystem can be considered stable without thorough testing and muxfs is no exception.
> Even if I had tested muxfs enough to call it stable it still would not be responsible to expect you to simply take my word for it. It is for this reason that I do not intend to release a version 1.0 until there are sufficient citations that I can make to positive, third-party evaluations of muxfs.
> This is where you can help.
> I need volunteers to test muxfs, provide feedback, and periodically publish test results.
The need for this much testing of inherently multithreaded code, with so much riding on not destroying user data, suggests that a model checker would be an essential piece. Perhaps someone more knowledgeable can weigh in.
> a filesystem should automatically check and repair data as it is accessed rather than processing the entire filesystem tree upon every check or repair job.
Except this is not sufficient. Flash storage for example is especially susceptible to random bitrot of data over time regardless of whether or not it is ever accessed or even powered on. Ever tried to plug in an old USB stick or SD card only to find out it was totally busted or unreadable? Scanning the entire filesystem and re-checksumming everything is therefore completely necessary.
There is the `muxfs audit` subcommand, though, which does that. I guess the author was trying to say that it shouldn't be the only way to do it, as that opens the door to silently returning corrupted data in between runs, so you start pondering how short the interval between audit runs should be, etc. I guess with automatic checks on every access you can feel safe running the audit every other month or so.
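To make the distinction concrete, a full audit amounts to re-reading every file and comparing it against a previously recorded checksum, whether or not anyone has accessed it recently. The sketch below assumes a simple in-memory manifest of SHA-256 digests; the function names are illustrative and this is not how `muxfs audit` is implemented.

```python
import hashlib
import os


def sha256_file(path):
    """Return the SHA-256 hex digest of one file."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its digest."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = sha256_file(full)
    return manifest


def audit(root, manifest):
    """Re-read every file and return the sorted paths that changed or vanished."""
    current = build_manifest(root)
    return sorted(p for p, digest in manifest.items() if current.get(p) != digest)
```

A check on access only covers the files that actually get read; a periodic walk like this is what catches corruption in cold data.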
> I decided it was finally time to build a file server to centralize my files and guard them against bit-rot. Although I would have preferred to use OpenBSD due to its straightforward configuration and sane defaults, I was surprised to find that none of the typical NAS filesystems were supported.
Theo is, perhaps rightfully so, against importing what is effectively a paravirtualized Solaris kernel into the OpenBSD source code in order to run a file system.
Because ZFS is not supported on OpenBSD. In fact he mentions at the beginning of the article that he was surprised to find that none of the typical NAS file systems are supported by OpenBSD.
On the other side, ZFS is an overly complicated behemoth, that wants direct access to the block device. Meanwhile `muxfs` works with any already existing file-system (local or remote) and just provides the checksums. So both serve different use-cases.
>On the other side, ZFS is an overly complicated behemoth, that wants direct access to the block device. Meanwhile `muxfs` works with any already existing file-system (local or remote) and just provides the checksums. So both serve different use-cases.
Na...it's not overly complicated for what it is, but yes it is a behemoth.
>that wants direct access to the block device.
Yes, for high-performance "enterprise" setups it is preferable, but absolutely not needed.
> Meanwhile `muxfs` works with any already existing file-system (local or remote) and just provides the checksums.
That, I think, is the winning point here: just add bit-rot protection to FFS.
Other than lack of support for NAS file systems in OpenBSD, is there a reason not to use ZFS (in favor of another file system providing similar features)?
The muxfs source code is a lot smaller than that of ZFS, so if security is a concern to you then you might find muxfs easier to audit than ZFS. It also compiles quickly, so it could be a good match for a source-based system. That said, I never aimed to "beat" ZFS.
I appreciate how clear you are about the chosen trade-offs and I am also quite impressed by the extent to which the install instructions include "and now here's how you check that worked" commands - I'm sure somebody reading this will think "well, yes, obviously you should include those" but it's not as common as I might like and your version thereof is notably thorough.
One question: do you plan to implement "concatenation" of filesystems, so you can build one very large muxfs system (tens of terabytes)?
I don't plan to add "concatenation" as you have described; however, this can, in theory, be approximated by layering muxfs on top of multiple RAID0s.
> I decided it was finally time to build a file server to centralize my files and guard them against bit-rot. Although I would have preferred to use OpenBSD due to its straightforward configuration and sane defaults, I was surprised to find that none of the typical NAS filesystems were supported.
OpenBSD does not support ZFS.
You can set up a ZFS pool backed by files[1]. Probably not something you should do with data you really care about, but it's possible.
[1]: https://linux.die.net/man/8/zpool (Virtual Devices)