Muxfs – a mirroring, checksumming, and self-healing filesystem layer for OpenBSD

I have only two issues with the `muxfs` implementation as it stands:

(1) The largest problem is that it requires stable inodes in order to tie the checksums to the actual files. This means (as the article already states) that you can't copy / move / overwrite any of the underlying files without losing the checksums. (It also rules out accessing one of the mirrors via NFS, FUSE, or anything else that doesn't have stable inodes.)

(2) Based on my reading of the article, it doesn't seem to keep a "log" or "sequence" to identify which of the two mirrors is ahead, or whether they are in sync. After a disconnect / reconnect you need to manually tell `muxfs` which is the "newer" one (by running a `sync` before being able to mount it).
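
To make point (1) concrete, here is a minimal Python sketch of why a checksum keyed by inode number does not follow a file when it is copied. The `checksums_by_inode` table and the file names are hypothetical and only illustrate the idea; this is not how `muxfs` actually stores its metadata.

```python
import os
import shutil
import tempfile


def inode_after_copy(directory):
    """Create a file, copy it, and return the inode numbers of both files."""
    original = os.path.join(directory, "data.bin")
    duplicate = os.path.join(directory, "data-copy.bin")
    with open(original, "wb") as f:
        f.write(b"payload")
    # The copy has the same bytes but is a brand-new file with its own inode.
    shutil.copy2(original, duplicate)
    return os.stat(original).st_ino, os.stat(duplicate).st_ino


with tempfile.TemporaryDirectory() as tmp:
    ino_original, ino_copy = inode_after_copy(tmp)
    # Hypothetical checksum table keyed by inode number (illustration only).
    checksums_by_inode = {ino_original: "checksum-of-payload"}
    # The copy's inode is not in the table, so its checksum is "lost".
    print(ino_copy in checksums_by_inode)
```

On an ordinary local file system the lookup for the copy fails, which is the behaviour described above: the data survives the copy, but the association with its checksum does not.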

(I haven't tested it though, I'm running Linux, but I'm quite interested because just last week I thought "why doesn't one implement a FUSE file-system to add checksums and thus prevent bitrot". `muxfs` also adds mirroring.)

adrian_b · 4 years ago

I am also using checksums to detect bit-rot, but to tie them to the files I store them in the files' extended attributes. Thus they do not depend on the inode numbers.
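
A rough sketch of this approach in Python, assuming Linux (where `os.setxattr` and `os.getxattr` are available) and a file system that accepts user extended attributes. The attribute name `user.checksum.sha256` and the choice of SHA-256 are assumptions made for the example, not details given in the comment.

```python
import hashlib
import os

# Hypothetical attribute name; any name in the "user." namespace would do.
XATTR_NAME = "user.checksum.sha256"


def file_digest(path):
    """Return the SHA-256 hex digest of the file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def store_checksum(path):
    """Record the current digest in an extended attribute of the file."""
    os.setxattr(path, XATTR_NAME, file_digest(path).encode("ascii"))


def verify_checksum(path):
    """Return True if the stored digest still matches the file's contents."""
    stored = os.getxattr(path, XATTR_NAME).decode("ascii")
    return stored == file_digest(path)
```

Because the checksum travels with the file rather than with its inode number, it survives a rename or a copy made by a tool that preserves extended attributes. On a file system without user xattr support, `os.setxattr` raises `OSError`.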

OpenBSD also supports extended file attributes, so using them should be possible.

Using extended attributes on Linux or FreeBSD requires a few precautions. Various copying / moving / archiving CLI commands and GUI applications still ignore extended attributes, and some file systems do not implement them at all: e.g. tmpfs on Linux (which supports only certain kinds of system extended attributes, not user-defined ones) and all but the newest versions of NFS (only NFSv4 in Linux 5.9 or newer supports xattr, unlike Samba, which has supported them for decades and maps them correctly between different file systems, e.g. XFS on Linux to UFS on FreeBSD). Copying a file via those file systems silently loses its extended attributes.

Extended file attributes were introduced in 1989, in HPFS for OS/2 version 1.2, and they were brought to UNIX in XFS, in 1993.

30 years later, it is annoying to see that there are still some programs which pretend to make file copies or file archives, but which can lose the extended file attributes, without any warnings or errors.

sdadams · 4 years ago

OpenBSD removed support for extended attributes.

sdadams · 4 years ago

(1) I believe that stable inode numbers are possible with FUSE but they must be implemented by the FUSE driver. Mirroring over a network is not the intended use case. The mirrors exist to provide data redundancy for use by the muxfs driver. If you need to copy the data to a new location there is the muxfs sync command.

(2) muxfs uses sequence numbers to count the write operations performed on each mirror. If mounting fails because the mirrors are out of sync, a report is printed comparing the first mirror with the first non-matching mirror, and this includes their sequence numbers.
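
The comparison described in (2) can be pictured with a small Python sketch. The `Mirror` class, the `first_mismatch` function, the mount paths, and the numbers are all hypothetical and model only the sequence-number part of the check; the real muxfs on-disk format and report are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Mirror:
    path: str
    sequence: int  # count of write operations applied to this mirror


def first_mismatch(mirrors):
    """Return (first mirror, first mirror whose sequence differs), or None."""
    first = mirrors[0]
    for other in mirrors[1:]:
        if other.sequence != first.sequence:
            return first, other
    return None


# Hypothetical mirrors: the third one missed two writes.
mirrors = [Mirror("/mnt/a", 1042), Mirror("/mnt/b", 1042), Mirror("/mnt/c", 1040)]
mismatch = first_mismatch(mirrors)
if mismatch is not None:
    first, other = mismatch
    print(f"{first.path}: seq {first.sequence}  vs  {other.path}: seq {other.sequence}")
```

In this model the mirror with the higher sequence number has seen more writes, which gives the operator a hint about which one to treat as the source for a sync.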

I can't find "zfs" mentioned once in this guy's doc, so my first question is... why not?

laumars · 4 years ago

Same reason he doesn’t mention Btrfs and a bunch of other file systems: because he’s running OpenBSD, which doesn’t support ZFS.

danielparks · 4 years ago

From the page:

> I decided it was finally time to build a file server to centralize my files and guard them against bit-rot. Although I would have preferred to use OpenBSD due to its straightforward configuration and sane defaults, I was surprised to find that none of the typical NAS filesystems were supported.

OpenBSD does not support ZFS.

hestefisk · 4 years ago

Theo is, perhaps rightfully so, against importing what is effectively a paravirtualized Solaris kernel into the OpenBSD source code in order to run a file system.

ciprian_craciun · 4 years ago

Because ZFS is not supported on OpenBSD. In fact he mentions at the beginning of the article that he was surprised that none of the NAS-related file-systems are supported by OpenBSD.

On the other side, ZFS is an overly complicated behemoth, that wants direct access to the block device. Meanwhile `muxfs` works with any already existing file-system (local or remote) and just provides the checksums. So both serve different use-cases.

swinglock · 4 years ago

Non-ZFS filesystems are overly simplified, ignoring the problems they ought to be solving.

nix23 · 4 years ago

>On the other side, ZFS is an overly complicated behemoth, that wants direct access to the block device. Meanwhile `muxfs` works with any already existing file-system (local or remote) and just provides the checksums. So both serve different use-cases.

Na...it's not overly complicated for what it is, but yes it is a behemoth.

>that wants direct access to the block device.

Yes, for high-performance "enterprise" setups it is preferable, but absolutely not needed.

> Meanwhile `muxfs` works with any already existing file-system (local or remote) and just provides the checksums.

That, I think, is the winning point here: just add bit-rot protection to FFS.

magicalhippo · 4 years ago

> that wants direct access to the block device

You can set up a ZFS pool backed by files[1]. Probably not something you should do with data you really care about, but it's possible.

[1]: https://linux.die.net/man/8/zpool (Virtual Devices)

defrost · 4 years ago

Nice concept, and having skimmed it, it's worth noting:

> muxfs needs you!

> No filesystem can be considered stable without thorough testing and muxfs is no exception.

> Even if I had tested muxfs enough to call it stable it still would not be responsible to expect you to simply take my word for it. It is for this reason that I do not intend to release a version 1.0 until there are sufficient citations that I can make to positive, third-party evaluations of muxfs.

> This is where you can help.

> I need volunteers to test muxfs, provide feedback, and periodically publish test results.

zasdffaa · 4 years ago

The requirement for extensive testing of inherently multithreaded code, with a lot riding on not destroying user data, suggests a model checker would be an essential piece; perhaps someone more knowledgeable can opine.

OpenBSD's FUSE does not implement multithreading.

Author here. A big thank you to you all for your interest in muxfs! I will try to answer all of your questions as best I can.

j_not_j · 4 years ago

The concept sounds good, but you also need people to review the FUSE implementation as well as muxfs. And test, and so on.

One question: do you plan to implement "concatenation" of filesystems, so you can build one very large muxfs system (tens of terabytes)?

The OpenBSD FUSE implementation is in base so it should already be well audited.

I don't plan to add "concatenation" as you have described; however, this can, in theory, be approximated by layering muxfs on top of multiple RAID0s.

ranger_danger · 4 years ago

> a filesystem should automatically check and repair data as it is accessed rather than processing the entire filesystem tree upon every check or repair job.

Except this is not sufficient. Flash storage for example is especially susceptible to random bitrot of data over time regardless of whether or not it is ever accessed or even powered on. Ever tried to plug in an old USB stick or SD card only to find out it was totally busted or unreadable? Scanning the entire filesystem and re-checksumming everything is therefore completely necessary.

iforgotpassword · 4 years ago

There is the `muxfs audit` subcommand, though, which does that. I guess the author was trying to say that it shouldn't be the only way to do it, as that opens the door to silently returning corrupted data in between runs, so you start pondering how low you should set the interval between audit runs etc. I guess with automatic checks on every access you can feel safe running the audit every other month or so.

aborsy · 4 years ago

Other than lack of support for NAS file systems in OpenBSD, is there a reason not to use ZFS (in favor of another file system providing similar features)?

The muxfs source code is a lot smaller than that of ZFS, so if security is a concern to you then you might find muxfs easier to audit than ZFS. It also compiles quickly, so it could be a good match for a source-based system. That said, I never aimed to "beat" ZFS.

mst · 4 years ago

I appreciate how clear you are about the chosen trade-offs and I am also quite impressed by the extent to which the install instructions include "and now here's how you check that worked" commands - I'm sure somebody reading this will think "well, yes, obviously you should include those" but it's not as common as I might like and your version thereof is notably thorough.

tbe · 4 years ago

Would muxfs be easily portable to other operating systems, or does it rely on non-standard APIs?

pmarreck · 4 years ago

Do you use forward error correction? If so, which algorithm?

No error correcting codes currently. Just checksums and redundant copies. Currently the supported checksum algorithms are crc32, md5, and sha1.
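
As an illustration of "checksums and redundant copies" without error-correcting codes, here is a minimal Python sketch. The `CHECKSUMS` table and `read_with_repair` are hypothetical names, and in-memory byte strings stand in for the mirrored files; it shows the general technique, not the muxfs implementation.

```python
import hashlib
import zlib

# The three algorithms named above, via the Python standard library.
CHECKSUMS = {
    "crc32": lambda data: format(zlib.crc32(data) & 0xFFFFFFFF, "08x"),
    "md5": lambda data: hashlib.md5(data).hexdigest(),
    "sha1": lambda data: hashlib.sha1(data).hexdigest(),
}


def read_with_repair(copies, expected, algorithm="sha1"):
    """Return data matching the stored checksum and overwrite bad copies.

    copies: mutable list of bytes objects, one per mirror (stand-ins for files).
    expected: checksum recorded when the data was last written.
    """
    digest = CHECKSUMS[algorithm]
    good = next((c for c in copies if digest(c) == expected), None)
    if good is None:
        # Without error-correcting codes, nothing can be reconstructed.
        raise IOError("no copy matches the stored checksum")
    for i, c in enumerate(copies):
        if digest(c) != expected:
            copies[i] = good  # heal the corrupted copy from a good one
    return good


stored = CHECKSUMS["sha1"](b"hello")
mirrors = [b"hellp", b"hello"]  # the first copy has suffered bit-rot
data = read_with_repair(mirrors, stored)
```

The checksum only detects corruption; recovery depends entirely on at least one mirror still holding an intact copy, which is the trade-off compared with forward error correction.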