Accelerating ZFS with Copy Offloading: BRT

Very interesting feature! Can anyone think of a real-world use case for this?

anotherhue · a year ago

Instantiating VMs from snapshots I assume.

allanjude · a year ago

There are a few different use cases, but cloning a VM image file is definitely a popular one.

Also, `mv` between different filesystems in the same ZFS pool. Traditionally, when a move crosses a filesystem boundary and `rename()` can't be used, `mv` falls back to effectively `cp` then `rm`, so it temporarily requires 2x the space, and that space might not be freed for a long time if you have snapshots.

With BRT, the copy to the 2nd filesystem doesn't need to write anything more than a bit of metadata, and then when you remove the source copy, it actually removes the BRT entry, so there is no long-term overhead.
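A rough sketch of that `mv` fallback, in Python rather than the real `mv` source. The names `move_file` and `copy_contents`, and the `rename` parameter (there only so the fallback can be exercised on a single filesystem), are made up for illustration. Whether the kernel-side copy actually becomes a block clone depends on the OS, the filesystem, and the pool having block cloning enabled; this code only asks the kernel to do the copy.

```python
import errno
import os
import shutil


def copy_contents(src, dst):
    """Copy src to dst, letting the kernel do the copy where it can."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        remaining = os.fstat(fin.fileno()).st_size
        try:
            while remaining > 0:
                # No offsets passed, so both file positions advance as we go.
                n = os.copy_file_range(fin.fileno(), fout.fileno(), remaining)
                if n == 0:
                    break  # source shrank while we were copying
                remaining -= n
        except (AttributeError, OSError):
            # Syscall missing or refused: redo it as a plain read/write copy.
            fin.seek(0)
            fout.seek(0)
            fout.truncate()
            shutil.copyfileobj(fin, fout)


def move_file(src, dst, rename=os.rename):
    """Try rename(); on EXDEV fall back to copy + unlink, like mv does."""
    try:
        rename(src, dst)
        return "renamed"
    except OSError as err:
        if err.errno != errno.EXDEV:
            raise
    # Crossing a filesystem boundary. A real mv would also preserve
    # ownership, mode and timestamps; that is left out of this sketch.
    copy_contents(src, dst)
    os.unlink(src)
    return "copied"
```

Without cloning, the data exists twice between `copy_contents()` and `os.unlink()`. With cloning, the copy step is only metadata, which is where the space saving comes from.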

One of the original developer's use cases was restoring a file from a snapshot, without having to copy it and have it take up additional space.

So you make a file (foo) 2 days ago. You change it each day. Today, the change you made was bad, and you want to restore the version from yesterday.

Before BRT: you copied the file from the snapshot back to the live filesystem, and the copy took up entirely new space.

After BRT: we reference the existing blocks in the snapshot, so the copy to the live filesystem takes no additional space on disk. A small BRT entry is maintained in memory (and on disk).

If you remove the snapshot, the BRT entry is removed, and the file remains intact. No long term overhead.

Crontab · a year ago

This reminds me of APFS's Clone files:

https://eclecticlight.co/2024/03/20/apfs-files-and-clones/

It is, except it lets you do it at a sub-file level. You can clone a (block-aligned) byte range of a file using the `copy_file_range()` syscall.
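For anyone who wants to try the byte-range form, here is a minimal sketch using Python's `os.copy_file_range()` wrapper. The helper name `copy_range` is made up, and `BLOCK` assumes the default 128 KiB ZFS recordsize; check your dataset's actual value. The call only asks the kernel to copy the range. Whether that ends up as a clone or an ordinary copy is up to the filesystem.

```python
import os

BLOCK = 128 * 1024  # assumption: default ZFS recordsize of 128 KiB


def copy_range(src_path, dst_path, src_off, dst_off, length):
    """Copy `length` bytes from src_off in src to dst_off in dst.

    Returns the number of bytes actually copied.
    """
    src_fd = os.open(src_path, os.O_RDONLY)
    dst_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT, 0o644)
    copied = 0
    try:
        while copied < length:
            want = length - copied
            try:
                n = os.copy_file_range(
                    src_fd, dst_fd, want,
                    offset_src=src_off + copied,
                    offset_dst=dst_off + copied,
                )
            except (AttributeError, OSError):
                # Syscall unavailable or refused: ordinary pread/pwrite.
                data = os.pread(src_fd, min(want, 1 << 20), src_off + copied)
                n = os.pwrite(dst_fd, data, dst_off + copied) if data else 0
            if n == 0:
                break  # reached end of source file
            copied += n
    finally:
        os.close(src_fd)
        os.close(dst_fd)
    return copied
```

For example, `copy_range("disk.img", "new.img", BLOCK, 0, BLOCK)` would copy the second 128 KiB block of a hypothetical `disk.img` to the start of `new.img`.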

awiesenhofer · a year ago

Oh wow, this will be a gamechanger on rsnapshot, restic, etc. Basically every backup workload - can't wait!

cb5r · a year ago

JackSlateur · a year ago

tldr: zfs implemented reflinking (in their own way, not through VFS, sadly)

There are advantages to doing the cloning at the block level, rather than at the VFS layer. The feature was originally written for FreeBSD using the `copy_file_range()` syscall, then extended to work with the existing Linux interfaces that originated in btrfs.

burnte · a year ago

New account, no comments, two submissions to klarasystems.com. Seems like a marketing account.

PaulCarrack · a year ago

Perhaps, but still an informative article. Historically, Rob Norris and Klara Systems have contributed a good deal of features and needed bug fixes to OpenZFS.

I have no doubt, I was just pointing out the account seems fishy.