> The REG algorithm excels with its fast local update speeds and eliminates concerns about tombstone collection in CRDTs. For instance, if an operation has been synchronized across all endpoints, no new operations will occur concurrently with it, allowing it to be safely removed from the history.
If you remove these ops from history, does that remove the ability to time travel (per the home page "An antidote to regret, enabling historical edits traversal") or merge branches? How can we be sure an operation is synchronized?
If dropping these ops is necessary for speed/storage optimization but disables time-travel, is it possible to put the removed historical/tombstone ops into a "cold storage" that's optional and only loaded for time-travel use?
Hi! I invented replayable event graphs. I'm writing a paper at the moment about it, which hopefully should be out in a month or so. Send me a private email and I can mail you the current draft if you like.
> If you remove these ops from history, does that remove the ability to time travel
Yes it does. You also need the ops from history to be able to merge changes. You can only merge changes so long as you have the operations going back to the point at which the fork happened.
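To make that concrete, here is a rough TypeScript sketch (the types and names are made up for illustration; this is not diamond types' or Loro's actual API). The ops you must replay for a merge are exactly the ones reachable from the other branch's frontier that you haven't seen yet, so if any of them were pruned the merge can't be computed:

```ts
type OpId = string;

interface Op {
  id: OpId;
  parents: OpId[]; // causal dependencies in the event graph
}

// Collect the ops needed to merge `otherFrontier` into a branch that has
// already seen `oursSeen`. If any required op was pruned from the log,
// the merge is impossible.
function opsNeededForMerge(
  log: Map<OpId, Op>,
  oursSeen: Set<OpId>,
  otherFrontier: OpId[]
): Op[] {
  const needed: Op[] = [];
  const visited = new Set<OpId>();
  const stack = [...otherFrontier];
  while (stack.length > 0) {
    const id = stack.pop()!;
    if (visited.has(id) || oursSeen.has(id)) continue;
    visited.add(id);
    const op = log.get(id);
    if (!op) throw new Error(`op ${id} was pruned; cannot merge`);
    needed.push(op);
    stack.push(...op.parents);
  }
  return needed;
}
```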
> is it possible to put the removed historical/tombstone ops into a "cold storage" that's optional and only loaded for time-travel use?
Absolutely. And this is very practically useful. For example, you could have a web page which loads the current state of a document (just a string; unlike CRDTs, it needs no additional metadata!). Then if some merge happens while you have the document open, the browser could just fetch the operations from the server back as far as it needs to be able to merge. But in normal operation, none of the historical operations need to be loaded at all.
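As a sketch of that flow (the endpoint and helper names here are invented, not a real API): the browser keeps only the plain text plus the version it is at, and pulls history on demand when a concurrent change shows up.

```ts
// Client-side sketch: fetch just enough history to bridge from our version to
// the incoming change, then let the library replay everything.
async function applyRemoteChange(
  currentText: string,
  localVersion: string[],                       // frontier the client is at
  remote: { version: string[]; ops: unknown[] } // concurrent change from the server
): Promise<string> {
  const res = await fetch("/history", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ from: localVersion, to: remote.version }),
  });
  const historicalOps: unknown[] = await res.json();

  // Placeholder for whatever the REG/CRDT library provides: replay the fetched
  // history plus the remote ops on top of the current text.
  return mergeWithHistory(currentText, historicalOps, remote.ops);
}

declare function mergeWithHistory(
  text: string,
  history: unknown[],
  incoming: unknown[]
): string;
```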
All this said, with text documents the overhead of just keeping the historical operations is pretty tiny anyway. In my testing using diamond types (same algorithm, different library), storing the entire set of historical operations usually increases the file size by less than 50% compared to just storing the final text string. It's much more efficient on disk than git, and more efficient than other CRDTs like Automerge and Yjs. So I think most of the time it's easier to just keep the history around and not worry about the complexity.
> If you remove these ops from history, does that remove the ability to time travel (per the home page "An antidote to regret, enabling historical edits traversal") or merge branches?
Yes. But squash can be supported.
> How can we be sure an operation is synchronized?
In Loro, we not only record the real-world timestamp efficiently, similar to Git, but also capture the DAG information. This approach ensures that if an operation (op) is particularly old, it will have many other ops depending on it. By utilizing both pieces of information, we can determine the operations that are likely synced across all peers. For peers like servers, it's feasible to preserve all operations. However, we can remove some operations in scenarios such as opening the document online for the first time.
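Roughly, that heuristic could look like this (hypothetical code, not Loro's actual implementation): an op is a pruning candidate when every tracked peer's last-known version already contains it and it is old enough in wall-clock terms.

```ts
interface OpMeta {
  peer: string;      // id of the peer that created the op
  counter: number;   // per-peer sequence number
  timestamp: number; // wall-clock seconds, recorded alongside the op
}

type VersionVector = Record<string, number>; // peer id -> highest counter seen

function prunableOps(
  ops: OpMeta[],
  peerVersions: VersionVector[], // last-known version of every tracked peer
  now: number,
  maxAgeSeconds: number
): OpMeta[] {
  // Covered by every peer we track -> no new op can be concurrent with it.
  const coveredByAll = (op: OpMeta) =>
    peerVersions.every((vv) => (vv[op.peer] ?? -1) >= op.counter);

  return ops.filter(
    (op) => coveredByAll(op) && now - op.timestamp > maxAgeSeconds
  );
}
```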
> If dropping these ops is necessary for speed/storage optimization but disables time-travel, is it possible to put the removed historical/tombstone ops into a "cold storage" that's optional and only loaded for time-travel use?
Yes. This is not supported at the moment, but we hope to implement it before version 1.0.
> if an operation has been synchronized across all endpoints, no new operations will occur concurrently with it, allowing it to be safely removed from the history.
This assumes that the set of endpoints (really, nodes) is both well-known by all other nodes in the network, and stable over time (meaning new nodes will never be added).
Even if this assumption can be made safely (which is not a given) the GC process described here is still an optimization, which would be subverted when even a single node in the network became slow or broken.
It's also basically orthogonal to the concept of "tombstones", which are still required if you want to delete anything from the data structure.
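To illustrate that last point with a generic sequence-CRDT sketch (not any particular library's internals): a delete only marks the item, because concurrent operations may still need to anchor to it.

```ts
interface SeqItem {
  id: string;       // stable id, e.g. "peerA:17"
  value: string;
  deleted: boolean; // tombstone flag; the item is never physically removed
}

// Deleting hides the item but keeps it around, so a concurrent insert that
// names this item as its neighbour can still be placed correctly.
function applyDelete(items: SeqItem[], targetId: string): void {
  const item = items.find((it) => it.id === targetId);
  if (item) item.deleted = true;
}

function render(items: SeqItem[]): string {
  return items.filter((it) => !it.deleted).map((it) => it.value).join("");
}
```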
Similar to OT, in certain scenarios, it's sufficient to ensure that only a subset of peers have the complete data, while others don't need the full history. For instance, in real-time collaboration scenarios with a central server, we can, just like OT, allow clients to hold only a shallow clone instead of the complete history. This approach results in minimal overhead for the clients.
It's great work, improving on Peritext using Joseph's latest CRDT work. Much-needed literature in the "applying CRDTs to rich text" space.
But I'm surprised that this one, too, hasn't focused on rich-text block elements (like lists, tables & sections) as much as it has on text attributes (like bold and italics).
I've gone through the Peritext paper several times, and attempted to implement support for lists/tables/sections myself, but found it to be difficult owing to the limited scope of the original paper. I reached out to Martin Kleppmann about this, and this is what he told me:
"Thanks for your message. I've written up a document on how to extend Patreon with nested block elements such as bullet points. It's not properly published yet, but you can find a draft here: https://martinkl.notion.site/Block-elements-in-rich-text-CRD...
My colleagues are currently in the process of integrating this algorithm into Automerge, and creating bindings to the ProseMirror rich text editor."
Personally, I found the document to be very helpful.
I've recently implemented a rich text editor on top of Automerge and Peritext that supports blocks, inline-blocks (or whatever you call them: images, tables, etc.), unaware that this was in the works. The implementation ended up _very_ close to what's described in the linked post.
Coming from Google Wave, I still think of this problem the way we dealt with it in Wave. In Wave, a document was a sequence of items + annotations. Items were either characters (collectively making up normal text) or embedded child items, like a table, image, nested document, etc. Embedded items were "in the document" just like any other document content, and could be inserted or deleted the same as text.
Then annotations were used for formatting, like bolded regions or links. This is how Peritext and Quill work. There are no "annotation items" (since managing them sounds tedious).
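In TypeScript-ish terms, that model might look something like this (the names are mine, not Wave's actual schema):

```ts
// A document is a flat sequence of items; formatting lives in separate
// annotations that cover ranges of that sequence.
type Item =
  | { kind: "char"; char: string }                                          // ordinary text
  | { kind: "embed"; type: "image" | "table" | "document"; data: unknown }; // embedded child items

interface Annotation {
  start: number;  // index into the item sequence
  end: number;    // exclusive end of the annotated range
  key: string;    // e.g. "bold", "link"
  value: unknown; // e.g. true, or a URL for links
}

interface WaveStyleDocument {
  items: Item[];             // inserted/deleted like text, embeds included
  annotations: Annotation[]; // bolded regions, links, etc.
}
```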
It looks like Loro works slightly differently. I'd like to take my own stab at rich text in diamond types soon, since I need it for a project. I think there's a cleaner way to do it - though until I write the code, who knows how it'll turn out.
As others have said, one nice thing about this model is that embedded items can themselves be targets of editing events. For example, you could add a map to Google Wave and then users could collaboratively add and remove pins to the map.
By "annotation items" - you mean the special "boundary characters" that Loro CRDT uses to mark formatting boundaries right?
Yes, Google Docs doesn't use such special annotation characters for bold/italic-like formatting. But it does use special boundary characters for marking "comment" boundaries. These characters simplify a few things from a comment perspective (as they shouldn't merge the way formatting does). Zoho Writer uses a similar design for differentiating formatting boundaries vs comment boundaries.
> I'd like to take my own stab at rich text in diamond types soon, since I need it for a project. I think there's a cleaner way to do it - though until I write the code, who knows how it'll turn out.
Curious to take a look at your approach. Do write more on your blog Joseph :)
> But I'm surprised why this one too hasn't focussed a lot on rich-text block elements (like lists, tables & sections)
This is where the real strength of CRDTs is, IMO. Tree-like structures turn into DAGs once you have multiple edits happening (two users editing a tree node create children with two parents) and are much more interesting than linear data like a string. It's definitely not the most efficient way to store data, but it's incredibly convenient for reconciling versions of pretty complex UI.
A few years ago I threw together a report-writing app for car crash data that let you drop different graphs, maps, visualizations, table excerpts (eg "select top 5 worst roads") that could all reference each other. You make a report for your hometown, and some other team could fork it to see the results for their area.
Data were all updated live, even if pinned to a specific time, since things could be updated after the fact. That other team could then add in a new section of the report or change the conclusion section, and still get the first team's updates to the shared sections (with an option to revert).
You can do like... most stuff like that, if you want. Slides, tables, drawings, whatever. It's not great for graphs but idk what is good for graphs.
The demo they show uses the Quill rich-text editor, which handles block elements analogously to text attributes: a block's type is determined by the attributes on its trailing newline. E.g., for a block that is part of an ordered list, the newline's format is `{ list: "ordered" }`.
https://quilljs.com/docs/delta/#line-formatting
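For example, a Delta for a two-item ordered list followed by a plain paragraph looks like this, with the block formatting riding on the trailing newlines:

```ts
const delta = {
  ops: [
    { insert: "First item" },
    { insert: "\n", attributes: { list: "ordered" } }, // ordered-list line
    { insert: "Second item" },
    { insert: "\n", attributes: { list: "ordered" } }, // ordered-list line
    { insert: "A plain paragraph" },
    { insert: "\n" },                                  // no attributes: normal block
  ],
};
```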
Looks neat! Would there be a way of intercepting state and making 'snapshots' into a more traditional format, like SQL, or even a JSON file?
It sounds like this defaults to the server storing the whole state in their binary format, ditto the client-side portion of it. Nothing wrong with the format, but this is an early project, and nobody wants their data in something that's potentially unstable, or something that might get corrupted.
We are carefully stabilizing our encoding format and will introduce clear storage-format documentation in version 1.0. I agree that a more transparent format can provide users with a better sense of control, and we will try to create a human-readable format for exporting CRDT data (the kind that includes operation history). As for the application state, Loro already supports direct export in JSON format.
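As a sketch of what such a snapshot flow could look like (placeholder interfaces and method names, not Loro's exact API): keep the binary CRDT export as the source of truth, and write a plain JSON snapshot alongside it for inspection or backup.

```ts
interface CrdtDoc {
  exportBinary(): Uint8Array; // full CRDT data, including operation history
  toPlainJSON(): unknown;     // current application state only
}

interface SnapshotStore {
  putBlob(key: string, data: Uint8Array): Promise<void>;
  putJSON(key: string, data: unknown): Promise<void>;
}

// Persist both forms side by side; the JSON copy is easy to inspect or load
// into SQL/another tool, while the binary export keeps full merge ability.
async function snapshot(doc: CrdtDoc, store: SnapshotStore): Promise<void> {
  const ts = new Date().toISOString();
  await store.putBlob(`doc-${ts}.crdt`, doc.exportBinary());
  await store.putJSON(`doc-${ts}.json`, doc.toPlainJSON());
}
```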
Slightly off-topic - I don't think real-time collaboration is suitable for text-based formats. I believe collaboration similar to working with git is superior:
1. Fork the text
2. Submit proposal
3. Review
4. Merge/Cancel
EDIT: To slightly expand on this - there are many reasons for this intuition - the main one, IMO, is that people like to work on text privately before showing it to others. Also, the mental fear of your text being interrupted by someone else. There might be even more reasons.
What's great about CRDTs is that they're really good at both real-time and asynchronous collaboration, unlike the operational transform system used by Google Docs. Asynchronous collaboration is Peritext's major motivation [1]:
> We interviewed eight people who regularly collaborate professionally on documents such as news articles, and several told us that they found real-time collaboration a stressful experience: they felt performative, self-conscious of others witnessing their messy work-in-progress, or irritated when a collaborator acted on suggestions before the editing pass was complete. When doing creative work, they preferred to have space to ideate and experiment in private, sharing their progress only when they are ready to do so.
> With asynchronous collaboration, this is possible: a user may work in isolation on their own copy of a document for a while, without seeing other users’ real-time updates; sometime later, when they are ready to share their work, they can choose to merge it with their collaborators’ edits. Several such copies may exist side-by-side, and some might never be merged (e.g. if the user changed their mind about a set of edits).
[1]: https://www.inkandswitch.com/peritext/
So they mention my exact concerns and address them, very cool. But their solution in practice doesn't look very different from what we can already achieve with git (apart from seeing your collaborators' changes in real time, which I'm not sure is that substantial), or am I missing something?
At the end of the day, how will this look to the end user?
> a user may work in isolation on their own copy of a document for a while, without seeing other users’ real-time updates; sometime later, when they are ready to share their work, they can choose to merge it with their collaborators’ edits. Several such copies may exist side-by-side, and some might never be merged (e.g. if the user changed their mind about a set of edits).
Again, this sounds almost exactly like what we already achieve using git, so why do we need CRDTs for that?
Asynchronous editing on a centralized platform only infrequently causes actual conflicts in practice, like multiple users editing the same sentence at the same time. Either the editing is actually sequential in time, or independent parts of the document are being edited. With asynchronous decentralized/offline editing, conflicts become more likely.
Hey, sounds like you'd love what we're building @ Ellipsus — https://ellipsus.com
We started from the exact same assumptions and built a text editor that combines the best of both worlds: you can collaborate in real-time on the same piece; or branch off of it, work on your own and then have it reviewed and merged back into the main branch. That and other features that should make writing together a pleasure.
Reach out if you want to give it a spin, we'd appreciate your feedback!
Git is still great for developers, who are comfortable with the command-line and are willing to learn all of Git's concepts.
But let's be honest, Git isn't the most friendly tool for most computer users, many of whom have never touched any command-line interface, let alone Git. And yes, there are Git GUIs, but I find those even more complex.
My personal experience of using Google Docs/Sheets strongly disagrees with you.
> Application error: a client-side exception has occurred (see the browser console for more information).