Hi! Author of Eg-walker & ShareJS here, both of which are referenced by this post.
You write this article like you're disagreeing with me - but I agree completely with what you've said. I've been saying so on HN for years. (Eg, in this comment from 6 years ago[1].)
The way I think about it, the realtime collaborative tools we use today make a lot of sense when everyone is online & editing together. But when users edit content offline, or in long lived branches, you probably want the option to add conflict markers & do manual review when merging. (Especially for code.)
Luckily, algorithms like egwalker have access to all the information they need to do that. We store character-by-character editing traces from all users. And we store when all changes happened (in causal order, like a git DAG). This is far more information than git has. So it should be very possible to build a CRDT which uses this information to detects & mark conflict ranges when branches are merged. Then we can allow users to manually resolve conflicts.
Algorithmically, this is an interesting problem but it should be quite solvable. Just, for some reason, nobody has worked on this yet. So, thanks for writing this post and bringing more attention to this problem!
If anyone is interested in making a unique and valuable contribution to the field, I'd love to see some work on this. Its an important piece thats missing in the CRDT ecosystem - simply because (as far as I know) nobody has tried to solve it yet. At least not for text editing.
> So it should be very possible to build a CRDT which uses this information to detects & mark conflict ranges when branches are merged. Then we can allow users to manually resolve conflicts.
So you end up with “conflict” as part of your data model. Seems an amusing way of achieving the C (“conflict-free”) of CRDT, though perfectly legitimate and probably the only way.
The next fun challenge is when you get conflicts over conflict resolution!
As crabmusket said [1], I think the C should stand for Commutative, as it apparently did once. Commutative Replicated Data Types, conflict management optional.
Hi Joseph! I am sorry, I was not trying to say your work sucks. I was trying to (1) help practitioners understand what they can expect, and (2) motivate problems like the one you mention at the end.
(1) might seem stupid but I think just evaluating these systems is a challenging enough technical problem that many teams will struggle with it. I just think they deserve practical advice—I know we would have appreciated it earlier on.
No need to apologise or change your article or anything. I think it’s great! It’s true that I haven’t written any articles or blog posts about this problem. People absolutely will appreciate more discussion and awareness of this problem, and I’m delighted you’re getting people talking about it.
I’m motivated by wanting the problem solved, not getting the most praise by random people on the internet. If today that means being cast in the role of “the old guard who’s missing something important” then so be it. What fun.
> Algorithmically, this is an interesting problem but it should be quite solvable. Just, for some reason, nobody has worked on this yet. So, thanks for writing this post and bringing more attention to this problem!
I'm skeptical that an algorithmic solution will be possible, but I can see this being handled in a UX layer built on top. For example, a client could detect that there's been a conflict based on the editing traces, and show a conflict resolution dialog that makes a new edit based on the resolution. The tricky part is marking a conflict as resolved. I suspect it could be as simple as adding a field to the crdt, but maybe then it counts as an algorithmic solution?
The author describes the case of overlapping concurrent splices. It is a known funky corner case, yes.
If we speak of editing program code, the rabbit hole is deeper as we ideally expect a valid program as a result of a merge. There was a project at JetBrains trying to solve this problem through AST-based merge. After delving into the rabbit hole much much deeper, the guys decided it is not worth it. This is what I was told.
I would simply argue that the “offline” editing is a people-problem and hence van not be solved using automation. People shall find a way to break/bypass the automation/system.
The only “offline editing” that I allow on human text documents is having people add comments. So not editing, no automated merging.
For “offline editing” that I allow on automation (source code) is GIT which intentionally does not pretend to solve the merge, it just shows revisions. The merge is an action supervised by humans or specialised automation on a “best guess” effort and still needs reviews and testing to verify success.
Yes I agree. But remember: but git will automatically merge concurrent changes in most cases - since most concurrent changes aren’t in conflict. You’re arguing you want to see & review the merged output anyway - which I agree with.
Ideally, I want to be able to replace git with something that is built on CRDTs. When branches have no conflicts, CRDTs already work fine - since you merge, run tests, and push when you’re happy.
But right now CRDTs are too optimistic. If two branches edit the same line, we do a best-effort merge and move on. Instead in this case, the algorithm should explicitly mark the region as “in conflict” and needing human review, just like git does. Then humans can manually review the marked ranges (or, as others have suggested, ask an llm to do so or something.). And once they’re happy with the result, clear the conflicting range markers and push.
I mentioned Loro in another comment which uses EG Walker, do you think they are an example of what you had mentioned? This comment also seems relevant [0].
Regarding your [1], I had a similar idea and I am beginning to think that only something like an LLM or similar can truly merge conflict free because only they are powerful enough to understand users' intent.
Does Loro generate conflict ranges when merging branches, thus allowing manual conflict resolution?
I’ve heard people suggest using LLMs for years for this - but CRDTs work because the same computation on two peers is guaranteed to produce the same result. LLMs can’t guarantee that. You could probably use an llm in leu of a human manually merging conflicts - but in that case we still need the crdt to first generate those conflict ranges to pass to the llm. Essentially an llm could solve the UX problem, but the underlying algorithm still needs this feature first to allow that to be used.
Mechanical merge algorithms can perform better or worse on different kinds of conflicts (the specific example of editing deleted text is just one of many edge cases) but in the end no CRDT can decide if your merged text is what you mean to say.
We go into a bunch more detail in the Upwelling paper about the differences between (what we call) semantic and syntactic conflicts in writing: https://inkandswitch.com/upwelling/
Ultimately, my feeling is that serious collaboration is a document review problem as much as anything else. That said, this is particularly true in journalism and scientific publishing and can be mostly ignored for your meeting notes...
Anyway, if you see this comment, thanks for a nice piece of writing, Alex. Love to see folks wrestling with these problems.
Hi Peter! Thanks so much for the kind words. I hope you noticed that a lot of the article ends up being a motivation for Ink & Switch's work, which we call out directly at the end. I am a big fan! :)
EDIT: Oh, also I meant to link to Upwelling, but forgot what it was called. I settled for a different link instead because it was deadline.
The other dark side of implementations using CRDTs is the infrastructure load. I wrote about this [0] in depth previously, and Supabase wrote an article [1] a couple of years ago about a CRDT extension for Postgres which I'm happy to discover agrees with my empirical findings.
If you're going to use CRDTs, do yourself a favor and either use Redis or similar (though the amount of memory being consumed is painful to think about), or MyRocks [2] (or anything else based on RocksDB / LevelDB). Whatever you do, do _not_ back it with an RDBMS, and especially not Postgres.
PowerSync has an article on using Postgres with Yjs (and perhaps look into Yrs, the Rust implementation, as well as other Rust crates like Loro and Automerge that are much faster) and they use a table in the database that stores the changes, is that what you are doing too [0]?
The observation of this article is spot on! CRDTs are an awesome formal model for distributed data structures, but I was always bothered by the notion that all conflicts must be resolved automatically (hence also the name, conflict-free replicated data type). As the article illustrated, this is a hopeless endeavor. I believe what is needed is a proper structured representation of conflicts, that allows for sharing them and resolving them collaboratively, giving back control to the users and supporting them in the process. One of my favorite papers “Turning Conflicts into Collaboration” [1] makes a compelling argument for this idea.
As part of my ongoing PhD studies, we have developed our own formal model for structured conflict representation, based on lattice theory: “Lazy Merging: From a Potential of Universes to a Universe of Potentials” [2]. Incidentally, it is also a CRDT, but it does not attempt to resolve conflicts automatically. Instead, it represents them within the collaborative documents. Approaching the problem from mathematics allowed us to arrive at a simple conceptual model that can guarantee strong properties, like, e.g., the completeness, minimality, and uniqueness of merges, even after repeated merges of existing conflicts. And merges can be calculates very easily. I always wanted to make a blog post about it, but I never came around to do it.
Thanks for those citations, they look really interesting!
> hence also the name, conflict-free replicated data type
When I was introduced to CRDTs the acronym stood for commutative replicated data types (eg in the paper by Shapiro et al). I prefer this actually, despite it being harder to pronounce.
A conflict is a complicated idea, and while "conflict free" is a technically correct way of describing the result, it can be misleading as evidenced by this post and your comment.
Commutativity is the property that when Bob applies changes in the order [Bob, Alice] and Alice applies changes in the order [Alice, Bob] that they both end up with the same document. It doesn't imply that the document is somehow "free" of "conflicts" in a sense that may be meaningful at a higher level of abstraction.
I would wager that, in general, supporting the notion that several different entities are all the authority over a piece of data simultaneously and without live coordination is not solvable. This is a learned lesson for distributed systems, and is readily apparent in the article when considering distributed editing of documents. Same goes for dual input in flight cabins, parenting, and probably any other disparate example one can think of.
It is solvable, but needs more complicated contextual information that many people would not want to bother entering : "this word i just changed only makes sense if it is apart of this whole sentence, which is not necessarily required for the whole paragraph..."
And calling this "solvable" is a funny thing to think about, since huge portions of the earth think the chaos output of LLMs could be anywhere near deciding the final output of computation at this point in time
"It is solvable" is equivalent to "politics are not necessary", which I can't agree with.
Agreeing on edits requires a shared context, a shared understanding of the goal and requirements of the text being edited, and a shared projection of how readers should understand the text and how they will understand it instead.
The specific contextual information required for automated editing is dependent on the combined situation of all writers, considering their professional, personal and cultural contexts.
Assuming that context can't be made available in a systematized way, the machine will choose an edit that is not guaranteed to match the intentions and expectations of the people involved. Instead, it will just act as adding one more writer to the mix.
heh, I bet no matter what kind of textual explanation you required, I can provide the situation it does not cover.
You say this word is only required if it's a part of this whole sentence? OK, the other edit kept the whole sentence, but changed a single other word in it, which happened to be the subject.
Idiomatic speech and allusion being two important cases. Turning a line into a literary or pop culture reference require pretty close adherence to the exact phrasing. And two edits are unlikely to achieve that.
One challenge is that the algorithms typically used for collaborative text editing (CRDTs and OT) have strict algebraic requirements for what the edit operations do & how they interact. So even if your server is smart enough to process the "Colour" example in a UX-reasonable way, it's very hard to design a corresponding CRDT/OT, for optimistic client-side edits.
You can work around this by not using CRDTs/OT. E.g., the server processes operations in receipt order, applying whatever UX logic it wants; clients use a rebase/prediction strategy to still allow optimistic edits on top (cf. https://doc.replicache.dev/concepts/how-it-works). Doing this for text editing has some challenges, but they're separate from the CRDT/OT challenges discussed here.
I think this happens because the mathematical, or causal, or entropic, notion of conflicts has been conflated with semantic conflicts. In the past I have made the same mistake, though inversely and was adamantly informed that I had no clue what I was talking about :)
Things get way nastier when you start considering trees, e.g. yJS operates on JSON documents. From a UI standpoint (where the UI is showing some shallow level and hasn't been expanded to the deeper level) users could never even see edits that have been deleted.
I think that the class of CRDTs that preserve conflicts (IIRC that is when a register can hold multiple values) hold the most promise. Users should then be presented with those conflicts - and it could even be completely visual. Being able to scrub through history also seems like a viable alternative (allowing the user to figure out how a strange thing happened, or how their changes disappeared).
> Users should then be presented with those conflicts - and it could even be completely visual. Being able to scrub through history also seems like a viable alternative
I think "Git" would be a wonderful name for this type of CRDT.
For all things Git is good at, conflict resolution is most definitely not one of them. There are many ways to slice a conflicting diff into plus and minus parts, and somehow Git usually manages to create conflicts that are the least human readable. We live in an era of widely adopted and universal language servers, yet Git pretends that every change is just a char added or removed. There are many tools out there which do a considerably better job at this, for example by diffing the AST, not the text representation.
I mentioned Loro in another comment, but they actually do conflict resolution on top of Git trees [0]. Jujutsu is also interesting but I'm not sure if they do any conflict resolution [1].
> In 2009, a surprising amount of the discourse was focused on the algorithms git used to automatically merge changes together.
IIRC Torvalds himself was quite pessimistic about what could be achieved with automatic merging. (And in this he was correct.) He said that Git had rejected the idea that a version-control system could, or should attempt to, "solve the merging problem" by finding a sufficiently-smart algorithm which would do the right things automatically.
> Offline editing is a UI/UX problem
True. There are two deeper root causes here:
1) computing's commitment to cargo-culting old solutions and, relatedly,
2) its devotion to the belief "in general 5-lb sacks are nicer to deal with than 10-lb sacks: therefore I should fit my 10 lbs of POL into one 5-lb sack".
The default vision of "text editor" is "Mosaic textarea", "MacWrite" or something in between, so the quest is usually to bolt merging onto something like that with the minimum possible change. Make it a menu item, or a little dialog box with a few options. If there is some kind of GUI support for merging hidden deep in the menus it's a programmer-UI diff-merger horror that barely does the minimum, or a strikethrough-based view which feels treacherous like navigating a ship through a fog. But in fact in text editing with offline collaboration merging, partly-manual merging at that, is a central part of the process and it needs to be central to the design of the editor. Unfortunately MacWrite is a local maximum which isn't easy to get far away from.
For example, often when someone mentions "cargo-culting" and "old solutions", their next words are "stop editing code as text, edit as a syntax tree". But it has the same problem, just replace "character" with "statement"! Bob added a line to "else" branch of "if" statement, Alice deleted the entire statement, along with "else" branch - what is the smart system to do?
I don't think the syntax tree approach is a silver bullet, but it does vastly reduce the kind of annoying things that trip up git merges today. Issues like whitespace conflicts, formatting clashes, reordering of import statements...
Though we seem to have collectively decided it's easier to solve this by forcing everyone to adopt the same auto formatter configs, which, I guess fair enough
[Author here] I am sorry, I think I phrased the automatic merging point confusingly. I was trying to say that when multiple commits change a file, git will attempt to merge them together, but it MUST flag direct conflicts. Sounds like we agree this is the right approach overall though.
You write this article like you're disagreeing with me - but I agree completely with what you've said. I've been saying so on HN for years. (Eg, in this comment from 6 years ago[1].)
The way I think about it, the realtime collaborative tools we use today make a lot of sense when everyone is online & editing together. But when users edit content offline, or in long lived branches, you probably want the option to add conflict markers & do manual review when merging. (Especially for code.)
Luckily, algorithms like egwalker have access to all the information they need to do that. We store character-by-character editing traces from all users. And we store when all changes happened (in causal order, like a git DAG). This is far more information than git has. So it should be very possible to build a CRDT which uses this information to detects & mark conflict ranges when branches are merged. Then we can allow users to manually resolve conflicts.
Algorithmically, this is an interesting problem but it should be quite solvable. Just, for some reason, nobody has worked on this yet. So, thanks for writing this post and bringing more attention to this problem!
If anyone is interested in making a unique and valuable contribution to the field, I'd love to see some work on this. Its an important piece thats missing in the CRDT ecosystem - simply because (as far as I know) nobody has tried to solve it yet. At least not for text editing.
[1] Bottom part of this comment: https://news.ycombinator.com/item?id=19889174
So you end up with “conflict” as part of your data model. Seems an amusing way of achieving the C (“conflict-free”) of CRDT, though perfectly legitimate and probably the only way.
The next fun challenge is when you get conflicts over conflict resolution!
[1] https://news.ycombinator.com/item?id=42349264
(1) might seem stupid but I think just evaluating these systems is a challenging enough technical problem that many teams will struggle with it. I just think they deserve practical advice—I know we would have appreciated it earlier on.
I’m motivated by wanting the problem solved, not getting the most praise by random people on the internet. If today that means being cast in the role of “the old guard who’s missing something important” then so be it. What fun.
I'm skeptical that an algorithmic solution will be possible, but I can see this being handled in a UX layer built on top. For example, a client could detect that there's been a conflict based on the editing traces, and show a conflict resolution dialog that makes a new edit based on the resolution. The tricky part is marking a conflict as resolved. I suspect it could be as simple as adding a field to the crdt, but maybe then it counts as an algorithmic solution?
[1] https://josephg.com/blog/crdts-go-brrr/
> it should be very possible to build a CRDT which uses this information to detects & mark conflict ranges when branches are merged
The only “offline editing” that I allow on human text documents is having people add comments. So not editing, no automated merging.
For “offline editing” that I allow on automation (source code) is GIT which intentionally does not pretend to solve the merge, it just shows revisions. The merge is an action supervised by humans or specialised automation on a “best guess” effort and still needs reviews and testing to verify success.
Ideally, I want to be able to replace git with something that is built on CRDTs. When branches have no conflicts, CRDTs already work fine - since you merge, run tests, and push when you’re happy.
But right now CRDTs are too optimistic. If two branches edit the same line, we do a best-effort merge and move on. Instead in this case, the algorithm should explicitly mark the region as “in conflict” and needing human review, just like git does. Then humans can manually review the marked ranges (or, as others have suggested, ask an llm to do so or something.). And once they’re happy with the result, clear the conflicting range markers and push.
Regarding your [1], I had a similar idea and I am beginning to think that only something like an LLM or similar can truly merge conflict free because only they are powerful enough to understand users' intent.
[0] https://news.ycombinator.com/item?id=42343953#42344880
I’ve heard people suggest using LLMs for years for this - but CRDTs work because the same computation on two peers is guaranteed to produce the same result. LLMs can’t guarantee that. You could probably use an llm in leu of a human manually merging conflicts - but in that case we still need the crdt to first generate those conflict ranges to pass to the llm. Essentially an llm could solve the UX problem, but the underlying algorithm still needs this feature first to allow that to be used.
Dead Comment
We go into a bunch more detail in the Upwelling paper about the differences between (what we call) semantic and syntactic conflicts in writing: https://inkandswitch.com/upwelling/
Ultimately, my feeling is that serious collaboration is a document review problem as much as anything else. That said, this is particularly true in journalism and scientific publishing and can be mostly ignored for your meeting notes...
Anyway, if you see this comment, thanks for a nice piece of writing, Alex. Love to see folks wrestling with these problems.
EDIT: Oh, also I meant to link to Upwelling, but forgot what it was called. I settled for a different link instead because it was deadline.
If you're going to use CRDTs, do yourself a favor and either use Redis or similar (though the amount of memory being consumed is painful to think about), or MyRocks [2] (or anything else based on RocksDB / LevelDB). Whatever you do, do _not_ back it with an RDBMS, and especially not Postgres.
[0]: https://news.ycombinator.com/item?id=40834759
[1]: https://supabase.com/blog/postgres-crdt
[2]: http://myrocks.io
[0] https://www.powersync.com/blog/postgres-and-yjs-crdt-collabo...
As part of my ongoing PhD studies, we have developed our own formal model for structured conflict representation, based on lattice theory: “Lazy Merging: From a Potential of Universes to a Universe of Potentials” [2]. Incidentally, it is also a CRDT, but it does not attempt to resolve conflicts automatically. Instead, it represents them within the collaborative documents. Approaching the problem from mathematics allowed us to arrive at a simple conceptual model that can guarantee strong properties, like, e.g., the completeness, minimality, and uniqueness of merges, even after repeated merges of existing conflicts. And merges can be calculates very easily. I always wanted to make a blog post about it, but I never came around to do it.
[1] https://doi.org/10.1007/s10606-012-9172-4
[2] https://doi.org/10.14279/tuj.eceasst.82.1226
> hence also the name, conflict-free replicated data type
When I was introduced to CRDTs the acronym stood for commutative replicated data types (eg in the paper by Shapiro et al). I prefer this actually, despite it being harder to pronounce.
A conflict is a complicated idea, and while "conflict free" is a technically correct way of describing the result, it can be misleading as evidenced by this post and your comment.
Commutativity is the property that when Bob applies changes in the order [Bob, Alice] and Alice applies changes in the order [Alice, Bob] that they both end up with the same document. It doesn't imply that the document is somehow "free" of "conflicts" in a sense that may be meaningful at a higher level of abstraction.
https://pages.lip6.fr/Marc.Shapiro/papers/RR-7687.pdf: the paper clearly calls it Conflict-free Replicated Data Types, though it does also define the two styles:
• State-based Convergent Replicated Data Type (CvRDT)
• Op-based Commutative Replicated Data Type (CmRDT)
And calling this "solvable" is a funny thing to think about, since huge portions of the earth think the chaos output of LLMs could be anywhere near deciding the final output of computation at this point in time
Agreeing on edits requires a shared context, a shared understanding of the goal and requirements of the text being edited, and a shared projection of how readers should understand the text and how they will understand it instead.
The specific contextual information required for automated editing is dependent on the combined situation of all writers, considering their professional, personal and cultural contexts.
Assuming that context can't be made available in a systematized way, the machine will choose an edit that is not guaranteed to match the intentions and expectations of the people involved. Instead, it will just act as adding one more writer to the mix.
You say this word is only required if it's a part of this whole sentence? OK, the other edit kept the whole sentence, but changed a single other word in it, which happened to be the subject.
1. Consistency
2. Availability
3. Partition tolerance
You can work around this by not using CRDTs/OT. E.g., the server processes operations in receipt order, applying whatever UX logic it wants; clients use a rebase/prediction strategy to still allow optimistic edits on top (cf. https://doc.replicache.dev/concepts/how-it-works). Doing this for text editing has some challenges, but they're separate from the CRDT/OT challenges discussed here.
Things get way nastier when you start considering trees, e.g. yJS operates on JSON documents. From a UI standpoint (where the UI is showing some shallow level and hasn't been expanded to the deeper level) users could never even see edits that have been deleted.
I think that the class of CRDTs that preserve conflicts (IIRC that is when a register can hold multiple values) hold the most promise. Users should then be presented with those conflicts - and it could even be completely visual. Being able to scrub through history also seems like a viable alternative (allowing the user to figure out how a strange thing happened, or how their changes disappeared).
I think "Git" would be a wonderful name for this type of CRDT.
[0] https://loro.dev/blog/v1.0#loro-version-controller
[1] https://github.com/martinvonz/jj
Git is popular, but it's not particularly great at conflict resolution.
IIRC Torvalds himself was quite pessimistic about what could be achieved with automatic merging. (And in this he was correct.) He said that Git had rejected the idea that a version-control system could, or should attempt to, "solve the merging problem" by finding a sufficiently-smart algorithm which would do the right things automatically.
> Offline editing is a UI/UX problem
True. There are two deeper root causes here:
1) computing's commitment to cargo-culting old solutions and, relatedly,
2) its devotion to the belief "in general 5-lb sacks are nicer to deal with than 10-lb sacks: therefore I should fit my 10 lbs of POL into one 5-lb sack".
The default vision of "text editor" is "Mosaic textarea", "MacWrite" or something in between, so the quest is usually to bolt merging onto something like that with the minimum possible change. Make it a menu item, or a little dialog box with a few options. If there is some kind of GUI support for merging hidden deep in the menus it's a programmer-UI diff-merger horror that barely does the minimum, or a strikethrough-based view which feels treacherous like navigating a ship through a fog. But in fact in text editing with offline collaboration merging, partly-manual merging at that, is a central part of the process and it needs to be central to the design of the editor. Unfortunately MacWrite is a local maximum which isn't easy to get far away from.
For example, often when someone mentions "cargo-culting" and "old solutions", their next words are "stop editing code as text, edit as a syntax tree". But it has the same problem, just replace "character" with "statement"! Bob added a line to "else" branch of "if" statement, Alice deleted the entire statement, along with "else" branch - what is the smart system to do?
Though we seem to have collectively decided it's easier to solve this by forcing everyone to adopt the same auto formatter configs, which, I guess fair enough