> This has massive implications. SEC means low latency, because nodes don't need to coordinate to handle reads and writes. It means incredible fault tolerance - every single node in the system bar one could simultaneously crash, and reads and writes could still happen normally. And it means nodes still function properly if they're offline or split from the network for arbitrary time periods.
Well, this all depends on the definition of «function properly». Convergence ensures that everyone observed the same state, not that it’s a useful state. For instance, The Imploding Hashmap is a very easy CRDT to implement. The rule is that when there’s concurrent changes to the same key, the final value becomes null. This gives Strong Eventual Consistency, but isn’t really a very useful data structure. All the data would just disappear!
So yes, CRDT is a massively useful property which we should strive for, but it’s not going to magically solve all the end-user problems.
Yeah; this has been a known thing for at least the 15 years I’ve been working in the collaborative editing space. Strong eventual consistency isn’t enough for a system to be any good. We also need systems to “preserve user intent” - whatever that means.
One simple answer to this problem that works almost all the time is to just have a “conflict” state. If two peers concurrently overwrite the same field with the same value, they can converge by marking the field as having two conflicting values. The next time a read event happens, that’s what the application gets. And the user can decide how the conflict should be resolved.
In live, realtime collaborative editing situations, I think the system just picking something is often fine. The users will see it and fix it if need be. It’s really just when merging long running branches that you can get in hot water. But again, I think a lot of the time, punting to the user is a fine fallback for most applications.
good point. the reality is conflicts should often be handled in the business logic, not in the consensus logic, but not universally. For the former, having the conflict state be the consensus state is ideal, but you do risk polluting your upstream application with a bunch of unnecessary conflict handling for trivial state diffs.
With CRDT, you have local consistency and strong convergence, but no guarantee of semantic convergence (i.e. user intent). I would still hire OP, but I would definitely keep him in the backend and away from UX
So the entire point of the (short) article I wrote was to get people to think outside of the the little box people put CRDTs in: javascript libraries and collaborative editing.
Yet here we are, circling back to collaborative editing...
At this point I think the term "CRDT" has too much baggage and I should probably stop using it, or at least not put it in blog post titles.
I've prototyped something attempting to solve this problem of preserving user intent and maintaining application semantics. See comment here https://news.ycombinator.com/item?id=45180325
He very much leans toward them being hard to use in a sensible way. He has some interesting points about using threshold functions over a CRDT to get deterministic reads (i.e. once you observe the value it doesn't randomly change out from under you). It feels a bit theoretical though, I wish there were examples of using this approach in a practical application.
Why do we even need CRDTs? Why can't we have multi-user editors work like multiplayer video games?
The server has the authoritative state, users submit edits, which are then rejected or applied and the changes pushed to others. The users is always assumed to be online for multiplayer editing. No attempt is made to reconcile independent edits, or long periods of offline behavior.
To prevent data loss, when the user is offline and desyncs, he gets to keep his changes and manually merge them back.
I'm sure this isn't a Google genius worthy implementation and fails in the incredibly realistic scenario where thousands of people are editing the same spreadsheet at the same time, but its simple and fails in predictable ways.
sure, i mean that was how early group editing works, but generally you want to preserve state from both (if we both start typing in the same spot, we both add stuff). Also it prevents any offline editing or high...lag editing really. unlike gaming which needs to be realtime this is much softer.
The big problem with CRDTs IMO is that they make it incredibly easy to break application semantics.
Just a basic example for a task tracker:
* first update sets task cancelled_at and cancellation_reason
* second update wants the task to be in progress, so sets started_at
If code just uses the timestamps to consider the task state, it would not assume the task is cancelled, unexpected since the later user update set it to in progress.
Easy fix, we just add a state field 'PENDING|INPROGRESS|CANCELLED|...'.
Okay, but now you have a task that is in progress, but also has a cancellation timestamp, which seems inconsistent.
The point is:
With CRDTs you have to consider how partial out of order merges affect the state, and make sure your logic is always written in a way so these are handled properly.
That is *not easy*!
I'd love it if someone came up with a framework that allows defining application semantics on top of CRDTs, and have the framework ensure types remain consistent.
Do not separate the state field from its time stamp(s). Use a sum type (“tagged union”) where the time stamps are the payload for a selected state. Make invalid states unrepresentable.
If you want invalid states unrepresentable, and time as a primary key... How do you deal with time regularly becoming non-linear within the realm of computing?
I prototyped exactly such a framework! It's designed to solve exactly the problem you mentioned. It’s a super interesting problem. https://github.com/evelant/synchrotron
The gist is:
* Replicating intentions (actions, immutable function call definitions that advance state) instead of just replicating state.
* Hybrid logical clocks for total ordering.
* Some client side db magic to make action functions deterministic.
This ensures application semantics are always preserved with no special conflict resolution considerations while still having strong eventual consistency. Check out the readme for more info. I haven’t gotten to take it much further beyond an experiment but the approach seems promising.
I've had similar thoughts, but my concern was: if you have idempotent actions, then why not just encode them as actions in a log. Which just brings you to event sourcing, a quite well-known pattern.
If you go that route, then what do you need CRDTs for?
I wonder how does this handle a modify-rename conflict? e.g. there's a file identified by its name `a` and one client renames it to `b` while another client tries to modify the contents of `a`. Once you replay it in this order does the intent of modifying the contents of what was once `a` remain?
I know you can use some unique persistent ids instead of names, but then you get into issues that two clients create two files with the same name: do you allow both or not? What if they initially create it equal? What if they do so but then they modify it to be different?
Don’t you also have to consider this just as much without CRDT? Not saying it isn’t a real issue, but this example could easily be a problem with a more traditional style app - maybe users open the record on their web browser at same time and make different updates, or they update the different timestamp fields directly in a list of tasks.
The big idea behind CRDTs is that a data structures can have replicas synchronizing on a best-effort basis. That is much closer to the physical reality: server here, client there, phones all over the place.
The basic CRDT ideas are actually pretty easy to implement: add some metadata here, keep some history there. The difficulty, for the past 20 years or so, is making the overheads low, and the APIs understandable.
Many projects revolve around some JSON-ish data format that is also a CRDT:
- Automerge https://automerge.org (the most tested one, but feels like legacy at times, the design is ~10yrs old, there are more interesting new ways)
Others are trying to retrofit CRDTs into SQLite or Postgres. IMO, those end up using last-write-wins in most cases. Relational logic steers you that way.
Could you explain more about Automerge being legacy? They recently released a new major version and revamped their algorithm I believe. What is better about your version?
Automerge is based on design decisions from 2014-2017. I remember that epoch and what everybody thought back then. The views have evolved since, but Automerge is a large project, many things have being built on top. It is unrealistic to change any of the basic assumptions at this point.
I may go into the technical details, assuming my hourly rate is respected.
Do people really distinguish "Strong Eventual Consistency" from "Eventual Consistency"? To me, when I say "Eventual Consistency" I alwayes mean "Strong Eventual Consisteny".
(Non-Strong) Eventual Consistency does not guarantee that all replicas converge in a specific time period.
In an eventually consistent system replicas can diverge. A "last write" system can be eventually consistent, but a given point can read differently.
Eg: operations
1) Add "AA" to end of string
2) Split string in middle
Replicas R1 and R2 both have the string "ZZZZ"
If R1 sees operations (1) then (2) it will get
"ZZZZAA", then "ZZZ", "ZAA"
If R2 sees (2) then (1) it will get:
"ZZ", "ZZ", then "ZZAA", "ZZ".
Strong Eventual Consistency doesn't have this problem because the operations have the time vector on them so the replicas know what order to apply them.
I’m not sure I follow. How would this be eventually consistent at all? It looks like the two peers in your example simply have divergent state and will never converge.
You're not describing an eventually consistent system, you're describing a system that diverges. By definition, eventually consistent means that, after some time, all readers across the entire system are guaranteed to find the same values, even if before that time they may see different values.
Any eventually consistent system has to have a strategy for ensuring that all nodes eventually agree on a final value. R1 and R2 need to communicate their respective states, and agree to a single one of them - maybe using timestamps if R2's value is newer, R1 will replace its own value when they communicate), maybe using a quorum (say there is also an R3 which agrees with R1, then R2 will change its value to match the other two), maybe using an explicit priority list (say, R1's value is assumed better than R2's).
I think there are a lot of systems that have a separate node syncing feature, where two nodes can receive updates, apply them to their local replica right away, and only communicate them and reconcile with the other backend nodes at a later time.
Interesting that neither the article nor the comments mention the CALM theorem [0], which gives a framework to explain when coordination-free consistency is possible, and is arguably the big idea behind SEC.
Would this be a suitable ds to distribute node state for caching indices? Let's say two nodes have a set of N (possibly overlapping) keys and I want both to know all keys of each other for request routing (request for n \in N preferably to node with n in local cache).
Well, this all depends on the definition of «function properly». Convergence ensures that everyone observed the same state, not that it’s a useful state. For instance, The Imploding Hashmap is a very easy CRDT to implement. The rule is that when there’s concurrent changes to the same key, the final value becomes null. This gives Strong Eventual Consistency, but isn’t really a very useful data structure. All the data would just disappear!
So yes, CRDT is a massively useful property which we should strive for, but it’s not going to magically solve all the end-user problems.
One simple answer to this problem that works almost all the time is to just have a “conflict” state. If two peers concurrently overwrite the same field with the same value, they can converge by marking the field as having two conflicting values. The next time a read event happens, that’s what the application gets. And the user can decide how the conflict should be resolved.
In live, realtime collaborative editing situations, I think the system just picking something is often fine. The users will see it and fix it if need be. It’s really just when merging long running branches that you can get in hot water. But again, I think a lot of the time, punting to the user is a fine fallback for most applications.
With CRDT, you have local consistency and strong convergence, but no guarantee of semantic convergence (i.e. user intent). I would still hire OP, but I would definitely keep him in the backend and away from UX
Yet here we are, circling back to collaborative editing...
At this point I think the term "CRDT" has too much baggage and I should probably stop using it, or at least not put it in blog post titles.
He very much leans toward them being hard to use in a sensible way. He has some interesting points about using threshold functions over a CRDT to get deterministic reads (i.e. once you observe the value it doesn't randomly change out from under you). It feels a bit theoretical though, I wish there were examples of using this approach in a practical application.
``` fn add(x: num, y: num) = x * y ```
The server has the authoritative state, users submit edits, which are then rejected or applied and the changes pushed to others. The users is always assumed to be online for multiplayer editing. No attempt is made to reconcile independent edits, or long periods of offline behavior.
To prevent data loss, when the user is offline and desyncs, he gets to keep his changes and manually merge them back.
I'm sure this isn't a Google genius worthy implementation and fails in the incredibly realistic scenario where thousands of people are editing the same spreadsheet at the same time, but its simple and fails in predictable ways.
but no you dont need it
Deleted Comment
Just a basic example for a task tracker:
* first update sets task cancelled_at and cancellation_reason
* second update wants the task to be in progress, so sets started_at
If code just uses the timestamps to consider the task state, it would not assume the task is cancelled, unexpected since the later user update set it to in progress.
Easy fix, we just add a state field 'PENDING|INPROGRESS|CANCELLED|...'.
Okay, but now you have a task that is in progress, but also has a cancellation timestamp, which seems inconsistent.
The point is:
With CRDTs you have to consider how partial out of order merges affect the state, and make sure your logic is always written in a way so these are handled properly. That is *not easy*!
I'd love it if someone came up with a framework that allows defining application semantics on top of CRDTs, and have the framework ensure types remain consistent.
The point is that you always have to think about merging behaviour for every piece of state.
The gist is:
* Replicating intentions (actions, immutable function call definitions that advance state) instead of just replicating state.
* Hybrid logical clocks for total ordering.
* Some client side db magic to make action functions deterministic.
This ensures application semantics are always preserved with no special conflict resolution considerations while still having strong eventual consistency. Check out the readme for more info. I haven’t gotten to take it much further beyond an experiment but the approach seems promising.
I've had similar thoughts, but my concern was: if you have idempotent actions, then why not just encode them as actions in a log. Which just brings you to event sourcing, a quite well-known pattern.
If you go that route, then what do you need CRDTs for?
I know you can use some unique persistent ids instead of names, but then you get into issues that two clients create two files with the same name: do you allow both or not? What if they initially create it equal? What if they do so but then they modify it to be different?
Any many CRDT implantations have already solved this for the styled text domain (e.g bold and cursive can be additive but color not etc).
But something user definable would be really useful
The basic CRDT ideas are actually pretty easy to implement: add some metadata here, keep some history there. The difficulty, for the past 20 years or so, is making the overheads low, and the APIs understandable.
Many projects revolve around some JSON-ish data format that is also a CRDT:
- Automerge https://automerge.org (the most tested one, but feels like legacy at times, the design is ~10yrs old, there are more interesting new ways)
- JsonJoy https://jsonjoy.com/
- RDX (mine) https://replicated.wiki/ https://github.com/gritzko/go-rdx/
- Y.js https://yjs.dev/
Others are trying to retrofit CRDTs into SQLite or Postgres. IMO, those end up using last-write-wins in most cases. Relational logic steers you that way.
I may go into the technical details, assuming my hourly rate is respected.
Conflict-free replicated data types (CRDTs) https://en.wikipedia.org/wiki/Conflict-free_replicated_data_...
Do people really distinguish "Strong Eventual Consistency" from "Eventual Consistency"? To me, when I say "Eventual Consistency" I alwayes mean "Strong Eventual Consisteny".
In an eventually consistent system replicas can diverge. A "last write" system can be eventually consistent, but a given point can read differently.
Eg: operations
1) Add "AA" to end of string 2) Split string in middle
Replicas R1 and R2 both have the string "ZZZZ"
If R1 sees operations (1) then (2) it will get "ZZZZAA", then "ZZZ", "ZAA"
If R2 sees (2) then (1) it will get:
"ZZ", "ZZ", then "ZZAA", "ZZ".
Strong Eventual Consistency doesn't have this problem because the operations have the time vector on them so the replicas know what order to apply them.
Any eventually consistent system has to have a strategy for ensuring that all nodes eventually agree on a final value. R1 and R2 need to communicate their respective states, and agree to a single one of them - maybe using timestamps if R2's value is newer, R1 will replace its own value when they communicate), maybe using a quorum (say there is also an R3 which agrees with R1, then R2 will change its value to match the other two), maybe using an explicit priority list (say, R1's value is assumed better than R2's).
[0] https://arxiv.org/abs/1901.01930
https://dsf.berkeley.edu/cs286/papers/crdt-tr2011.pdf
If you ask your cache for a value, it could choose to reply now, with the information that it has - favouring A.
Or it could wait and hope for more accurate information to return to you later, favouring C.
'Cache' seems to imply that it's built for availability purposes.
P.S. I am the author of the project.