I agree broadly with the article’s position but I think locks are more harmful than helpful. When I was a Quip user (2018) it was super frustrating to get locked out of a paragraph because someone’s cursor idled there. Instead just allow LWW overwrites. If users have contention and your sync & presence is fast, they’ll figure it out pretty quick, and at most lose 1-2 keystrokes, or one drag gesture, or one color pick.
Notion is “collaborative” and we don’t use a CRDT for text, it’s all last-write-wins decided by the server. However our LWW texts are individually small - one block/paragraph in size - and adding/moving/removing blocks is intention-preserving if not perfectly convergent.
As the article says, the downside for LWW is that “offline” / async collaboration isn’t so great. That’s why we’re working on switching to CRDT for our texts. If you’re interested in bringing CRDTs to a product with a lot of users, consider joining Notion’s Docs team - https://boards.greenhouse.io/notion/jobs/5602426003 / @jitl on Twitter / jake@makenotion.com
So Last-Write-Wins (LWW) basically _is_ a CRDT, but not in the sense that anyone really expects, because they aren't that useful or intention preserving. Especially if the two writes happen in very quick succession / concurrently.
LWW becomes useful if you can:
a) help humans to see who is doing what on a doc
b) reduce the size of the change that is LWW
As you've said:
> However our LWW texts are individually small - one block/paragraph in size
This is really important, because by reducing the size of the individual element that any single user is updating you can also reduce the chance of conflicts. With a small amount of contextual feedback (like the notion icon next to the block) a lot of the problem cases are just avoided.
Clearly locking and updating the entire document would be terrible, but if you can do it on a small enough scope that others can change other elements, it can work really well.
(If you've worked on the notion things you're describing then I'm sure you know this better than I do, but just spelling it out really clearly.)
You can have a LWW CRDT, but not every LWW is a CRDT. LWW CRDTs generally pick a winner based on causal order which is convergent, the C in CRDT, because every peer receiving the same ops in any order would pick the same winner.
Picking a winner based on wall clock time order (as suggested in the article, and implemented by Notion) is not convergent; if peers used that algorithm to apply ops I they would not converge to the same state. Instead you need an authority (your server) to essentially pick an arbitrary order of ops and clients just have to deal with it.
A practical example is to consider three users (A, B, C) working on a document.
1. A, B, C start online working together.
2. C goes offline
3. A, B stay online, and make 100s of changes.
4. C makes a few changes while offline (not sent to other peers yet)
5. C comes back online and sends their changes to (the server / other peers).
In wall-clock LWW, C's changes will overwrite A & B's changes, even though A & B have done a lot more work.
In a causal ordering CRDT implementing LWW, C's changes will "lose" to A & B's changes, since they were actually made "earlier" in causal ordering.
> Clearly locking and updating the entire document would be terrible, but if you can do it on a small enough scope that others can change other elements, it can work really well.
I'm sure good UX is possible with locks, but I haven't used one yet for document editing. I'm still traumatized from Quip which did per-paragraph (per block?) locking and was really annoying. Eventually they added an unlock button next to the block so anyone could "steal" the lock at will due to all the user complaints, but at that point, why put it in at all?
I feel the same in the paragraph. But in some situation it seems rational
Figma still limits you from editing the component if someone is on it. And Figjam did that too. In my mind, this is a good practice in the realm of collaborative design. Because it will be very messy if a single component obeys the last-write-win rule
Are you considering having a CRDT for each text block individually, or moving to a CRDT for the entire data model for a document? Really curious about the design approach here, especially insofar as there's now an external API that the data models need to service!
It’s Complicated (tm), a hybrid of the two. Text content across multiple CRDT enabled blocks may be connected in a single logical CRDT, but the segment visible in any given block is stored in that block object, which maintains our permission system invariants. We’ll still support moving blocks to different pages, so one page may have a few different CRDTs visually interleaved.
All the edge cases are very interesting, like what happens if I use backspace to join a block of CRDT B into block of CRDT A.
API consumers are unaffected. Likely we’ll support API updates of text as best-effort diff-match-patch of the text on our side applied as CRDT ops.
> everyone’s gonna say “but hey, google docs uses operational transform not CRDTs”.. OK yes, but you are not google.
Well, google docs works not because they somehow figured out how to converge OT edits with as much precision as CRDTs do, but simply because they have a central server which orders edits anyway and don't need true leader-less convergence.
In fact, I agree not many things don't need a CRDT. CRDTs help with mathematical rigidity of convergence when you want true peer-2-peer collaboration which works without any central authority.
However, most apps anyway work on top of a central authority (example SaaS apps) so there is no real reason to accomodate all the compexity that comes with CRDT. They might get far with a simpler OT based model + central server based ordering.
For example even Figma doesn't call its model a 100% pure CRDT. It's a partial, simpler CRDT implemented with an assumption that there's going to be a server that understands ordering.
It's the same with Google Docs. They don't need a CRDT because it's a cloud app after all, which means OT is more convenient with major heavylifting (ordering and conflict handlings) outsourced to the server.
Yeah. Text based OT is pretty simple to implement, too. It’s about 200 lines of code, plus a simple network protocol. It’s fast and relatively straightforward to code up, and unlike text CRDTs it’s pretty fast by default.
I use it as my standard test when trying out a new programming language. It was unexpectedly ugly in go because of go’s lack of enums & unions, and that’s one of the big reasons I never got in to programming Go.
Operational transforms are one of those interesting technologies I’ve wanted to learn by implementing but haven’t made the time yet. I also didn’t realise it could be implemented in that little code.
Can you recommend any learning resources for implementing an OT? (Ideally typescript, python, or rust)
> Yeah. Text based OT is pretty simple to implement, too. It’s about 200 lines of code, plus a simple network protocol.
Just for clarification, while OT for plain-text (linear model) might be simple to implement, OT for typical rich-text editors (like Google Docs) that need to rely on a tree-structured data model is a whole different story: https://ckeditor.com/blog/lessons-learned-from-creating-a-ri....
One of the main points of this article is to "just use locks", which glosses over a lot of technical complications about locking elements within a document. How long is the lock held? Can it be stolen from a user who has gone offline, or is still online but off to lunch, and we _really_ need to make this change before the presentation in an hour? What if the user comes back online and they have changes, but the lock was stolen - how are those changes reconciled to the document?
I am generally in favor of simpler is better, and if there is a way to build a collaborative experience without using CRDTs, then go for it. However, sometimes the cure can be worse than the disease, and solutions like locking may introduce more technical complexity than originally thought.
You're absolutely right, locking is a hard problem. Especially when you get into the edge cases of clients not disconnecting cleanly.
If you've ever used one of those awful early-2000s Microsoft word collaboration systems where you have to check-out the doc, and remember to check it back in, and no one can use it until you've checked it back in... it's awful!
I'm not directly in this team, but one of the teams at my company have been working on this problem. They call it "Spaces", and one of the features solves this component locking problem.
By exploring that locking and unlocking mechanism, you will find that the logical conclusion in the end, when enough complexity and edge cases get covered/fixed as bugs, that it turns into a crude form of "CRDT" (where it's not actually consistent, but merges within reason for 99% of use cases).
For however long the user has a browser focus state on the element seems like a reasonable answer, and submit changes as they are made. However, I don't know how you resolve conflicts of two users simultaneously attempting to acquire a lock.
The server must keep track of the locks, and it can only know about the lock being released if the client tells it. E.g by sending a message that the field is not focused any more. The tricky thing is in the "edge" cases, or really the non-perfect cases, which there are plenty of (as I think GP described).
The server can decide that the client is offline if the server misses expected heartbeat messages from the client. But how often will those be sent and how long grace period will we allow? If it's too short then it will be unreliable on shaky 4G connections, if it's too long then it will be annoying in the other direction.
And that's not considering the "social" problems with locks. I've worked on replacing a system that was lock-based with CRDTs where the lunch scenario from MontagFTB actually was a common occurrence.
In an "ideal" scenario your lock acquisition problem is not hard. Client's just show the UI optimistically and whoever the server decide was first gets the lock. The loosing client throws any state the user created in the short time-frame. Over reliable and fast connections for granular locks, this works fine. But in the real world that's just one of the issues with a lock based approach…
Having used an app with locks like this (Quip) I can tell you it really sucks to have a lock lease >10s after the last edit. The “lock” should be a UI level concept anyways - clients should broadcast “hey I’m taking field X”; if you have two users submit a write+lock request concurrently you’ll need to pick a winner anyways.
> However, I don't know how you resolve conflicts of two users simultaneously attempting to acquire a lock.
It turns out you just have to pick one... This all depends on a source of truth, and when you are there it's easy to pick one, say based on whichever arrived at the network interface first.
I don't think you need a pure CRDT either but I think locking and presence is a bit of an oversimplification.
LWW is a good place to start, and updating the smallest piece of information possible is the right idea in general but there is a lot more nuance to handling complex applications like a spreadsheet (I'm working on one) and whiteboard apps.
Things like reparenting or grouping shapes [1], or updating elements that aren't at the lowest scale like deleting a row or column in a spreadsheet make locking challenging to implement. Do you lock the entire row while I'm moving it? Do you lock the entire group of shapes?
With the exception of text editing, the popular libraries like Yjs don't just give you a perfect CRDT out of the box. You still have to construct your data model in a way that enables small scale updates [2], and CRDT libraries and literature are the best source of thinking for these problems that I've found.
That's true for collaborative experience. Crdts are a mechanism to handle eventual consistency (that's even the preface of the paper). If you assume that said collaborative experience is always online, you don't need them, and "using locks" as you described is probably enough.
If you want a mechanism to handle that eventual consistency, it's probably better to reuse their principles rather than reinventing something that will eventually ressemble Crdts.
You may not need CRDT per-se, but building a collaborative experience is so difficult. I worked on collaborative systems for a bit, and also have read a bit about how Figma and Notion do it (this is a good read: https://www.figma.com/blog/how-figmas-multiplayer-technology...) -- it's still super hard to get right.
> Ever-growing state: for CRDTs to work well they need to keep a record of both what exists, and what has been deleted (so that the deletes aren’t accidentally added back in later). This means that CRDT state will continually expand.
I guess a couple things:
It depends on the CRDT. Some CRDTs grow with the number of replicas and others with the number of events.
State-based CRDTs don't need to keep history and don't need causal ordering of messages, but internal bookkeeping grows with the number of replicas. And for large states (like sets and maps), it can be prohibitive to send the state all over the wire for an idempotent merge.
That's why in practice, people implement Op-based CRDTs, which makes the trade: in order to send small ops over the wire, we now need causal ordering of messages. To make sure we can sync with replicas long offline, we keep as much history so that they can catch up.
There are other variations, such as delta-state based CRDTs that send diffs, and merkle CRDTs, which use merkle data structures to calculate diffs and detect concurrency, which have different growth characteristics.
---
As for a growing state: Is this actually a concern for devs that aren't using CRDTs for collaborative text? I can see that being an issue with the amount of changes that can happen.
But outside of that, lots of data don't grow that fast. We all regularly use Git and it keeps a history of everything. Our disks are huge, and having an immutable record is great for lots of things (providing you can access it).
> Opaque state: ...you’re generally left with an opaque blob of binary encoded data.
Most CRDT libraries take a document-orientated angle. It assumes that you can contain the entire "unit of work", like a document, inside of a CRDT. However, if your data is more relational, it doesn't quite fit. And while there's immutable data in a CRDT, I do wish it was more accessible and queryable. In addition, being a binary blob, it's not exactly composable. I think CRDT libraries should be composable with each other.
I've seen locks used at the CMSes of large news organizations. It's fine, but they all need a mechanism to kick out an editor who has an idle tab left open. For my own small scale CMS, I just wrapped Google Docs and let them handle all the syncing headaches.
Notion is “collaborative” and we don’t use a CRDT for text, it’s all last-write-wins decided by the server. However our LWW texts are individually small - one block/paragraph in size - and adding/moving/removing blocks is intention-preserving if not perfectly convergent.
As the article says, the downside for LWW is that “offline” / async collaboration isn’t so great. That’s why we’re working on switching to CRDT for our texts. If you’re interested in bringing CRDTs to a product with a lot of users, consider joining Notion’s Docs team - https://boards.greenhouse.io/notion/jobs/5602426003 / @jitl on Twitter / jake@makenotion.com
LWW becomes useful if you can:
a) help humans to see who is doing what on a doc
b) reduce the size of the change that is LWW
As you've said:
> However our LWW texts are individually small - one block/paragraph in size
This is really important, because by reducing the size of the individual element that any single user is updating you can also reduce the chance of conflicts. With a small amount of contextual feedback (like the notion icon next to the block) a lot of the problem cases are just avoided.
Clearly locking and updating the entire document would be terrible, but if you can do it on a small enough scope that others can change other elements, it can work really well.
(If you've worked on the notion things you're describing then I'm sure you know this better than I do, but just spelling it out really clearly.)
Picking a winner based on wall clock time order (as suggested in the article, and implemented by Notion) is not convergent; if peers used that algorithm to apply ops I they would not converge to the same state. Instead you need an authority (your server) to essentially pick an arbitrary order of ops and clients just have to deal with it.
A practical example is to consider three users (A, B, C) working on a document.
1. A, B, C start online working together.
2. C goes offline
3. A, B stay online, and make 100s of changes.
4. C makes a few changes while offline (not sent to other peers yet)
5. C comes back online and sends their changes to (the server / other peers).
In wall-clock LWW, C's changes will overwrite A & B's changes, even though A & B have done a lot more work.
In a causal ordering CRDT implementing LWW, C's changes will "lose" to A & B's changes, since they were actually made "earlier" in causal ordering.
> Clearly locking and updating the entire document would be terrible, but if you can do it on a small enough scope that others can change other elements, it can work really well.
I'm sure good UX is possible with locks, but I haven't used one yet for document editing. I'm still traumatized from Quip which did per-paragraph (per block?) locking and was really annoying. Eventually they added an unlock button next to the block so anyone could "steal" the lock at will due to all the user complaints, but at that point, why put it in at all?
If people clobber each others updates by typing at the same time in the same input, at least they can understand what happened.
That’s much better than not being able to do something because someone left their cursor on something and walked away…
Figma still limits you from editing the component if someone is on it. And Figjam did that too. In my mind, this is a good practice in the realm of collaborative design. Because it will be very messy if a single component obeys the last-write-win rule
Deleted Comment
Deleted Comment
All the edge cases are very interesting, like what happens if I use backspace to join a block of CRDT B into block of CRDT A.
API consumers are unaffected. Likely we’ll support API updates of text as best-effort diff-match-patch of the text on our side applied as CRDT ops.
Well, google docs works not because they somehow figured out how to converge OT edits with as much precision as CRDTs do, but simply because they have a central server which orders edits anyway and don't need true leader-less convergence.
In fact, I agree not many things don't need a CRDT. CRDTs help with mathematical rigidity of convergence when you want true peer-2-peer collaboration which works without any central authority.
However, most apps anyway work on top of a central authority (example SaaS apps) so there is no real reason to accomodate all the compexity that comes with CRDT. They might get far with a simpler OT based model + central server based ordering.
For example even Figma doesn't call its model a 100% pure CRDT. It's a partial, simpler CRDT implemented with an assumption that there's going to be a server that understands ordering.
It's the same with Google Docs. They don't need a CRDT because it's a cloud app after all, which means OT is more convenient with major heavylifting (ordering and conflict handlings) outsourced to the server.
I use it as my standard test when trying out a new programming language. It was unexpectedly ugly in go because of go’s lack of enums & unions, and that’s one of the big reasons I never got in to programming Go.
Can you recommend any learning resources for implementing an OT? (Ideally typescript, python, or rust)
Just for clarification, while OT for plain-text (linear model) might be simple to implement, OT for typical rich-text editors (like Google Docs) that need to rely on a tree-structured data model is a whole different story: https://ckeditor.com/blog/lessons-learned-from-creating-a-ri....
I am generally in favor of simpler is better, and if there is a way to build a collaborative experience without using CRDTs, then go for it. However, sometimes the cure can be worse than the disease, and solutions like locking may introduce more technical complexity than originally thought.
If you've ever used one of those awful early-2000s Microsoft word collaboration systems where you have to check-out the doc, and remember to check it back in, and no one can use it until you've checked it back in... it's awful!
I'm not directly in this team, but one of the teams at my company have been working on this problem. They call it "Spaces", and one of the features solves this component locking problem.
https://ably.com/examples/component-locking
It might as well have been CRDT from the get go.
Deleted Comment
For however long the user has a browser focus state on the element seems like a reasonable answer, and submit changes as they are made. However, I don't know how you resolve conflicts of two users simultaneously attempting to acquire a lock.
The server can decide that the client is offline if the server misses expected heartbeat messages from the client. But how often will those be sent and how long grace period will we allow? If it's too short then it will be unreliable on shaky 4G connections, if it's too long then it will be annoying in the other direction.
And that's not considering the "social" problems with locks. I've worked on replacing a system that was lock-based with CRDTs where the lunch scenario from MontagFTB actually was a common occurrence.
In an "ideal" scenario your lock acquisition problem is not hard. Client's just show the UI optimistically and whoever the server decide was first gets the lock. The loosing client throws any state the user created in the short time-frame. Over reliable and fast connections for granular locks, this works fine. But in the real world that's just one of the issues with a lock based approach…
It turns out you just have to pick one... This all depends on a source of truth, and when you are there it's easy to pick one, say based on whichever arrived at the network interface first.
LWW is a good place to start, and updating the smallest piece of information possible is the right idea in general but there is a lot more nuance to handling complex applications like a spreadsheet (I'm working on one) and whiteboard apps.
Things like reparenting or grouping shapes [1], or updating elements that aren't at the lowest scale like deleting a row or column in a spreadsheet make locking challenging to implement. Do you lock the entire row while I'm moving it? Do you lock the entire group of shapes?
With the exception of text editing, the popular libraries like Yjs don't just give you a perfect CRDT out of the box. You still have to construct your data model in a way that enables small scale updates [2], and CRDT libraries and literature are the best source of thinking for these problems that I've found.
[1] https://www.figma.com/blog/how-figmas-multiplayer-technology...
[2] https://mattweidner.com/2022/02/10/collaborative-data-design...
If you want a mechanism to handle that eventual consistency, it's probably better to reuse their principles rather than reinventing something that will eventually ressemble Crdts.
You mentioned "offline first", I think it's probably a good place to pluck that ib https://www.inkandswitch.com/local-first/
This talk by Karri about Linear's "sync engine" is also a good watch: https://www.youtube.com/watch?v=Wo2m3jaJixU.
I guess a couple things:
It depends on the CRDT. Some CRDTs grow with the number of replicas and others with the number of events.
State-based CRDTs don't need to keep history and don't need causal ordering of messages, but internal bookkeeping grows with the number of replicas. And for large states (like sets and maps), it can be prohibitive to send the state all over the wire for an idempotent merge.
That's why in practice, people implement Op-based CRDTs, which makes the trade: in order to send small ops over the wire, we now need causal ordering of messages. To make sure we can sync with replicas long offline, we keep as much history so that they can catch up.
There are other variations, such as delta-state based CRDTs that send diffs, and merkle CRDTs, which use merkle data structures to calculate diffs and detect concurrency, which have different growth characteristics.
---
As for a growing state: Is this actually a concern for devs that aren't using CRDTs for collaborative text? I can see that being an issue with the amount of changes that can happen.
But outside of that, lots of data don't grow that fast. We all regularly use Git and it keeps a history of everything. Our disks are huge, and having an immutable record is great for lots of things (providing you can access it).
> Opaque state: ...you’re generally left with an opaque blob of binary encoded data.
Most CRDT libraries take a document-orientated angle. It assumes that you can contain the entire "unit of work", like a document, inside of a CRDT. However, if your data is more relational, it doesn't quite fit. And while there's immutable data in a CRDT, I do wish it was more accessible and queryable. In addition, being a binary blob, it's not exactly composable. I think CRDT libraries should be composable with each other.