foobazgt · a year ago
This blog post seems to blame GC heavily, but if you look back at their earlier blog post [0], it seems to be more shortcomings in either how they're using Cassandra or how Cassandra handles heavy deletes, or some combination:

"It was at that moment that it became obvious they deleted millions of messages using our API, leaving only 1 message in the channel. If you have been paying attention you might remember how Cassandra handles deletes using tombstones (mentioned in Eventual Consistency). When a user loaded this channel, even though there was only 1 message, Cassandra had to effectively scan millions of message tombstones (generating garbage faster than the JVM could collect it)."

And although the blog post talks about GC tuning, there's mention here [1] that they didn't do much tuning and were actually running on an old version of Cassandra (and presumably JVM) - having just switched over from CMS (!).

  0) https://discord.com/blog/how-discord-stores-billions-of-messages
  1) https://news.ycombinator.com/item?id=33136453
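
To make the tombstone scan in the quote above concrete, here is a toy sketch. It is not Cassandra's storage engine: `ToyPartition`, `Cell`, and `read_latest` are hypothetical names, and a real read merges memtables and SSTables. It only assumes that a delete is stored as a marker and that a read has to step over those markers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cell:
    message_id: int
    body: Optional[str]  # None marks a tombstone (a recorded delete)


class ToyPartition:
    """Hypothetical single-partition store, for illustration only."""

    def __init__(self):
        self.cells = {}  # message_id -> Cell

    def write(self, message_id, body):
        self.cells[message_id] = Cell(message_id, body)

    def delete(self, message_id):
        # A delete is itself a write: it stores a tombstone instead of removing data.
        self.cells[message_id] = Cell(message_id, None)

    def read_latest(self, limit):
        """Return up to `limit` live messages, newest first, and the cells scanned."""
        live, scanned = [], 0
        for message_id in sorted(self.cells, reverse=True):
            scanned += 1
            cell = self.cells[message_id]
            if cell.body is not None:
                live.append(cell.body)
                if len(live) == limit:
                    break
        return live, scanned


partition = ToyPartition()
for i in range(100_000):
    partition.write(i, f"message {i}")
for i in range(1, 100_000):
    partition.delete(i)  # only message 0 survives

live, scanned = partition.read_latest(limit=50)
print(len(live), scanned)  # prints "1 100000"
```

One live message, but every tombstone is visited on the way to it. Until compaction drops the tombstones, each load of the channel pays that cost again.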

Aeolun · a year ago
But then it’s still nice that they’re using ScyllaDB and now it’s not a concern at all right?

Even if they were using their original solution wrong, I think a solution that cannot be used wrong is superior.

ericvolp12 · a year ago
The funny part is ScyllaDB still uses tombstones for deletions, though they do have configurable compaction strategies and iirc Discord uses Scylla's Incremental Compaction Strategy that I suppose solves the specific issue they were dealing with. iirc that compaction strategy will trigger a compaction once a certain threshold of a partition is tombstones and then the table is rebuilt without the tombstoned content (which effectively pauses writes on that specific node and that specific table and partition for the duration of that process). Compacting a massive partition is really expensive. Scylla defaults to warning you that a partition is too large if it has at least 100,000 rows in it. My guess is when they moved to ScyllaDB they also adopted a new strategy for partitioning messages in a channel that keeps partition sizes reasonable so compactions don't take a super long time.
roenxi · a year ago
I don't see anything here that looks untoward. They increased their data storage by 3 orders of magnitude and decided to use a different DB system. Fair enough, maybe they've learned more about the nature of their data.

But that logic isn't sound. When dealing with huge amounts of data there are going to be trade-offs. Picking a system that makes different trade-offs to an existing system is not automatically helpful. Yes you don't have the old problems. However, you are about to discover new problems. There is always something of a gamble around which will be more of a problem to your business.

frr149 · a year ago
What's the problem with Scylla? Honest question, BTW
vips7L · a year ago
> having just switched over from CMS (!)

This is really interesting. CMS was removed in Java 14 after being replaced by G1GC in Java 9. They were probably running an antiquated Java 8 or 11 runtime. So that means that in 2022 they were either running a 4 year old Java 11 runtime or an 8 year old Java 8 runtime. They were really leaving a lot of performance on the table.

gorset · a year ago
They could also have gone the commercial route and gotten Zing with their pauseless GC. It’s been around forever and they even cover Cassandra in their marketing.

https://www.azul.com/technologies/cassandra/

leetrout · a year ago
Needs (2023)

That services layer reminds me of a big, fancy, distributed Varnish Cache... they don't mention caching and they chose the word coalesce so I assume it doesn't do much actual caching. But it made me think of Varnish's "grace mode" and its use to prevent the thundering herd problem (which is where I first heard of 'request coalescing') https://varnish-cache.org/docs/6.1/users-guide/vcl-grace.htm...
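
For anyone who hasn't seen request coalescing, a minimal sketch of the idea, assuming an async backend call. `Coalescer` and `slow_fetch` are made-up names, and this is not how Discord's data services are written: concurrent requests for the same key share a single in-flight fetch.

```python
import asyncio


class Coalescer:
    """Hypothetical helper: concurrent gets for one key share one backend fetch."""

    def __init__(self, fetch):
        self._fetch = fetch    # async function key -> value (assumed backend call)
        self._in_flight = {}   # key -> Task for a fetch currently in progress

    async def get(self, key):
        task = self._in_flight.get(key)
        if task is None:
            # First caller for this key starts the fetch; later callers reuse it.
            task = asyncio.ensure_future(self._run(key))
            self._in_flight[key] = task
        return await task

    async def _run(self, key):
        try:
            return await self._fetch(key)
        finally:
            # Nothing is cached: once the fetch finishes, the next request refetches.
            del self._in_flight[key]


async def demo():
    calls = 0

    async def slow_fetch(key):
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.01)  # stand-in for a database round trip
        return f"value for {key}"

    coalescer = Coalescer(slow_fetch)
    results = await asyncio.gather(*(coalescer.get("channel-1") for _ in range(100)))
    return calls, results


calls, results = asyncio.run(demo())
print(calls, len(results))  # prints "1 100"
```

A hundred concurrent readers of a hot channel turn into one database query. That is coalescing rather than caching, because nothing is kept once the fetch completes.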

Also love to see consistent hashing come up again and again. It's a great piece of duct tape that has proven useful in many similar situations. If you know where something should be then you know where everything is gonna come look for it!
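
A small sketch of a consistent hash ring, for illustration. `HashRing`, `stable_hash`, the node names, and the choice of 64 virtual nodes are all assumptions, not anything from the article.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Map a string to a 64-bit integer that is stable across runs and machines."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical ring; each node owns `vnodes` points to even out the load."""

    def __init__(self, nodes, vnodes=64):
        self._ring = sorted(
            (stable_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        # The owner is the first ring point clockwise from the key's hash.
        index = bisect.bisect_right(self._points, stable_hash(key)) % len(self._ring)
        return self._ring[index][1]


keys = [f"channel-{i}" for i in range(1000)]
before = HashRing(["node-a", "node-b", "node-c"])
after = HashRing(["node-a", "node-b", "node-c", "node-d"])

moved = [k for k in keys if before.node_for(k) != after.node_for(k)]
print(f"{len(moved)} of {len(keys)} keys changed owner")
```

Every caller computes the same owner for a key without any coordination. When a node is added, the only keys that move are the ones the new node takes over, so the rest keep hitting the node that is already handling them.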

loloquwowndueo · a year ago
Coalescing and “origin shielding” tend to be more common terms for that - I’ve never heard of “grace” until today :)
atombender · a year ago
Varnish does call it coalescing. Grace is used for a specific situation: When a previously cached object has expired, Varnish won't evict it from the cache immediately, but will continue to serve the old content, while sending exactly 1 request to the background to refetch. How long an object can live after expiring is called the grace. The HTTP standard calls this behaviour "stale-while-revalidate".
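
A rough sketch of that behaviour, assuming an async fetch function and an injectable clock. `GraceCache` and its parameters are hypothetical; this is not Varnish's implementation or its VCL API.

```python
import asyncio
import time


class GraceCache:
    """Hypothetical cache: serve stale entries within `grace` while one refetch runs."""

    def __init__(self, fetch, ttl, grace, clock=time.monotonic):
        self._fetch = fetch
        self._ttl, self._grace, self._clock = ttl, grace, clock
        self._entries = {}     # key -> (value, stored_at)
        self._refreshing = {}  # key -> Task for a refetch in progress

    async def get(self, key):
        entry = self._entries.get(key)
        if entry is not None:
            value, stored_at = entry
            age = self._clock() - stored_at
            if age <= self._ttl:
                return value                 # fresh hit
            if age <= self._ttl + self._grace:
                self._refresh(key)           # at most one background refetch
                return value                 # stale, but still within grace
        return await self._refresh(key)      # miss or past grace: wait for it

    def _refresh(self, key):
        task = self._refreshing.get(key)
        if task is None:
            task = asyncio.ensure_future(self._load(key))
            self._refreshing[key] = task
        return task

    async def _load(self, key):
        try:
            value = await self._fetch(key)
            self._entries[key] = (value, self._clock())
            return value
        finally:
            del self._refreshing[key]


async def demo():
    now = [0.0]   # fake clock so the example does not need to sleep for real
    calls = []

    async def fetch(key):
        calls.append(key)
        await asyncio.sleep(0)
        return f"v{len(calls)}"

    cache = GraceCache(fetch, ttl=10, grace=60, clock=lambda: now[0])
    first = await cache.get("page")   # miss: waits for the backend
    now[0] = 30                       # expired, but within grace
    stale = await asyncio.gather(*(cache.get("page") for _ in range(5)))
    await asyncio.sleep(0.01)         # let the background refetch finish
    fresh = await cache.get("page")
    return first, stale, fresh, len(calls)


first, stale, fresh, backend_calls = asyncio.run(demo())
print(first, stale, fresh, backend_calls)
```

The five requests that arrive after expiry all get the old value immediately, and together they cause one backend call, not five.
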
mnutt · a year ago
Grace mode itself doesn’t prevent thundering herd; varnish coalesces all requests automatically and grace mode is used to increase the likelihood of clients receiving cached (albeit stale) responses.
hinkley · a year ago
Nginx always more businesslike.

    proxy_cache_use_stale updating;

dang · a year ago
Year added above. Thanks!
dorlaor · a year ago
Some additional nuggets by a ScyllaDB co-founder:
- Discord couldn't complete repair with Cassandra. That's not the case with Scylla.
- Scylla has a lot in common with Cassandra, for good reason, like the LSM tree, compaction, etc. However, Scylla has unique CPU and I/O schedulers which allow us to prioritize queries over compaction, and defer compaction to the half millisecond where we have enough idle bandwidth. We have plenty of articles about it.
- Scylla has a new (1.5 years old) tombstone_gc=repair mode, which is much safer.
- Scylla's new architecture of Raft and tablets was recently launched and is the next big thing for our users. Watch the cool YouTube video of tablet load balancing.
aaptel · a year ago
This whole problem wouldn't exist if we used distributed chat protocols, which have been around for over 35 years (IRC). With the added benefit of having an open specification and multiple implementations. No walled gardens.

And if you think IRC is too old for the modern world take a look at matrix or xmpp.

How did we let discord take over is a mystery to me, or rather a tragedy.

rollcat · a year ago
IRC does not store messages, it only relays them to clients. You need an add-on solution to store chat history, something we've been taking for granted for ~30 years.

IRC all but requires using a bouncer to follow a conversation from more than a single device.

IRC does not encrypt messages, only (optionally) the client<->server connection. Without E2EE, you have no privacy against the server/operator, which is an easily targeted SPOF.

Matrix (the protocol) is still in flux, and the implementations are lagging behind the spec. If you're not using Element, you're behind on features and security.

XMPP is (similarly to IRC) relying on optional protocol add-ons for basic things, like E2EE, which clients may or may not support fully or correctly.

I recommend reading these breakdowns by soatok: https://soatok.blog/2024/08/04/against-xmppomemo/ https://soatok.blog/2024/08/14/security-issues-in-matrixs-ol...

2013/Snowden happened 11 years ago. E2EE should by now be considered a basic feature, a commodity, something we should be calling for as relentlessly as we did for HTTPS. (Discord of course does not implement E2EE.)

grishka · a year ago
Truth is, E2EE isn't a "basic thing". It's an add-on feature that most people don't want. It is impossible to have E2EE that doesn't leak into the UX, and most people would rather have a streamlined UX than deal with key management. It is also much more complex to have robust E2EE in a group chat.

The thing that sets E2EE apart from HTTPS is that HTTPS requires nothing from the end user. It just works. And as a site owner, you just set it up once and forget about it.

AnonCoward42 · a year ago
> IRC does not encrypt messages, only (optionally) the client<->server connection. Without E2EE, you have no privacy against the server/operator, which is an easily targeted SPOF.

Same as Discord.

> Matrix (the protocol) is still in flux, and the implementations are lagging behind the spec. If you're not using Element, you're behind on features and security.

Discord also only has one reference client, but for me even with that client Matrix/Element was not as reliable. I still use and like it, but it's not a like for like in that regard.

> XMPP is (similarly to IRC) relying on optional protocol add-ons for basic things, like E2EE, which clients may or may not support fully or correctly.

But if you use current clients like Conversations or Dino or the likes it does work. There is no point in counting the clients that don't support it if these aren't the reference or biggest ones. The problem here is more that it's not meant to be used like Discord in any way. Not for big group chats/channels nor for big voice chats (not even sure this is possible).

Zambyte · a year ago
> IRC does not encrypt messages, only (optionally) the client<->server connection. Without E2EE, you have no privacy against the server/operator, which is an easily targeted SPOF.

FWIW this point isn't relevant to the IRC vs Discord discussion, since Discord is also very not E2EE. That said, XMPP is my preferred protocol that checks all of the boxes.

crtasm · a year ago
Nothing stopping a server also acting as a bouncer and storing messages: https://ergo.chat/about
timeon · a year ago
> IRC does not encrypt messages

Wasn't SILC later used for this instead of IRC?

voidnap · a year ago
> IRC does not store messages, it only relays them to clients.

Some people consider this a feature and prefer using IRC bouncers to discord.

OMEMO solved encryption for XMPP a decade ago. I haven't seen it on IRC yet though.

Ecoste · a year ago
> How did we let discord take over is a mystery to me, or rather a tragedy.

The fact that you're baffled why discord took over is exactly why it took over. You can't even acknowledge that the user experience is 10x better and it's suitable for a general non-technical audience.

mystified5016 · a year ago
New quest available! Buy nitro for stickers! Buy nitro as a gift! New quest available! New quest available! Restart to update. New quest available! Look at the new emojis you could use with nitro! New update available! New update available again! Third update today! New quest available! Look at these profile decorations you could use with nitro! Boost this server! *NEW QUEST AVAILABLE*
dewey · a year ago
I’m a huge IRC fan and I dislike Discord, but all these other services are way too clunky and IRC is really only usable through IRCCloud that has a relatively okay mobile app these days.

Recently a very technical group I’m part of migrated from Telegram to Matrix and the user experience is just not very good. The apps are buggy, don’t look good, then in the new “Element” app SSO isn’t supported so I can’t use my account with it. There’s lots of paper cuts that are okay for someone like me who likes to figure it out but I’d never try to convince my friends to use it.

nunobrito · a year ago
For telegram refugees then maybe SimpleX is an option, except it has no bots nor other options for clients at the moment.

What I personally use is the nostr protocol through a client like Amethyst or OxChat. Messages and groups can be E2EE private, or you can just use the public groups.

The biggest advantage is that you are joining a bigger community of apps and services built on top of the same protocol, rather than joining some isolated island (again).

high_na_euv · a year ago
>How did we let discord take over is a mystery to me, or rather a tragedy.

Orders of magnitude better product than anything competition had at the time?

doublerabbit · a year ago
> Orders of magnitude better product than anything competition had at the time?

Nah, it just comes down to non-techy folk wanting to play/chat with their friends in a just-work configuration.

Mumble, TeamSpeak were always janky, needed a hosted server. IRC is multiplayer notepad.

Geeks care about E2E, and all that glory but these folks don't. And that's what Discord dishes; as did Y!M, MSN, ICQ, AIM back in the day.

All discord has done is replaced those above as GitHub has replaced SourceForge.

We didn't care if the message were encrypted or not back then. Why do we now?

Krasnol · a year ago
Usability did it.

You download an exe, install it, make an account and it runs. Just like that. Everybody can do it.

There is tons of useful and great software out there. Most of it is not easy for the public. Some (most?) of it doesn't even have a GUI. People would rather sell their identity and even pay than suffer through too many hoops.

Intralexical · a year ago
Not even an EXE. The web version is feature-complete, so you only need to click a link.
throw16180339 · a year ago
> How did we let discord take over is a mystery to me, or rather a tragedy.

Anyone can set up or join a Discord server. If you give users the choice between a complex open platform and an easy proprietary solution, they will pick the latter every time.

maccard · a year ago
If you want to know why, look at the App Store reviews for Discord and TeamSpeak and compare them.

Discord just works.

tannhaeuser · a year ago
There’s no lack of open chat protocols and federated services, but those have mostly torpedoed themselves through usability and discoverability problems, holier-than-thou attitudes, and plain nerd attention wars. Examples include XMPP (used a lot until around 2010 but easily dragged into the mud because of XML and overengineering) and Mastodon (saw a surge as Twitter was faltering but then seemingly stopped being everyone’s darling as its limitations became obvious, among them Mastodon admins taking their audience hostage; also ActivityPub fans going around advertising it for each and everything when RSS is just fine for web sites, damaging news feeds altogether in the process).

Where spamming, or the systematic exploitation of digital communication by the „ad industry“, was killing it in the past (Usenet, and arguably the web), today there‘s also the problem of being consumed by LLMs to push non-public messaging. Though I‘m not sure the latter is really a concern for many, as developers not only are giving away their code, but their entire activity log/issues and their solutions on github such that they can easily be digested and replaced by coding assistant LLMs, git being a distributed system in the first place.

Terr_ · a year ago
> among them Mastodon admins taking their audience hostage

I was excited first hearing all the "fediverse" stuff, but having to hand over control of your online identity to a particular node forever felt a little bit like "old boss, same as the new boss."

(Yes, I know some folks are working on the identity issue.)

elcomet · a year ago
IRC and distributed protocols in general had a big issue: you lose history every time you disconnect
menaerus · a year ago
In the age we are living this starts to sound more like a feature to me.
Intralexical · a year ago
> How did we let discord take over is a mystery to me, or rather a tragedy.

I think I'm reasonably technically competent, and I also dislike Discord's issues with privacy, data sovereignty, siloing information away from the open web, etc.

But you know what I think whenever I click a Matrix link, or IRC? I just don't want to deal with it. You get a list of apps you've never heard of, some of which may not be feature-complete, some with more than one version, some which are advertised using words like "GNOME", "Rust", "Qt5", and "C++" that have no meaning or relation to actually using them as a chat app, and all of which I guess are different and would need to be tried and learned separately. Then picking and clicking one tries to open an outside program which probably isn't installed and I don't want to install because I don't really know/care what it is. And if at that point, out of the dozen or so app options it showed you, you happened to choose one with a web version like Element, and you figure out you can click the "Continue in your browser" button out of the four or five unexplained buttons that pop up as a result ("XDG-Open", "Cancel", "FlatHub", "Download", and "Continue in Browser")— You get a static screen that shows just enough message history to not be useful, with a confusing UI you can't seem to interact with, hidden behind a login wall that still hasn't really explained what in the Internet tubes you're actually looking at.

E.G.: https://matrix.to/#/#invidious:matrix.org

If you try to Google "What is Matrix"— You get pages about math. So then you Google "What is Matrix chat". And all the results harp on using words like "open network", "decentralised", "protocol", "real-time communication", "open standard", "federated"— Which, again, may be technically interesting if you're into that, but doesn't actually have anything to do with how it directly serves the user as a chat app and how you can use it or sign up for it.

It takes way too many clicks, and you get bombarded with way too much information… To still not end up using the app, and in fact end up more confused than before about what a "Matrix" even is. Let's say you lose 15% of incoming users at each step. That rapidly scares off most of the mainstream, before they've even tried it. Maybe Matrix and Element are great. But it just seems like such an ordeal.
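
To put rough numbers on that, a back-of-the-envelope sketch. The 15% figure is the guess from the paragraph above, and the step counts are arbitrary, not measured data.

```python
def retained_fraction(drop_per_step: float, steps: int) -> float:
    """Fraction of users left after `steps` steps that each lose `drop_per_step`."""
    return (1 - drop_per_step) ** steps


for steps in (1, 3, 5, 8):
    print(steps, f"{retained_fraction(0.15, steps):.1%}")
```

At 15% per step, fewer than half of the people who clicked the link are still there by the fifth step.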

Compare that with Discord. You click a link. And then either you're already in the server, or it has a single text box and a single button you click to funnel you through making an account and joining the server.

It doesn't try to convince you to install a Desktop app until you're already fully using it in the web version. You get clear answers and reasons to use it if you search "What is Discord" or go to the website. It doesn't overwhelm you with options and then hound you with technical explainers that you didn't ask for.

IRC goes the other way in usability. People want voice chat, message history, different channels in the same "server", PM channels, etc.

/rant

weaksauce · a year ago
because the voice chat function is so leaps and bounds better than anything out there and it was primarily used for that to game in real time. the text was an afterthought for gamers.
EGreg · a year ago
philipwhiuk · a year ago
> Own this piece of crypto history

I would argue that the web lost its way as much with "web3" as with the platforms of web 2.

RadiozRadioz · a year ago
There are loads of comments exactly like OP's, and they always make the mistake of mentioning IRC alongside XMPP and Matrix. Inevitably repliers can't help themselves and spend their replies discussing IRC's unsuitability for modern IM and how it's not federated. When IRC is mentioned, commenters ignore XMPP and Matrix and attack the point in terms of IRC. (Though this thread in particular is better than average).

Matrix and XMPP are the far more appropriate competitors for Discord, we need to steer the conversation toward them. I deliberately never mention IRC when I make these types of comments so people don't latch onto it and ignore everything else I said.

lofaszvanitt · a year ago
Discord wrapped irc in shiny paper.
urza · a year ago
100% !! It's so sad :(
jimkoen · a year ago
My takeaway from this is maybe somewhat different from what the authors intended:

> The last one? Our friend, cassandra-messages. [...] To start with, it’s a big cluster. With trillions of messages and nearly 200 nodes, any migration was going to be an involved effort.

To me, that's a surprisingly small number of nodes for message storage, given the size of Discord. I had honestly expected a much more intricate architecture, engineered towards quick scalability, involving a lot more moving parts. I'm sure the complexity is higher than stated in the article, but it makes me wonder, given that I've been partially responsible for more than 200 physical nodes that did less, how much of modern cloud architecture is over-engineered.

romanhn · a year ago
They are talking about 177 database nodes, which is not an indicator of architecture complexity. I assume they have dozens/hundreds of services consisting of multiple highly available nodes each across various geographies.

Having seen a much smaller set of Cassandra nodes used to store billions (rather than trillions) of records, I can say that Cassandra was definitely a total PITA for on-call, and a cause of several major outages.

nicholasjarnold · a year ago
> ...how much of modern cloud architecture is over engineered.

I would wager a good majority of it is. The Stack Overflow architecture[0] sticks out to me in this regard as an example on the other end of the spectrum.

[0] https://news.ycombinator.com/item?id=34950843

hiyer · a year ago
Also bear in mind that they're now doing the same with just 72 nodes.
hiyer · a year ago
Very well-written article. I'm happy for them that part of the solution was switching from Cassandra to drop-in replacement Scylla, rather than having to deal with something entirely different.
dean2432 · a year ago
They make it literally impossible to delete your old messages. It's a privacy nightmare and I wonder why the EU hasn't stepped in.
Intralexical · a year ago
I do think there is a balance to be struck, because directed communication means the recipients of old messages are also stakeholders, such that maintaining a consistent record by default is a fundamental part of the "service" they offer. The message contents are different from e.g. secretly hoovering up click patterns. Matrix had some thoughts when they faced the same questions:

  The key question boils down to whether Matrix should be considered more like email (where people would be horrified if senders could erase their messages from your mail spool), or should it be considered more like Facebook (where people would be horrified if their posts were visible anywhere after they avail themselves of their right to erasure).

  Solving this requires making a judgement call, which we've approached from two directions: firstly, considering what the spirit of the GDPR is actually trying to achieve…
https://matrix.org/blog/2018/05/08/gdpr-compliance-in-matrix...

Xen9 · a year ago
In Discord culture, indeed, users usually share a shit-ton of PII in "introduction" messages from images to specific hobbies to medical information (EG "support" communities).

The problem from a GDPR perspective is that Discord makes it impossible to delete those, since once they detect your interest in trying to delete any of your accounts' data, they will try to "anonymize" it. Then, at least publicly, your username is disconnected from those messages, but they can still be traced back to specific persons. Now if this is also done server side, then they would be in a situation where you'd either have to go through a ton of messages or bulk delete all past messages to enforce the GDPR demands of a user wanting their PII deleted.

EU Parliament is not a real Parliament in the sense that ONLY the Commission can propose new laws, and the elected parliament basically just votes on those. Who controls the Commission if not the people? The US State Department. Newsguard and non-Musk US bigtechs including Discord are in the same poli-financial bed of the establishment here. And they are full of previous state department workers.*

Unless there is public outrage, the EU-level bodies at least will probably be owned. But Public opinion is controlled by the cyberpunk establishment that trains their LLMs & targets their campaign ads using that illegal Discord data to get political advantage.

You, in my view, ought to "worry" about the fact that it's possible there will sooner or later no longer be an escape from a permanent establishment, Orwell-style. This goes along with the theme that "cybersecurity" at the United States government level has been a "war against hate speech" for years, and of course "hate speech" meaning "censorship of internal and external enemy speech."

Budd Dwyer, if I recall correctly, shot himself on TV after writing to Biden (???) that under some conditions (that became true), the Department of Justice should have "Justice" removed from its name.

---

Most of this I hold only at 50+% confidence of being broadly correct. Take with lots of salt.

r3d0c · a year ago
incoherent babbling
intelVISA · a year ago
Given the sheer size and extent of the user data collected and processed one imagines the EU is working on a big case... quietly.
robmccoll · a year ago
Cassandra is essentially an append-mostly distributed fault-tolerant hash table. If you need specifically that with high write throughput, it's a good choice. I don't understand why people use it as a database. You run into its limitations immediately and the pain of trying to use it like a database only gets worse with scale.
LeifCarrotson · a year ago
FTA:

> In Cassandra, reads are more expensive than writes.

This makes it insane as a message store for a chat server to me. It seems appropriate for a logging destination for a distributed system, one where you want lots of clients to dump data but most of the time you don't even need to audit the logs, so the number of reads for a given item is less than one. This is obviously not true for Discord messages.

atombender · a year ago
The sentence makes it sound like Cassandra and Scylla are slow for reads, which isn't the case at all. It's just that writes require a bit less I/O. Reads are still very fast. If reads were slow, nobody would use Cassandra and Scylla for the purposes that they're being used for.
Squeeeez · a year ago
Not too sure - I would have guessed that most of the messages are written once, read by the constant number of participants (say 1-100 or so) and then they disappear off the screen and are never accessed again, ever. Maybe a few people will scroll or search, or use some custom extension to load and export the history, but very rarely.
mianos · a year ago
All the Cassandra documentation and the website say it is a database. You can't blame anyone for getting confused. In my experience, I have never seen a project that started to use it continue to use it after a year or so. It may take a year to run into its limitations before having to replace it with a database like Postgres.