We "solved" C10K years ago yet we keep reinventing it (2003)

hoppp · 2 months ago

We solved it 2 decades ago but then decided to use javascript on the server ...

mifreewil · 2 months ago

Node.js uses libuv, which implements strategy 2. mentioned on the linked webpage.

"libuv is a multi-platform C library that provides support for asynchronous I/O based on event loops. It supports epoll(4), kqueue(2)"

nine_k · 2 months ago

It's mostly RAM allocated per client. E.g. Postgres is very much limited by this fact in supporting massive numbers of clients. Hence pgbouncer and other kinds of connection pooling which allow a Postgres server to serve many more clients than it has RAM to allow connecting.

If your Node app spends vety little RAM per client, it can indeed service a great many of them.

A PHP script that does little more than checking credentials and invoking sendfile() could be adequate for the case of serving small files described in the article.

marcosdumay · 2 months ago

Except that it wastes 2 or 3 orders of magnitude in performance and polls all the connections from a single OS thread, locking everything if it has to do extra work on any of them.

Picking the correct theoretical architecture can't save you if you bog down on every practical decision.

hinkley · 2 months ago

Libuv now supports io_uring but I’m fuzzy on how broadly nodejs is applying that fact. It seems to be a function by function migration with lots of rollbacks.

wmf · 2 months ago

Node.js is actually pretty good at C10K but it failed at multicore and C10M.

eternityforest · 2 months ago

I'm surprised there's not a lot more work on "backend free" systems.

A lot of apps seem like they could literally all use the same exact backend, if there was a service that just gave you the 20 or so features these kinds of things need.

Pocketbase is pretty close, but not all the way there yet, you still have to handle billing and security yourself, you're not just creating a client that has access to whatever resources the end user paid for and assigned to the app via the host's UI.

xnx · 3 months ago

Yes. Most Saas are a tiny bit of business logic with a substantial helping of design and code framework bloat.

drob518 · 2 months ago

FFS, yep. Sigh.

trueismywork · 3 months ago

With nginx and 256 core Epycs, most single servers can easily do 200k requests per sec. Very few companies have more needs

tempest_ · 2 months ago

This is how I feel about this industries fetishization of "scalability".

A lot of software time is spent making something scalable when in 2025 I can probably run any site the bottom 99% of most visited sites on the internet on a couple machines and < 40k capital.

Animats · 2 months ago

The sites that think they need huge numbers of small network interactions are probably collecting too much detailed data about user interaction. Like capturing cursor movement. That might be worth doing for 1% of users to find hot spots, but capturing it for all of them is wasteful.

A lot of analytic data is like that. If you captured it for 1% of users you'd find out what you needed to know at 1% of the cost.

tbrownaw · 2 months ago

> any site the bottom 99% of most visited sites on the internet

What % is the AWS console, and what counts as "running" it?

oblio · 2 months ago

Raw technical excellence doesn't rake in billions, despite what IT people keep saying.

Otherwise Viaweb would be the shining star of 2025. Instead it's a forgotten footnote on a path to programming with money (VC).

otterdude · 2 months ago

When people talk about a single server they're not talking about one hunk of metal, they're talking about 1 server process.

This article describes the 10k client connection problem, you should be handling 256K clients :)

marcosdumay · 2 months ago

When people talk about a single server they are pretty much talking about either a single physical box with a CPU inside or a VPS using a few processor threads.

When they say "most companies can run in a single server, but do backups" they usually mean the physical kind.

api · 2 months ago

I’m shocked that a 256 core Epyc can’t do millions of requests per second at a minimum. Is it limited by the net connection or is there still this much inefficiency?

zipy124 · 2 months ago

It almost certainly can, even old intel systems with dual CPU 16 core systems could do 4 and a half million a second [1]. At a certain point network/kernel bottlenecks become apparent though, rather than being compute limited.

[1]: https://www.amd.com/content/dam/amd/en/documents/products/et...

tempest_ · 2 months ago

Like anything it really depends on what they are doing, if you wanted to just open and close a connection you might run into bottle necks in other parts of the stack before the CPU tops out but the real point is that yea, a single machine is going to be enough.

otterdude · 2 months ago

256 Processes x 10k clients (per the article) = 256K RPS

dilyevsky · 2 months ago

At the time this was written powerful backend server only had like 4 cores. Linux only started adopting SMP like that same year. Also CPU caches were tiny

Serving less than 1k qps per core is pretty underwhelming today, at such a high core count you'd likely hit OS limitations way before you're bound by hardware

Grosvenor · 2 months ago

Linux had been doing SMP for about 5 years by that point.

But you're right OS resource limitations (file handles, PIDs, etc) would be the real pain for you. One problem after another.

Now, the real question is do you want to spend your engineering time on that? A small cluster running erlang is probably better than a tiny number of finely tuned race-car boxen.

intothemild · 3 months ago

I can't tell if this is sarcasm or not.

They didn't have this kind of compute back when the article was written. Which is the point in the article.

marcosdumay · 2 months ago

The article was written exactly because they had machines capable enough at the time. But the software worked against it on every level.

Maxatar · 2 months ago

I don't see how you could have read the article and come to this conclusion. The first few sentences of the article even go into detail about how a cheap $1200 consumer grade computer should be able to handle 10,000 concurrent connections with ease. It's literally the entire focus of the second paragraph.

2003 might seem like ancient history, but computers back then absolutely could handle 10,000 concurrent connections.

hinkley · 3 months ago

In spring 2005 Azul introduced a 24 core machine tuned for Java. A couple years later they were at 48 and then jumped to an obscene 768 cores which seemed like such an imaginary number at the time that small companies didn’t really poke them to see what the prices were like. Like it was a typo.

trueismywork · 3 months ago

Half serious. I guess what Iwas saying is that it is that kind of science which is still very useful but more to nginx developers themselves. And most users now dont have to worry about this anymore.

Should have prefixed my comment wirh "nowadays"

zahlman · 2 months ago

> And computers are big, too. You can buy a 1000MHz machine with 2 gigabytes of RAM and an 1000Mbit/sec Ethernet card for $1200 or so. Let's see - at 20000 clients, that's 50KHz, 100Kbytes, and 50Kbits/sec per client. It shouldn't take any more horsepower than that to take four kilobytes from the disk and send them to the network once a second for each of twenty thousand clients. (That works out to $0.08 per client, by the way. Those $100/client licensing fees some operating systems charge are starting to look a little heavy!) So hardware is no longer the bottleneck.

It seems to me that there are far fewer problems nowadays with trying to figure out how to serve a tiny bit of data to many people with those kinds of resources, and more problems with understanding how to make a tiny bit of data relevant.

It still absolutely can be. We've just lost touch.

nine_k · 2 months ago

This particular case, with the numbers given, would work as a server for profile pictures, for instance. Or for package signatures. Or for status pages of a bunch of services (generated statically, since the status rarely changes).

Yes, an RPi4 might be adequate to serve 20k of client requests in parallel, without crashing or breaking too much sweat. You usually want to plan for 5%-10% of this load as a norm if you care about tail latency. But a 20K spike should not kill it.

gnabgib · 3 months ago

(2011 / 2003)

Title: The C10K problem

Popular in:

2014 (112 points, 55 comments) https://news.ycombinator.com/item?id=7250432

2007 (13 points, 3 comments) https://news.ycombinator.com/item?id=45603

alwa · 2 months ago

Apparently this refers to making a web server able to serve 10,000 clients simultaneously.

IgorPartola · 2 months ago

It has been long enough that C10K is not in common software engineer vernacular anymore. There was a time when people did not trust async anything. This was also a time when PHP was much more dominant on the web, async database drivers were rare and unreliable, and you had to roll your own thread pools.

senko · 2 months ago

The date (2003) is incorrect. The article itself refers to events from 2009, is listed at the bottom of the page as having been last updated in 2014, with a copyright notice spanning 2018, and a minor correction in 2019.

throwaway29303 · 2 months ago

  The date (2003) is incorrect.

You're right, it's even older than that; it should be (1999).

https://web.archive.org/web/*/https://www.kegel.com/c10k.htm...

petters · 2 months ago

Seems to be some kind of living document. It was refered to as an "oldie" in 2007: https://news.ycombinator.com/item?id=45603

_qua · 2 months ago

I personally think it's more of a https://c25k.com/ time of year.

readthenotes1 · 2 months ago

The internationally famous Unix Network Programming book. An icon, a shibboleth, a cynosure

https://youtu.be/hjjydz40rNI?si=F7aLOSkLqMzgh2-U

(From Wayne's World--how we knew the comedians had smart advisors)