Just buy a few Mac Studios and run them in-house with power-supply backup and networking redundancy, and you're good to go to serve 10k–100k requests/second, which is enough for a million customers. You don't need VMs: a single Mac Studio gets you 2–4x the power of an m7i.2xlarge on AWS and pays for itself within a few months of AWS bills. You can do local AI inference and get Claude Opus-level performance (Kimi K2.5) over a cluster of Mac Studios with Exo.Labs (an unofficial Apple partner).

You get S3-compatible object storage with zero ongoing storage costs using MinIO (yes, it stays redundant even if you lose a server, and your hosting provider can't hold your data hostage by charging for egress). Postgres runs like a beast and is incredibly easy to set up: you get a zero-latency DB because it runs on the same machine, it has access to lots of RAM, and you're not paying per-GB or per-core. Managed databases are a scam. You don't need an auth provider; just do passkeys yourself. And the great thing about Apple Silicon hardware is that it's amazingly quiet, reliable, and efficient. You can do things like run headless browsers 3x faster and cheaper than on standard server hardware because of the unified memory and GPU acceleration, so you're not paying for by-the-minute CI/CD compute or headless browsers either.
This entire stack could give you computing power equivalent to a 25k euro/month AWS bill for the cost of electricity (about the same as running a few fridges 24/7), plus about 50k euros one-time to set it up (roughly 4 Mac Studios). And yes, it's redundant, scalable, and even faster (in per-request latency) than standard AWS/GCP cloud bloat. Not only is it cheaper and fully yours, but your app will also be faster, because all services are local (DB, Redis cache, SSD, etc.) with no VM overhead, shared cores, or noisy neighbours.
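A rough back-of-envelope with those figures (the power draw and electricity price below are my own assumptions, not part of the claim above):

```python
# Back-of-envelope break-even for the self-hosted stack described above.
# The AWS bill and hardware cost are the figures claimed in the comment;
# power draw and electricity price are rough assumptions of my own.
aws_bill_per_month = 25_000   # EUR/month of equivalent cloud spend (claimed)
hardware_one_time  = 50_000   # EUR, ~4 Mac Studios plus networking/UPS (claimed)
power_draw_kw      = 2.0      # assumed continuous draw for the whole setup
price_per_kwh      = 0.30     # assumed EUR/kWh

electricity_per_month = power_draw_kw * 24 * 30 * price_per_kwh  # ~430 EUR/month
monthly_savings = aws_bill_per_month - electricity_per_month
break_even_months = hardware_one_time / monthly_savings

print(f"electricity: ~{electricity_per_month:.0f} EUR/month")
print(f"break-even:  ~{break_even_months:.1f} months")           # ~2 months
```

Under those assumptions the hardware pays for itself in about two months; even if the real cloud-equivalent spend were five times smaller, it would still break even within a year.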
I came in to work Monday morning, showed it off, and inadvertently triggered a firestorm. Later my boss told me not to do that again because it caused havoc with schedules and such.
So I quit and found a better job. Sometimes the new guy can make a better version themselves over the weekend, not because they’re a supergenius, but because they’re not hampered by 47 teams all trying to get their stamp on the project.
(In before “prime example of overconfidence!”: feel free to doubt. It was a CRUD app with a handful of models on a PostgreSQL backend. They were writing a new Python web framework to serve it, complete with their own ORM and forms library and validation library. Not because the existing ones wouldn’t work, mind you, but more out of not realizing that all these problems were already sufficiently solved for their requirements.)
Fortunately it was limited to a small, isolated service, but I can imagine the long-term damage if you keep going down that route on what becomes a huge monolith after a few years.
Care about orders of magnitude instead. Combine that with the speed of the hardware (https://gist.github.com/jboner/2841832) and you'll have a good understanding of how much overhead is due to the language and which constructs to favor for speed improvements.
Just reading the code should give you a sense of its speed and where it will spend most of its time. Combined with general timing metrics, you can also get a sense of the overhead of third-party libraries (pydantic, I'm looking at you).
So yeah, I find that list quite useful during code design; it likely reduces the time spent profiling slow code in prod.
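For the "general timing metrics" part, the standard library is enough to put rough numbers next to that list. A minimal sketch; you can wrap a pydantic model constructor (or any other third-party call) the same way to see what that layer costs per call:

```python
# Minimal sketch: per-call overhead measured with timeit from the stdlib.
import timeit

def noop():
    pass

d = {"key": 1}
N = 1_000_000

t_call = timeit.timeit(noop, number=N) / N
t_dict = timeit.timeit(lambda: d["key"], number=N) / N

print(f"plain function call: {t_call * 1e9:.0f} ns")
print(f"dict lookup:         {t_dict * 1e9:.0f} ns")
```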
3 times.
This is the naive version of that code, because "I will parallelize it later" and I was just getting the logic down.
Turns out, when you use programming languages that are fit for purpose, you don't have to obsess over every function call, because computers are fast.
I think people vastly underestimate how slow Python is.
We are rebuilding an internal service in Java, moving off Python, and our half-assed first attempts are over ten times faster with no engineering effort, precisely because Python takes forever just to call a function. The Python version was dead anyway; it would never get any faster without radical rebuilds and massive changes.
It takes Python about 19 ns to add two integers. Your CPU could do it in about 0.3 ns... in 2004.
That those ints take 28 bytes each to hold in memory is probably why the new Java version of the service also uses a tenth of the memory.
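Both numbers are easy to check yourself on a 64-bit CPython (exact timings will vary by machine and interpreter version):

```python
# Checking the two claims above on 64-bit CPython.
import sys
import timeit

print(sys.getsizeof(1))   # 28 bytes for a small int object

N = 10_000_000
t = timeit.timeit("a + b", setup="a, b = 1, 2", number=N) / N
print(f"{t * 1e9:.1f} ns per integer add")   # on the order of 10-20 ns
```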
Python's issue is that it is incredibly slow in use cases that surprise average developers: very basic stuff, like calling a function or accessing a dictionary.
If Python didn't have such an enormous number of popular C- and C++-based libraries, it would not be here. It was saved by NumPy etc.
The seek time of a consumer-grade hard disk is said to be on the order of 10 ms. That's roughly the latency of a very high-quality FTTH connection. Meaning that if you run an HDD rather than an SSD, a swap file in the cloud could potentially be faster than a local one (especially when you consider multiple reads/writes that could be done in parallel).
It's not exactly downloading more RAM, but it's close enough to call it that for the joke.
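The arithmetic behind that comparison, as a sketch (the RTT and concurrency figures are my own illustrative assumptions):

```python
# Rough arithmetic behind the HDD-vs-network comparison above.
hdd_seek_s = 0.010   # ~10 ms average seek on a consumer HDD
net_rtt_s  = 0.010   # ~10 ms round trip on a good FTTH link (assumed)
in_flight  = 8       # concurrent requests the remote end can serve (assumed)

hdd_iops = 1 / hdd_seek_s          # ~100 random reads/s, inherently serial
net_iops = in_flight / net_rtt_s   # ~800 reads/s if requests are pipelined

print(f"local HDD:   ~{hdd_iops:.0f} random reads/s")
print(f"remote swap: ~{net_iops:.0f} reads/s with {in_flight} in flight")
```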
> This is mind-bending stuff, but it becomes intuitive with a bit of practice.
The problem is not the language; it's just that you did not spend enough time learning it the proper way.