Twirrim commented on GMP damaging Zen 5 CPUs?   gmplib.org/gmp-zen5... · Posted by u/sequin
mk_stjames · a day ago
This was my first question as well; I thought it had been a long, long time since you could fry a CPU by taking away the heatsink.

As in... what, AMD K6 / early Pentium 4 days was the last time I remember hearing about a CPU cooler failing and frying a CPU?

Twirrim · a day ago
It was some time around then. I remember AMD being later than Intel to add that kind of thermal protection.
Twirrim commented on A bug saved the company   weblog.rogueamoeba.com/20... · Posted by u/ingve
ffsm8 · 2 days ago
I see you weren't around when such shareware was the norm.

With a two-week trial, the effort to reset the app whenever the restriction popped up was minuscule, hence nobody paid money for it.

Twirrim · 2 days ago
This is almost certainly why the bug changed everything.

It's really hard to do time-limited trials in any durable fashion, where "durable" encompasses more than a single run of the program. Something, somewhere, has to persist some kind of indication to the program of when that period started, and you can always modify it, nuke it, etc.
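As a rough sketch of the mechanics (the file name and format here are made up, not how any particular app does it):

    import json, os, time

    TRIAL_DAYS = 14
    # Hypothetical location; real apps tend to hide this in prefs, the
    # registry, or somewhere less obvious.
    STATE_FILE = os.path.expanduser("~/.someapp_trial.json")

    def trial_expired():
        # First run: record when the trial started.
        if not os.path.exists(STATE_FILE):
            with open(STATE_FILE, "w") as f:
                json.dump({"started": time.time()}, f)
            return False
        with open(STATE_FILE) as f:
            started = json.load(f)["started"]
        # Deleting STATE_FILE, or editing "started", resets the clock.
        return time.time() - started > TRIAL_DAYS * 86400

Hiding the marker somewhere sneakier only makes the reset less obvious, never impossible.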

Twirrim commented on Dynamically patch a Python function's source code at runtime   ericmjl.github.io/blog/20... · Posted by u/apwheele
diggan · 4 days ago
> Also, why is every damn post these days somehow framed in an AI context? It's exhausting.

Every 5-to-10-year segment of my life has somehow had one or two "this is the future" hypes running alongside it. Previously it was X, now it's Y. And most of the time, everything else gets connected to the currently hyped subject, whether it's actually related or not.

The only thing I've found helpful is changing my own perspective and framing. I'll read an article like this one, which is only tangentially related to AI and whose meat is about something else, so I mentally ignore the AI parts and frame it some other way in my head.

Suddenly people can hitch their articles to the hyped subject and I don't mind; I'm reading for other purposes and still get takeaways that are helpful. A tiny Jedi mind trick for avoiding that exhaustion :)

Twirrim · 4 days ago
AI, blockchain, Rust, Go, serverless, NoSQL, Ruby on Rails... the list goes on and on :-)

Some of it gets really annoying on the business side, because companies like Gartner jump on the trends, and they have enough influence that businesses have to pay attention. When serverless was a thing, every cloud provider effectively had to add serverless offerings, even if it made zero sense and no customers were asking for it, simply to satisfy Gartner (and their ilk) and be seen as innovating and ahead of the curve. The same thing happened with blockchain, and it is currently happening with AI.

Twirrim commented on Writing Speed-of-Light Flash Attention for 5090 in CUDA C++   gau-nernst.github.io/fa-5... · Posted by u/dsr12
storus · 5 days ago
Only GPU-poors run Q-GaLore and similar tricks.
Twirrim · 5 days ago
Even the large cloud AI services are focusing on this, because it drives down the average "cost per query", or whatever you want to call it. For inference, arguably even more than for training, the smaller and more efficient they can make it, the better their bottom line.
Twirrim commented on The first Media over QUIC CDN: Cloudflare   moq.dev/blog/first-cdn/... · Posted by u/kixelated
barosl · 6 days ago
I tested the demo at https://moq.dev/publish/ and it's buttery as hell. Very impressive. Thanks for the great technology!

Watching the Big Buck Bunny demo at https://moq.dev/watch/?name=bbb on my mobile phone shows a lot of horizontal black lines. (Strangely, it is fine on my PC despite using the same Wi-Fi network.) Is it due to buffer size? Can I increase it client-side, or should it be done server-side?

Also, thanks for not missing South Korea in your "global" CDN map!

Twirrim · 5 days ago
In Chrome on my OnePlus 10, I get flickering black lines routinely. The fact that they run from somewhere along the top down towards the right makes me wonder if it's a refresh artifact; it's sort of like the rolling shutter effect.
Twirrim commented on Benchmarks for Golang SQLite Drivers   github.com/cvilsmeier/go-... · Posted by u/cvilsmeier
maxmcd · 7 days ago
This library is wild: https://github.com/cvilsmeier/sqinn

SQLite over stdin, to a subprocess, and it's fast!
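Not sqinn's actual wire protocol, but a toy Python version of the same idea, driving the stock sqlite3 shell over stdin/stdout pipes:

    import subprocess

    # Spawn the sqlite3 shell as a child process and talk to it purely
    # over pipes; no C API bindings in this process at all.
    proc = subprocess.Popen(
        ["sqlite3", ":memory:"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )

    sql = """
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO users (name) VALUES ('alice'), ('bob');
    SELECT * FROM users;
    """
    out, _ = proc.communicate(sql)   # send SQL, read the results back as text
    print(out)                       # e.g. "1|alice\n2|bob"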

Twirrim · 7 days ago
It's wild to me that stdin/stdout is apparently significantly faster than using the API in so many cases.

That's the kind of result that makes me wonder if there is something odd with the benchmarking.

Twirrim commented on AWS in 2025: Stuff you think you know that's now wrong   lastweekinaws.com/blog/aw... · Posted by u/keithly
kelnos · 8 days ago
> I can't talk about it, but I've yet to see an accurate guess at how Glacier was originally designed.

It feels odd that this is some sort of secret. Why can't you talk about it?

Twirrim · 8 days ago
I signed NDAs. I wish Glacier were more open about its history, because it's honestly interesting, and the team has a number of notable innovations in how they approach things.
Twirrim commented on AWS in 2025: Stuff you think you know that's now wrong   lastweekinaws.com/blog/aw... · Posted by u/keithly
zbentley · 8 days ago
It was for spreading load out. If someone was managing resources in a bunch of accounts and always defaulted to, say, 1b, AWS randomizing which AZ names corresponded to which datacenter segments avoided hot spots.

Canonical AZ naming was provided because, I'd bet, they realized that the users who needed stable AZ identifiers were rarely the same users causing hot spots by always picking the same AZ.
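For anyone who needs the stable identifiers: the EC2 API exposes both the per-account name and the canonical zone ID. Rough boto3 sketch (region chosen arbitrarily):

    import boto3

    # ZoneName (us-east-1a, ...) is shuffled per account; ZoneId (use1-az1, ...)
    # refers to the same physical AZ for everyone.
    ec2 = boto3.client("ec2", region_name="us-east-1")
    for az in ec2.describe_availability_zones()["AvailabilityZones"]:
        print(az["ZoneName"], "->", az["ZoneId"])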

Twirrim · 8 days ago
Almost everyone went with 1a, every time. It caused significant issues for all sorts of reasons, especially considering the latency target for network connections between data centres in an AD.
Twirrim commented on AWS in 2025: Stuff you think you know that's now wrong   lastweekinaws.com/blog/aw... · Posted by u/keithly
jp57 · 8 days ago
> Glacier restores are also no longer painfully slow.

I had a theory (based on no evidence I'm aware of, except knowing how Amazon operates) that the original Glacier service operated out of an Amazon fulfillment center somewhere. When you put in a request for your data, a picker would go to a shelf, pick up some removable media, take it back, and slot it into a drive in a rack.

This, BTW, is how tape backups on timesharing machines used to work once upon a time. You'd put in a request for a tape and the operator in the machine room would have to go get it from a shelf and mount it on the tape drive.

Twirrim · 8 days ago
I can't talk about it, but I've yet to see an accurate guess at how Glacier was originally designed. I think I'm in safe territory to say Glacier operated out of the same data centers as every other AWS service.

It's been a long time, and features launched since I left make clear that some changes have happened, but I'll still tread a little carefully (though probably no one there cares anymore):

One of the most crucial things in all walks of engineering and product management is learning how to manage customer expectations. If you say customers can only upload 10 images, and then allow them to upload 12, they will come to expect that you will always let them upload 12. Sometimes it's really valuable to manage expectations so that you give yourself space for future changes you may want to make. It's a lot easier to go from supporting 10 images to 20 than the reverse.

Twirrim commented on Non-Uniform Memory Access (NUMA) is reshaping microservice placement   codemia.io/blog/path/NUMA... · Posted by u/signa11
bboreham · 11 days ago
Very detailed and accurate description. The author clearly knows way more than I do, but I would venture a few notes:

1. In the cloud, it can be difficult to know the NUMA characteristics of your VMs. AWS, Google, etc., do not publish it. I found the ‘lscpu’ command helpful (see the sketch after this list).

2. Tools like https://github.com/SoilRos/cpu-latency plot the core-to-core latency on a 2d grid. There are many example visualisations on that page; maybe you can find the chip you are using.

3. If you get to pick VM sizes, pick ones the same size as a NUMA node on the underlying hardware, e.g. prefer the 64-core m8g.16xlarge over the 96-core m8g.24xlarge, which will span two nodes.
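A quick sketch of pulling the same topology straight from sysfs on Linux (a subset of what lscpu and numactl -H report; assumes /sys is available):

    from pathlib import Path

    # One directory per NUMA node; each lists its CPUs and memory.
    for node in sorted(Path("/sys/devices/system/node").glob("node[0-9]*")):
        cpus = (node / "cpulist").read_text().strip()
        mem_kb = int((node / "meminfo").read_text().splitlines()[0].split()[-2])
        print(f"{node.name}: cpus {cpus}, {mem_kb // 1024} MiB")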

Twirrim · 10 days ago
At OCI, our VM shapes are all single NUMA node by default. We only relatively recently added support for cross-NUMA instances, precisely because of the complications that NUMA introduces.

There are so many performance quirks, and so much software doesn't account for them yet (in part, I'd bet, because most development environments don't have multiple NUMA domains).

Here's a fun example we found a few years ago (not sure whether work has happened in the upstream kernel since): the Linux page cache wasn't fully NUMA aware, and spans NUMA nodes. Someone at work was looking specifically at NUMA performance and chose to benchmark databases across NUMA nodes, trying the client on the same NUMA node and then on the other one, using numactl to pin. After a bunch of tests it looked like client and server together on NUMA 0 were appreciably faster than client and server together on NUMA 1. After a reboot and re-running the tests, it had flipped: NUMA 1 faster than NUMA 0.

Eventually they worked out that the fast node was whichever one was benchmarked first after a reboot, and from there figured out that on a fresh boot the database client library ended up in the page cache on that NUMA node. So if they benchmarked with the server in 0 and the client in 1, and then benchmarked with both server and client in 0, the client's access to the client library still reached across to the page-cached copy on node 1, paying a nice latency penalty over and over. His solution was to run the client in a NUMA-pinned Docker container so that the library was a unique file to the OS.
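For reference, the pinning itself is the easy part; a rough sketch of the kind of setup described above (the numactl flags are real, the server and client commands are placeholders):

    import subprocess

    def pin(node, cmd):
        # Bind both CPU scheduling and memory allocation to one NUMA node.
        return ["numactl", f"--cpunodebind={node}", f"--membind={node}"] + cmd

    # Placeholder commands: database server pinned to node 0, benchmark
    # client pinned to node 1 (or node 0, for the same-node comparison).
    server = subprocess.Popen(pin(0, ["./db-server"]))
    client = subprocess.run(pin(1, ["./db-client", "--benchmark"]),
                            capture_output=True, text=True)
    print(client.stdout)
    server.terminate()

The subtle part, as above, is everything numactl doesn't control, like where the page cache already holds your files.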

u/Twirrim

Karma: 10,781 · Cake day: July 28, 2010