5090: 210 TF / $2k == 105 TF/$k
B200: 2250 TF / $40k == 56 TF/$k
Getting only 2x the FLOPs per dollar probably isn't worth the hassle of having to rack 10x as many GPUs, while having no NVLink.
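For anyone who wants to redo the math with their own prices, here is a rough sketch of the comparison. The TFLOPS and price figures are just the ones quoted above; the dictionary layout and function name are made up for illustration.

```python
# Peak TFLOPS and approximate prices as quoted in the comment above.
gpus = {
    "5090": {"tflops": 210, "price_usd": 2_000},
    "B200": {"tflops": 2250, "price_usd": 40_000},
}


def tflops_per_kusd(tflops, price_usd):
    """TFLOPS per thousand dollars of hardware."""
    return tflops / (price_usd / 1000)


per_dollar = {name: tflops_per_kusd(**spec) for name, spec in gpus.items()}

# How much better the 5090 is per dollar, and how many 5090s it takes
# to match one B200 on raw FLOPs (ignoring interconnect entirely).
ratio = per_dollar["5090"] / per_dollar["B200"]
cards_needed = gpus["B200"]["tflops"] / gpus["5090"]["tflops"]

for name, value in per_dollar.items():
    print(f"{name}: {value:.2f} TF/$k")
print(f"5090 advantage per dollar: {ratio:.2f}x")
print(f"5090s per B200 (raw FLOPs): {cards_needed:.1f}")
```

This only compares raw peak FLOPs; it says nothing about memory capacity, interconnect, power, or rack space, which is the whole point of the objection.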
Depends on the workload.
Normally you would go read() -> write() so:
1. Disk -> page cache (DMA)
2. Kernel -> user copy (read)
3. User -> kernel copy (write)
4. Kernel -> NIC (DMA)
sendfile():
1. Disk -> page cache (DMA)
No user-space copies: the kernel wires those pages straight to the socket
2. Kernel -> NIC (DMA)
So basically, it eliminates 1-2 memory copies along with the associated cache pollution and memory bandwidth overhead. If you are running high-QPS web services where syscall and copy overheads dominate (for example, CDNs or static file serving), the gains can be really big. Based on my observations, this can mean double-digit reductions in CPU usage and up to ~2x higher throughput.
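A minimal sketch of the two paths in Python, in case it helps to see them side by side. It uses `socket.sendfile()`, which calls `os.sendfile()` where the platform supports it and falls back to plain `send()` otherwise. The chunk size, the function names, and the socketpair-based demo are assumptions for illustration; a real server would be sending to a TCP client.

```python
import os
import socket
import tempfile

CHUNK = 64 * 1024  # hypothetical buffer size for the read()/write() path


def send_with_read_write(path, sock):
    """Classic path: every chunk goes kernel -> user, then user -> kernel."""
    sent = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK)   # kernel -> user copy
            if not chunk:
                break
            sock.sendall(chunk)     # user -> kernel copy
            sent += len(chunk)
    return sent


def send_with_sendfile(path, sock):
    """sendfile path: the file data never passes through a user-space buffer
    when os.sendfile() is available; otherwise Python falls back to send()."""
    with open(path, "rb") as f:
        return sock.sendfile(f)


def demo():
    """Send a small temp file both ways over a local socket pair."""
    payload = b"x" * 4096  # kept small so it fits in the socket buffer
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    results = {}
    try:
        for name, fn in [("read_write", send_with_read_write),
                         ("sendfile", send_with_sendfile)]:
            a, b = socket.socketpair()
            with a, b:
                n = fn(path, a)
                a.shutdown(socket.SHUT_WR)  # signal EOF to the receiver
                received = b""
                while True:
                    data = b.recv(65536)
                    if not data:
                        break
                    received += data
                results[name] = (n, received == payload)
    finally:
        os.unlink(path)
    return results


if __name__ == "__main__":
    print(demo())
```

Both functions deliver the same bytes; the difference only shows up in CPU time and memory bandwidth once files and request rates get large, so a toy demo like this will not show the speedup by itself.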
It's not blanket good advice for all things.
I honestly wish this paper actually showed what it claims, since it is a significant open problem to understand CoT reasoning relative to the underlying training set.
I replied below in this thread with the specific post, 6 months ago.
> After that, a top goal for us is to unify o-series models and GPT-series models by creating systems that can use all our tools, know when to think for a long time or not, and generally be useful for a very wide range of tasks.
> In both ChatGPT and our API, we will release GPT-5 as a system that integrates a lot of our technology, including o3. We will no longer ship o3 as a standalone model.
Obviously it's not.
Normally they have to fight VPN issues anyway, but having a sovereign state inject your packets is certainly a fun new one.
There are special virtual SIM cards that provide access to services from mainland China, as well as VPNs that function normally without issues. I used both while I was in China.
That comes out to 87 disks a day. Assuming a 7-day retention period (this is on the high side), it’s not unthinkable to have a 600-1800 disk deployment (accounting for replication copies).
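The back-of-envelope math, as a quick sketch. The 87 disks/day and 7-day retention come from the estimate above; reading the 600-1800 range as 1x to 3x replication is my assumption, as are the names used here.

```python
DISKS_PER_DAY = 87               # figure quoted above
RETENTION_DAYS = 7               # the "high side" retention period
REPLICATION_FACTORS = (1, 2, 3)  # assumption: between 1 and 3 copies of the data


def disks_needed(disks_per_day, retention_days, replication):
    """Total disks required to hold the retention window at a given replication."""
    return disks_per_day * retention_days * replication


estimates = {
    r: disks_needed(DISKS_PER_DAY, RETENTION_DAYS, r)
    for r in REPLICATION_FACTORS
}

for r, n in estimates.items():
    print(f"{r}x replication: {n} disks")
```

With those inputs the range runs from 609 disks with a single copy to 1827 disks with three, which lines up with the 600-1800 estimate.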
Yep. A whole week can easily be stored in 1-2 racks.