It also means the latency due to page faults is shifted from inside mmapReader.ReadRecord() (where it would be expected) to wherever in the application the bytes are first accessed, leading to spooky, unpredictable latency spikes in what are otherwise pure functions. That inevitably leads to wild arguments about how bad GC stalls are :-)
An apples-to-apples comparison would copy the bytes out of the mmap buffer and return the resulting slice.
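For illustration, a minimal sketch of that copy, assuming a hypothetical helper where `mmapped` is the mapped region (the names here are mine, not from the original code):

```go
// Hypothetical example: a sub-slice of the mapping is still backed by the
// file, so the page fault happens on first access, possibly far from the
// read call. Copying at the read site moves that cost back where you expect.
func readRecordCopy(mmapped []byte, off, n int) []byte {
	rec := mmapped[off : off+n] // still points into the mapping
	out := make([]byte, n)
	copy(out, rec) // any page faults happen here, inside the "read"
	return out     // safe to hold after the file is remapped or closed
}
```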
You shouldn't replace every single file access with mmap. But when it makes sense, mmap is a big performance win.
> The last couple of weeks I've been working on an HTTP-backed filesystem.
It feels like these are micro-optimizations that are going to be swamped by the whole HTTP cycle anyway.
There is also the benchmark issue:
The enhanced CDB format seems to be focused on read-only benefits, as writes introduce a lot of latency and issues with mmap. In other words, there is a need to freeze for the mmap, then unfreeze and write for updates, then freeze for the mmap again ...
This cycle introduces overhead, does it not? Has this been benchmarked? Because from what I am seeing, the benefits are mostly in the frozen state (aka read-only).
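To make the cycle concrete, here is a rough Linux/Unix sketch of what freeze → write → re-freeze looks like with a plain mmap; this is illustrative, not the actual enhanced-CDB tooling:

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// Illustrative only: "freeze" = mmap the finished file read-only; updating
// means rewriting the file and taking a fresh mapping. Every update pays
// for the rewrite plus the remap, which is the overhead in question.
func remapAfterUpdate(path string, update func(*os.File) error) ([]byte, error) {
	f, err := os.OpenFile(path, os.O_RDWR|os.O_CREATE|os.O_APPEND, 0o644)
	if err != nil {
		return nil, err
	}
	defer f.Close() // the mapping stays valid after the fd is closed

	if err := update(f); err != nil { // "unfrozen" phase: mutate the file
		return nil, err
	}
	fi, err := f.Stat()
	if err != nil {
		return nil, err
	}
	// "Frozen" phase: map the new contents read-only for fast lookups.
	return syscall.Mmap(int(f.Fd()), 0, int(fi.Size()),
		syscall.PROT_READ, syscall.MAP_SHARED)
}

func main() {
	data, err := remapAfterUpdate("records.db", func(f *os.File) error {
		_, err := f.Write([]byte("new record\n"))
		return err
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("mapped %d bytes\n", len(data))
}
```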
If the data is changed infrequently, why not just use JSON? No matter how slow it is, if you're just going to do HTTP requests for the directory listing, your overhead is not the actual file format.
If this enhanced file format were used as file storage, and you want to be able to read files fast, that is a different matter. Then there are ways around it by keeping "part" files, where files 1 ... 1000 are in file.01, files 1001 ... 2000 in file.02 (thus reducing overhead from the file system). Those part files are memory mapped for fast reading, and updates are handled by invalidating files and rewriting (as I do not see any delete/vacuum ability in the file format).
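Roughly what I mean by the part-file approach, as a hypothetical sketch (names and layout are made up):

```go
package partstore

import "fmt"

// Hypothetical sketch: many small files are packed into a few large part
// files, with an in-memory index of offsets. The part files themselves
// would be mmapped (or at least kept open) for fast reads.
type entry struct {
	part string // e.g. "file.01"
	off  int64
	size int64
}

type Store struct {
	index map[string]entry  // logical file name -> location in a part file
	parts map[string][]byte // part file name -> its mmapped contents
}

// Read copies the record out of the mapped part file, so callers never hold
// references into the mapping (important if a part file gets rewritten).
func (s *Store) Read(name string) ([]byte, error) {
	e, ok := s.index[name]
	if !ok {
		return nil, fmt.Errorf("%s: not found", name)
	}
	m := s.parts[e.part]
	out := make([]byte, e.size)
	copy(out, m[e.off:e.off+e.size])
	return out, nil
}

// An update would write a replacement part file, remap it, swap the index
// entries, and eventually garbage-collect the superseded regions.
```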
So, the actual benefits just for a file directory listing db escape me.
The HTTP request needs to actually be actioned by the server before it can respond. Reducing the time it takes for the server to do the thing (accessing files) will meaningfully improve overall performance.
Switching out to JSON will meaningfully degrade performance. For no benefit.
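If anyone wants to sanity-check that for their own record shape, a throwaway micro-benchmark along these lines (hypothetical layout and field names, put in a _test.go file and run with `go test -bench .`) makes the gap visible:

```go
package fmtbench

import (
	"encoding/binary"
	"encoding/json"
	"testing"
)

type dirEntry struct {
	Name string `json:"name"`
	Size int64  `json:"size"`
}

var jsonRec = []byte(`{"name":"kernel.img","size":8388608}`)

// Binary layout assumed for the sketch: 8-byte little-endian size, then the
// name bytes. Real formats (CDB etc.) differ, but the point is the same:
// no per-record parsing beyond slicing.
var binRec = append(binary.LittleEndian.AppendUint64(nil, 8388608), "kernel.img"...)

func BenchmarkJSONDecode(b *testing.B) {
	for i := 0; i < b.N; i++ {
		var e dirEntry
		if err := json.Unmarshal(jsonRec, &e); err != nil {
			b.Fatal(err)
		}
	}
}

func BenchmarkBinaryDecode(b *testing.B) {
	for i := 0; i < b.N; i++ {
		e := dirEntry{
			Size: int64(binary.LittleEndian.Uint64(binRec[:8])),
			Name: string(binRec[8:]),
		}
		_ = e
	}
}
```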
ICEs have MANY wear-out-fast parts (where "fast" is relative) requiring lube, and lube itself suggests the risk of frictional degradation.
But on an EV, that's basically the only thing that needs somewhat regular "oil changes". Whereas ICE motors & transmissions also need fluid changes regularly.
- having a lower latency handshake
- avoiding some badly behaved ‘middleware’ boxes between users and servers
- avoiding resetting connections when user IP addresses change
- avoiding head of line blocking / the increased cost of many connections ramping up
- avoiding poor congestion control algorithms
- probably other things too
And those are all things about working better with the kind of network situations you tend to see between users (often on mobile devices) and servers. I don’t think QUIC was meant to be fast by reducing OS overhead on sending data, and one should generally expect it to be slower for a long time until operating systems become better optimised for this flow and hardware supports offloading more of the work. If you are Google then presumably you are willing to invest in specialised network cards/drivers/software for that.
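On the OS-overhead point: a large chunk of the cost of doing QUIC in userspace is one syscall per UDP datagram, which stacks mitigate with batched sends (sendmmsg) and GSO. A rough sketch of the batching idea using golang.org/x/net/ipv4, purely illustrative (the destination address and packet sizes are placeholders):

```go
package main

import (
	"log"
	"net"

	"golang.org/x/net/ipv4"
)

func main() {
	conn, err := net.ListenUDP("udp4", &net.UDPAddr{})
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	dst := &net.UDPAddr{IP: net.IPv4(192, 0, 2, 1), Port: 4433} // placeholder address
	p := ipv4.NewPacketConn(conn)

	// Build a batch of datagrams and hand them to the kernel in one
	// sendmmsg call instead of paying a syscall per packet.
	msgs := make([]ipv4.Message, 16)
	for i := range msgs {
		msgs[i] = ipv4.Message{
			Buffers: [][]byte{make([]byte, 1200)}, // typical QUIC packet size
			Addr:    dst,
		}
	}
	n, err := p.WriteBatch(msgs, 0)
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("wrote %d datagrams in one batch", n)
}
```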
Jesus, that's bad. Does anyone know if userspace QUIC implementations are also this slow?
Just to mention a few headaches I've been dealing with over the years: multicast sockets that join the wrong network adapter interface (due to adapter priorities), losing multicast membership after resume from sleep/hibernate, switches/routers just dropping multicast membership after a while (especially when running in VMs and "enterprise" systems like SUSE Linux and Windows Server), all kinds of socket reuse problems, etc.
I don't even dare to think about how many hours I have wasted on issues listed above. I would never rely on multicast again when developing a new system.
But that said, the application suite, a mission control system for satellites, works great most of the time (typically on small, controlled subnets, using physical installations instead of VMs) and has served us well.
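For the first headache (joining on the wrong adapter), the usual mitigation is to name the interface explicitly rather than letting the OS pick. A minimal sketch with golang.org/x/net/ipv4; the interface name, port, and group are placeholders:

```go
package main

import (
	"log"
	"net"

	"golang.org/x/net/ipv4"
)

func main() {
	// Pick the adapter explicitly instead of relying on OS adapter priorities.
	ifi, err := net.InterfaceByName("eth0") // placeholder interface name
	if err != nil {
		log.Fatal(err)
	}

	c, err := net.ListenPacket("udp4", "0.0.0.0:9999") // placeholder port
	if err != nil {
		log.Fatal(err)
	}
	defer c.Close()

	p := ipv4.NewPacketConn(c)
	group := &net.UDPAddr{IP: net.IPv4(239, 1, 1, 1)} // placeholder group
	if err := p.JoinGroup(ifi, group); err != nil {
		log.Fatal(err)
	}
	// Re-joining periodically (or after resume) also papers over switches
	// that silently drop membership, at the cost of extra IGMP traffic.

	buf := make([]byte, 1500)
	n, _, src, err := p.ReadFrom(buf)
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("got %d bytes from %v", n, src)
}
```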
"persecuting genius" is literally what is happening.
What on earth are you using that gets you down to single digits!?
If you're working in AWS, you are almost certainly hitting a router, which is comparatively slow. Not to mention you are dealing with virtualized hardware, and you are probably sharing all the switches & routers along your path (if someone else's packet is ahead of yours in the queue, you have to wait).
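Back-of-envelope for the queueing point, assuming a single 1500-byte packet ahead of yours (serialization delay only, ignoring router processing and virtualization overhead):

```go
package main

import "fmt"

func main() {
	const packetBits = 1500 * 8 // one full-size packet ahead of you in the queue
	for _, linkGbps := range []float64{1, 10, 25, 100} {
		delayUs := packetBits / (linkGbps * 1e9) * 1e6
		fmt.Printf("%5.0f Gbps link: +%.2f µs per queued packet\n", linkGbps, delayUs)
	}
	// ~1.2 µs at 10 Gbps; a handful of queued packets plus a router hop or
	// two easily pushes RTTs out of the single-digit range.
}
```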