https://blog.cloudflare.com/cloudflare-servers-dont-own-ips-...
In summary, the location at which an IP egresses the Cloudflare network has nothing to do with the geo-IP mapping of that IP. In some cases the egress location is optimised for "location closest to the user", but that is not always true either.
And then there is the Internet. Often traffic from some country (say Iran) egresses from a totally different place (like Frankfurt) due to geopolitics and simply where the cables run.
Ok, so you are dealing with a classic - you measure A, but what matters is B. For "load" balancing a decent metric is, well, response time (and jitter).
For data partitioning - I guess number of rows is not the right metric? Change it to number*avg_size or something?
If you can't measure the thing directly, then take a look at stuff like "PID controller". This can be approached as a typical control loop problem, although in 99% of cases doing PID for software systems is overkill.
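For readers who haven't met the term, here is a minimal sketch of a textbook discrete PID loop in Python. The class name, gains and the setpoint/measured naming are illustrative assumptions, not anything taken from a real system; tuning the gains is the hard part and is not covered here.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook discrete PID controller (illustrative names and gains)."""
    kp: float                 # proportional gain
    ki: float                 # integral gain
    kd: float                 # derivative gain
    integral: float = 0.0     # accumulated error over time
    prev_error: float = 0.0   # error seen on the previous update

    def update(self, setpoint: float, measured: float, dt: float) -> float:
        """Return the control output for one time step of length dt."""
        error = setpoint - measured
        self.integral += error * dt
        # On the first call prev_error is 0, so the derivative term sees
        # a jump ("derivative kick"); real controllers often filter this.
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


if __name__ == "__main__":
    # Hypothetical use: steer a measured value (say, latency in ms) to 100.
    pid = PID(kp=0.5, ki=0.1, kd=0.0)
    print(pid.update(setpoint=100.0, measured=140.0, dt=1.0))
```

The output would be fed back as an adjustment to whatever knob you control (a weight, a rate limit), and the loop repeated on every measurement interval.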
Aren't both unnecessary these days, given the "async"-like and zero-copy interfaces that are now available (like io_uring, where it's still handled by the kernel), along with the near non-existence of single-core processors in modern times?
This is very much a newbie way of thinking. How do you know? Did you profile it?
It turns out there is surprisingly little dumb zero-copy potential at CF. Most of the stuff is TLS, so stuff needs to go through userspace anyway (kTLS exists, but I failed to actually use it, and what about QUIC).
Most of the CPU is burned on dumb things, like application logic. Turns out data copying, encryption and compression are actually pretty fast. I'm not saying these areas aren't ripe for optimization, but the majority of the cost was historically in much more obvious areas.
- parsing addresses is well defined (try parsing ::1%3)
- since 127.0.0.2 is on loopback, ::2 surely also would be
- interface number on Linux is unique
- unix domain socket names are zero-terminated (abstract are not)
- sin6_flowinfo matters (it doesn't unless you opt in with setsockopt)
- sin6_scope_id matters (it doesn't unless on link-local range)
(I wonder if scope_id would work on IPv4-mapped IPv6, but if I remember right I checked and it didn't)
- In IPv4, scope_id doesn't exist (true, but the same effect can be achieved by binding to an interface)
and so on...
Years ago I tried to document all the quirks I knew about: https://idea.popcount.org/2019-12-06-addressing/
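Two of the assumptions in the list can be poked at with nothing but Python's standard `ipaddress` module. This is only a sketch: the helper names `is_loopback` and `split_scope` are made up for illustration, and the checks reflect how addresses are classified, not what a particular kernel will actually route or let you bind to.

```python
import ipaddress


def is_loopback(text: str) -> bool:
    """Return True if the textual address falls in a loopback range."""
    return ipaddress.ip_address(text).is_loopback


def split_scope(text: str):
    """Split 'addr%zone' into (IPv6Address, zone); zone is None if absent.

    The zone is split off by hand because not every parser accepts the
    '%zone' suffix, which is exactly the point of the "::1%3" example.
    """
    addr, _, zone = text.partition("%")
    return ipaddress.IPv6Address(addr), (zone or None)


if __name__ == "__main__":
    # IPv4 loopback is the whole 127.0.0.0/8; IPv6 loopback is only ::1.
    print("127.0.0.2 loopback?", is_loopback("127.0.0.2"))
    print("::2 loopback?", is_loopback("::2"))
    # The zone comes back as a string; it may be a number or a name.
    print(split_scope("::1%3"))
```

The zone stays a string here on purpose: whether "3" is an interface index or an interface name is something only the host can resolve.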