About how long do these typically take to execute? Minute, Tens of Minutes, Hours?
My work is very iterative, with a feedback loop that is only a few minutes long.
All of this is done in a Python environment, with Rust used to speed up critical code/computations. (The Rust code is delivered as Python modules.)
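For context, the usual way to deliver Rust as Python modules is PyO3, built with a tool like maturin. A minimal sketch with made-up module and function names, since the commenter doesn't say which tooling they actually use:

```rust
use pyo3::prelude::*;

/// Dot product of two vectors, done in Rust for speed.
#[pyfunction]
fn dot(a: Vec<f64>, b: Vec<f64>) -> f64 {
    a.iter().zip(b.iter()).map(|(x, y)| x * y).sum()
}

/// Module definition: after building with maturin, Python can do
/// `import fastmath; fastmath.dot([1.0, 2.0], [3.0, 4.0])`.
#[pymodule]
fn fastmath(m: &Bound<'_, PyModule>) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(dot, m)?)?;
    Ok(())
}
```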
The work is interesting, and different challenges arise when you have to process and compute over datasets that are updated with tens of TBs of fresh data daily.
I downvote people when they say they don't know what something is when they could have used an LLM to explain it to them.
Buckets use 16 bytes because SSE2 and ARM NEON SIMD are basically guaranteed.
I was shocked to read how Swiss tables, proclaimed to be the fastest hash tables, actually work. It's just open addressing with linear probing and no technique to deal with collisions and clustering. Plus the initial version rounded hash % capacity down to bucket size, thus using 4 fewer bits of hash, leading to even more collisions and clustering. Yet the super-fast probe apparently made it a non-issue? Mind-boggling. (A later version allowed scanning from an arbitrary position by mirroring the first bucket as the last.)
Could you explain how that makes it a non-issue? It seems counter-intuitive to me that it solves the problem by just probing faster?
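The reason probing faster works: each probe step filters an entire 16-slot group with a couple of SIMD instructions, and the control bytes (one 7-bit hash fragment per slot) mean a full key comparison happens almost only on a true match. A minimal sketch of that group-match step in Rust with SSE2 intrinsics (the 0x80 empty marker follows Abseil's convention; this is an illustration of the idea, not the library's actual code):

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

// One probe step over a 16-byte control group: compare all 16 control
// bytes against the 7-bit hash fragment (h2) in parallel and return a
// bitmask with one bit per candidate slot.
#[cfg(target_arch = "x86_64")]
fn match_group(ctrl: &[u8; 16], h2: u8) -> u16 {
    // SAFETY: SSE2 is baseline on x86_64, and ctrl is a valid 16-byte read.
    unsafe {
        let group = _mm_loadu_si128(ctrl.as_ptr() as *const __m128i);
        let needle = _mm_set1_epi8(h2 as i8);
        let eq = _mm_cmpeq_epi8(group, needle); // 0xFF where a byte matches
        _mm_movemask_epi8(eq) as u16            // collapse to 16 bits
    }
}

#[cfg(target_arch = "x86_64")]
fn main() {
    let mut ctrl = [0x80u8; 16]; // 0x80 = kEmpty in Abseil's convention
    ctrl[3] = 0x42;
    ctrl[9] = 0x42;
    let mask = match_group(&ctrl, 0x42);
    // Only slots 3 and 9 need a full key comparison.
    assert_eq!(mask, (1 << 3) | (1 << 9));
}

#[cfg(not(target_arch = "x86_64"))]
fn main() {}
```

Collisions and clustering still happen; they're just amortized. Scanning a cluster of 16 slots costs roughly the same three instructions as checking one slot in a scalar table, and the hash fragments filter out nearly all false positives before any key is touched.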
What's real trouble is when the hardware fault is something like one of the 16 NIC queues stopping, so most connections work, but not all (depending on the hash of the 4-tuple). Or some bit in the RAM fails and now you're hitting thousands of ECC-correctable errors per second and your effective CPU capacity is down to 10%... the system is now too slow to work properly, but manages to stay connected to dist and still attracts traffic it can't reasonably serve.
But OS/thread hangs are avoidable in my experience. If you run your BEAM system with very few OS processes, there's no reason for the OS to cause trouble.
But on the topic of a 15ms pause... it's likely that that pause is causally related to cascading pauses; it might be the beginning or the end or the middle. But when one thing slows down, others do too, and some processes can't recover once the backlog gets over a critical threshold, which is kind of unknowable without experiencing it. WhatsApp had a couple of hacks to deal with this.

A) Our gen_server aggregation framework used our hacky version of priority messages to let the worker determine the age of requests and drop them if they were too old (the sketch below shows the idea).

B) We had a hack to drop all messages in a process's mailbox through the introspection facilities, and sometimes we automated that with cron. Very few processes can work through a mailbox with 1 million messages; dropping them all gets to recovery faster.

C) We tweaked garbage collection to run less often when the mailbox was very large. I think this is addressed by off-heap mailboxes now, but when GC looks through the mailbox every so many iterations and the mailbox is very large, it can drive an unrecoverable cycle: eventually GC time limits throughput below the accumulation rate and you'll never catch up.

D) We added process stats so we could see accumulation and drain rates and estimate time to drain (or tell that a process would never drain), and we built monitoring around that.
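For anyone who wants the shape of (A) outside the BEAM: it's deadline-based load shedding. A minimal sketch in Rust, assuming a hypothetical Request type that carries its enqueue timestamp and an illustrative 500ms cutoff (none of these names or values come from WhatsApp's code):

```rust
use std::sync::mpsc;
use std::time::{Duration, Instant};

// Hypothetical request type: each message carries its enqueue time so the
// worker can judge its age when it finally gets to it.
struct Request {
    enqueued_at: Instant,
    payload: String,
}

// Age cutoff is an illustration; the comment doesn't state the real value.
const MAX_AGE: Duration = Duration::from_millis(500);

fn worker(rx: mpsc::Receiver<Request>) {
    for req in rx {
        // Shed load: if a request sat in the queue past the cutoff, the
        // caller has almost certainly timed out already, so answering it is
        // wasted work that only delays the fresh requests behind it.
        if req.enqueued_at.elapsed() > MAX_AGE {
            continue; // in real code, count drops for the stats in (D)
        }
        println!("handling {}", req.payload);
    }
}

fn main() {
    let (tx, rx) = mpsc::channel();
    tx.send(Request { enqueued_at: Instant::now(), payload: "fresh".into() }).unwrap();
    drop(tx); // close the channel so the worker loop terminates
    worker(rx);
}
```

The point is that the worker pays one cheap clock check per stale message instead of full processing, so the drain rate temporarily exceeds the arrival rate and the backlog collapses.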
What happens to the messages? Do they get processed at a slower rate or on a subsystem that works in the background without having more messages being constantly added? Or do you just nuke them out of orbit and not care? That doesn't seem like a good idea to me since loss of information. Would love to know more about this!
Health insurance is a benefit of employment.
You chose to unemploy yourself; now go pay your own way.
What do you expect after they rejected a 30 percent raise?
They want their pension back and they ain't getting it.