```
const result: ast.Expression[] = [];
p.expect("(");
while (!p.eof() && !p.at(")")) {
  const subexpr = expression(p);
  assert(subexpr !== undefined); // << here
  result.push(subexpr);
  if (!p.at(")")) p.expect(",");
}
p.expect(")");
return result;
```

This is resilient parsing --- we are parsing source code with syntax errors, but we still want to produce a best-effort syntax tree. Although an expression is required by the grammar here, the `expression` function might still return nothing if the user typed some garbage there instead of a valid expression.
However, even if we return nothing due to garbage, there are two possible behaviors:

* We can consume no tokens, guessing that what looks like "garbage" from the perspective of the expression parser is actually the start of the next, larger syntax construct:

  ```
  function f() { let x = foo(1, let not_garbage = 92; }
  ```

  In this example, it would be smart to _not_ consume `let` when parsing `foo(`'s arglist.

* Alternatively, we can consume some tokens, guessing that the user _meant_ to write an expression there:

  ```
  function f() { let x = foo(1, /); }
  ```

  In the above example, it would be smart to skip over `/`.
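For concreteness, here is a minimal sketch of how a parser might pick between the two strategies, written in Zig with a made-up token set (the snippet above is TypeScript, but the recovery logic is language-independent):

```zig
const std = @import("std");

// Toy token set, just enough to show the two recovery strategies.
const Token = enum { number, ident, comma, l_paren, r_paren, keyword_let, slash, eof };

const Parser = struct {
    tokens: []const Token,
    pos: usize = 0,
    errors: usize = 0,

    fn peek(p: *const Parser) Token {
        return if (p.pos < p.tokens.len) p.tokens[p.pos] else .eof;
    }
};

/// Returns false when no expression could be parsed at this position.
fn expressionOpt(p: *Parser) bool {
    switch (p.peek()) {
        // A real expression: consume it.
        .number, .ident => {
            p.pos += 1;
            return true;
        },
        // Strategy 1: `let` (or `)`, or eof) looks like the start of the *next*
        // construct, so consume nothing and let the caller recover.
        .keyword_let, .r_paren, .eof => return false,
        // Strategy 2: a token like `/` cannot start anything useful, so assume
        // the user *meant* an expression, record an error, and skip over it.
        else => {
            p.errors += 1;
            p.pos += 1;
            return false;
        },
    }
}
```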
You mentioned, "E.g., in OP, memory is leaked on allocation failures." Can you clarify a bit more what you mean there?
```
const recv_buffers = try ByteArrayPool.init(gpa, config.connections_max, recv_size);
const send_buffers = try ByteArrayPool.init(gpa, config.connections_max, send_size);
```

If the second `try` fails, the allocation made by the first `try` is leaked. Possible fixes:

A) Clean up individual allocations on failure:
```
const recv_buffers = try ByteArrayPool.init(gpa, config.connections_max, recv_size);
errdefer recv_buffers.deinit(gpa);
const send_buffers = try ByteArrayPool.init(gpa, config.connections_max, send_size);
errdefer send_buffers.deinit(gpa);
```
B) Ask the caller to pass in an arena instead of a gpa to do bulk cleanup (types & code stay the same, but naming & contract change):

```
const recv_buffers = try ByteArrayPool.init(arena, config.connections_max, recv_size);
const send_buffers = try ByteArrayPool.init(arena, config.connections_max, send_size);
```
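To make option B concrete, here is a sketch of the caller's side using `std.heap.ArenaAllocator`. It reuses `ByteArrayPool`, `config`, `recv_size` and `send_size` from the snippets above, so it is not compilable on its own:

```zig
const std = @import("std");

// Sketch only: the arena outlives the pools, and a failure between the two
// `try`s cannot leak, because `arena_state.deinit()` frees everything in bulk.
fn run(gpa: std.mem.Allocator, config: Config) !void {
    var arena_state = std.heap.ArenaAllocator.init(gpa);
    defer arena_state.deinit(); // bulk cleanup at shutdown

    const arena = arena_state.allocator();
    const recv_buffers = try ByteArrayPool.init(arena, config.connections_max, recv_size);
    const send_buffers = try ByteArrayPool.init(arena, config.connections_max, send_size);

    // ... use recv_buffers and send_buffers for the lifetime of the arena ...
    _ = recv_buffers;
    _ = send_buffers;
}
```

The trade-off is that you give up per-pool cleanup: the unit of deallocation becomes the whole arena.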
C) Declare OOMs to be fatal errors:

```
const recv_buffers = ByteArrayPool.init(gpa, config.connections_max, recv_size) catch |err| oom(err);
const send_buffers = ByteArrayPool.init(gpa, config.connections_max, send_size) catch |err| oom(err);

fn oom(_: error{OutOfMemory}) noreturn { @panic("oom"); }
```
You might also be interested in https://matklad.github.io/2025/12/23/static-allocation-compi..., it's essentially a complementary article to what @MatthiasPortzel says here: https://news.ycombinator.com/item?id=46423691

On the other hand, the more empirical, though qualitative, claim made by matklad in the sibling comment may have something to it.
[1]: In fact, take any C program with UB, compile it, and get a dangerous executable. Now disassemble the executable, and you get an equally dangerous program, yet it doesn't have any UB. UB is problematic, of course, partly because at least in C and C++ it can be hard to spot, but it doesn't, in itself, necessarily make a bug more dangerous. If you look at MITRE's top 25 most dangerous software weaknesses, the top four (in the 2025 list) aren't related to UB in any language (by the way, UAF is #7).
FWIW, I don't find this argument logically sound, in context. This is data aggregated across programming languages, so it could simultaneously be true that, conditioned on using a memory-unsafe language, you should worry mostly about UB, while, at the same time, UB doesn't matter much in the grand scheme of things, because hardly anyone is using memory-unsafe programming languages.
There were reports from Apple, Google, Microsoft and Mozilla about vulnerabilities in browsers/OS (so, C++ stuff), and I think there UB hovered between 50% and 80% of all security issues?
And the present discussion does seem overall conditioned on using a manually-memory-managed language :0)
"For fairly pragmatic reasons, then, our coding rules primarily target C and attempt to optimize our ability to more thoroughly check the reliability of critical applications written in C."
A version of this document targeting, say, Ada would look quite different.
Doesn't reusing memory effectively allow for use-after-free, only at the program level (even with a borrow checker)?
I would say the main effect here is that a global allocator often leads to ad-hoc, "shotgun" resource management all over the place, and that's hard to get right in a manually memory-managed language. Most Zig code that deals with allocators has resource management bugs (including TigerBeetle's own code at times! Shoutout to https://github.com/radarroark/xit as the only code base I've seen so far where finding such a bug wasn't trivial). E.g., in OP, memory is leaked on allocation failures.
But if you manage resources manually, you just can't do that: you are forced to centralize the codepaths that deal with resource acquisition and release, and that drastically reduces the amount of bug-prone code. You _could_ apply the same philosophy to allocating code, but static allocation _forces_ you to do it.
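As a toy illustration of that centralization (made-up code, not TigerBeetle's): a statically sized block pool where everything is allocated once in `init`, and the only way to obtain or return a block is `acquire`/`release`, which is also where the invariants are asserted:

```zig
const std = @import("std");
const assert = std.debug.assert;

fn BlockPool(comptime block_size: usize) type {
    return struct {
        const Pool = @This();

        blocks: [][block_size]u8,
        free: []u32, // stack of free block indices
        free_count: u32,

        fn init(gpa: std.mem.Allocator, block_count: u32) !Pool {
            const blocks = try gpa.alloc([block_size]u8, block_count);
            errdefer gpa.free(blocks);
            const free = try gpa.alloc(u32, block_count);
            for (free, 0..) |*slot, index| slot.* = @intCast(index);
            return .{ .blocks = blocks, .free = free, .free_count = block_count };
        }

        fn acquire(pool: *Pool) ?*[block_size]u8 {
            if (pool.free_count == 0) return null; // at the limit, by design
            pool.free_count -= 1;
            return &pool.blocks[pool.free[pool.free_count]];
        }

        fn release(pool: *Pool, block: *[block_size]u8) void {
            const index = (@intFromPtr(block) - @intFromPtr(pool.blocks.ptr)) / block_size;
            assert(index < pool.blocks.len); // the block must come from this pool
            assert(pool.free_count < pool.free.len); // no release past capacity
            pool.free[pool.free_count] = @intCast(index);
            pool.free_count += 1;
        }
    };
}
```

Compared to calling the allocator at every site that needs a block, running out of blocks becomes an explicit, assertable condition at one choke point rather than an OOM surfacing somewhere random.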
The secondary effect is that you tend to think about resources more explicitly, and to assert application-level invariants more proactively. A good example here would be the compaction code, which juggles a bunch of blocks, and each block's lifetime is tracked both externally:
* https://github.com/tigerbeetle/tigerbeetle/blob/0baa07d3bee7...
and internally:
* https://github.com/tigerbeetle/tigerbeetle/blob/0baa07d3bee7...
with a bunch of assertions all over the place to triple-check that each block is accounted for and is where it is expected to be:
https://github.com/tigerbeetle/tigerbeetle/blob/0baa07d3bee7...
I see a weak connection with proofs here. When you are coding with static resources, you generally have to make informal "proofs" that you actually have the resource you are planning to use, and these proofs are materialized as a web of interlocking asserts, and the web works only when it is correct as a whole. With global allocation, you can always materialize fresh resources out of thin air, so nothing forces you to build such a web of proofs.
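A toy version of that web of interlocking asserts (hypothetical names, not the actual compaction code): the block's stage is tracked both inside the block and by its current owner, and every transition asserts that the two views agree:

```zig
const std = @import("std");
const assert = std.debug.assert;

const Stage = enum { free, reading, writing };

const Block = struct {
    stage: Stage = .free, // internal view
    data: [4096]u8 = undefined,
};

const Compaction = struct {
    // External view: which block this compaction currently owns, and why.
    reading: ?*Block = null,

    fn startRead(compaction: *Compaction, block: *Block) void {
        // Both views must say the block is idle before we take it.
        assert(block.stage == .free);
        assert(compaction.reading == null);
        block.stage = .reading;
        compaction.reading = block;
    }

    fn finishRead(compaction: *Compaction) *Block {
        const block = compaction.reading.?;
        assert(block.stage == .reading); // the two views still agree
        block.stage = .free;
        compaction.reading = null;
        return block;
    }
};
```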
To more explicitly set the context here: the fact that this works for TigerBeetle of course doesn't mean that it generalizes, _but_ the fact that we had a disproportionate amount of bugs in the small amount of gpa-using code we have makes me think that there's something more here than just TB's house style.
But why? If you do that you are just taking memory away from other processes. Is there any significant speed improvement over just dynamic allocation?
- Operational predictability --- latencies stay put, the risk of thrashing is reduced (_other_ applications on the box can still misbehave, but you are probably using a dedicated box for a key database)
- Forcing function to avoid use-after-free. Zig doesn't have a borrow checker, so you need something else in its place. Static allocation is a large part of TigerBeetle's something else.
- Forcing function to ensure the existence of application-level limits. This is tricky to explain, but static allocation is a _consequence_ of everything else being limited. And having everything limited helps ensure smooth operations when the load approaches the deployment limit (see the sketch after this list).
- Code simplification. Surprisingly, static allocation is just easier than dynamic. It has the same "anti-soup-of-pointers" property as Rust's borrow checker.
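A small sketch of the "limits" point, with illustrative names rather than TigerBeetle's actual config: once every buffer and queue has an explicit limit, the total memory is a startup-time formula, and you allocate it exactly once:

```zig
const std = @import("std");

const Config = struct {
    connections_max: u32,
    messages_max: u32,
    message_size_max: u32,
};

// Total memory is a pure function of the configured limits.
fn memoryRequired(config: Config) u64 {
    const recv_send = 2 * @as(u64, config.connections_max) * config.message_size_max;
    const message_pool = @as(u64, config.messages_max) * config.message_size_max;
    return recv_send + message_pool;
}

fn allocateEverything(gpa: std.mem.Allocator, config: Config) ![]u8 {
    // One allocation at startup; after this, the allocator is not touched again.
    const size: usize = @intCast(memoryRequired(config));
    return gpa.alloc(u8, size);
}
```

If the formula exceeds what the machine has, you find out at startup rather than under peak load.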
It's baffling that a technique known for 30+ years in the industry has been repackaged into "tiger style" or whatever guru-esque thing this is.
> NASA's Power of Ten — Rules for Developing Safety Critical Code will change the way you code forever.

To expand:
* https://github.com/tigerbeetle/tigerbeetle/blob/main/docs/TI...
=> https://tigerbeetle.com/blog/2022-10-12-a-database-without-d...
It's always bad to use O(N) memory if you don't have to. With a FS-backed database, you don't have to. (Whether you're using static allocation or not. I work on a Ruby web-app, and we avoid loading N records into memory at once, using fixed-sized batches instead.) Doing allocation up front is just a very nice way of ensuring you've thought about those limits, and making sure you don't slip up, and avoiding the runtime cost of allocations.
This is totally different from OP's situation, where they're implementing an in-memory database. This means that 1) they've had to impose a limit on the number of kv-pairs they store, and 2) they're paying the cost for all kv-pairs at startup. This is only acceptable if you know you have a fixed upper bound on the number of kv-pairs to store.
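A minimal sketch of that trade-off, with made-up names: capacity is a hard, configured limit, paid for at startup, and inserts past it fail explicitly instead of allocating:

```zig
const std = @import("std");

const Entry = struct { key: u64, value: u64 };

const Store = struct {
    entries: []Entry,
    len: usize = 0,

    // Pay for all kv-pairs up front...
    fn init(gpa: std.mem.Allocator, kv_pairs_max: usize) !Store {
        return .{ .entries = try gpa.alloc(Entry, kv_pairs_max) };
    }

    // ...and refuse, rather than allocate, once the limit is reached.
    fn put(store: *Store, entry: Entry) error{AtCapacity}!void {
        if (store.len == store.entries.len) return error.AtCapacity;
        store.entries[store.len] = entry;
        store.len += 1;
    }
};
```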
As a tiny nit, TigerBeetle isn't a _file system_-backed database; we intentionally limit ourselves to a single "file", and can work with a raw block device or partition, without file system involvement.
I am not aware of any "serious" state machine other than the accounting one, though.