Readit News
_vvhw commented on Zig Is Self-Hosted Now, What's Next?   kristoff.it/blog/zig-self... · Posted by u/kristoff_it
rychco · 3 years ago
I always enjoy reading about Zig advancements. I haven't developed anything substantial in Zig yet, but I'm very optimistic about the future of the language.

If you're curious to see a large Zig codebase, two significant projects are Bun [1] and TigerBeetle [2].

[1] https://github.com/oven-sh/bun

[2] https://github.com/tigerbeetledb/tigerbeetle

_vvhw · 3 years ago
Joran from TigerBeetle here!

Awesome to hear that you're excited about Zig.

Thanks for sharing the link to our repo also—would love to take you on a 1-on-1 tour of the codebase sometime if you'd be up for that!

_vvhw commented on Zig Is Self-Hosted Now, What's Next?   kristoff.it/blog/zig-self... · Posted by u/kristoff_it
the_mitsuhiko · 3 years ago
> I haven't noticed miscompilation yet

I found it so shockingly easy to get a miscompilation that it really soured my interest in it. It's good to hear that in practice that appears to be less of a concern.

_vvhw · 3 years ago
Anecdotally again, but I've been coding in Zig since 2020 and have hit I think 2-3 compiler bugs in all that time?

The first was fixed within 24 hours, in fact just before I reported it. The others had clear TODO error messages in the panic, and there were easy enough workarounds.


_vvhw commented on A database without dynamic memory allocation   tigerbeetle.com/blog/a-da... · Posted by u/todsacerdoti
creshal · 3 years ago
parse_addresses is only called once, during init.
_vvhw · 3 years ago
Yes, and we're also planning to add a kill switch to the allocator that we switch on if anything allocates after init().
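The kill-switch idea can be sketched in a few lines of C. This is not TigerBeetle's actual allocator, just a minimal illustration of the pattern: once startup completes, a flag is flipped so that any later allocation trips an assertion instead of silently growing the heap (all names here are hypothetical).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Once init is complete, flip the switch: any further allocation
 * is a bug and should fail loudly in tests rather than slowly in
 * production. */
static bool allocations_frozen = false;

void *checked_alloc(size_t size) {
    assert(!allocations_frozen && "allocation after init() is forbidden");
    return malloc(size);
}

void freeze_allocator(void) {
    allocations_frozen = true;
}
```

In a debug or simulation build the assertion aborts immediately, which is exactly the point: post-init allocation is treated as a logic error, not a runtime condition to handle.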
_vvhw commented on A database without dynamic memory allocation   tigerbeetle.com/blog/a-da... · Posted by u/todsacerdoti
Twisol · 3 years ago
I think the grandparent was saying that dynamic allocation is a form of optimization, which also makes the code harder to follow. Your anecdote seems exactly in line with their suggestion.
_vvhw · 3 years ago
Ah, missed that, thanks! I've updated the comment.
_vvhw commented on A database without dynamic memory allocation   tigerbeetle.com/blog/a-da... · Posted by u/todsacerdoti
throwaway09223 · 3 years ago
It's definitely good practice in production and is often necessary.

The techniques mentioned above will (perhaps surprisingly) not eliminate errors related to OOM, due to the nature of virtual memory. Your program can OOM at runtime even if you malloc all your resources up front, because allocating address space is not the same thing as allocating memory. In fact, memory can be deallocated (swapped out), and then your application may OOM when it tries to access memory it has previously used successfully.

Without looking, I can confidently say that tigerbeetle does in fact dynamically allocate memory -- even if it does not call malloc at runtime.

_vvhw · 3 years ago
We're aware of this, in fact, and do have a plan to address virtual memory. To be fair, it's really the kernel being dynamic here, not TigerBeetle.
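The virtual-memory caveat above can be made concrete. On Linux, which overcommits by default, malloc() hands back address space rather than committed RAM, so a program can be OOM-killed long after its allocations "succeeded". One common mitigation (a sketch, not necessarily TigerBeetle's mechanism) is to touch every page at startup, forcing the kernel to back the allocation immediately; pairing this with mlockall(MCL_CURRENT | MCL_FUTURE) would additionally prevent swapping, but that needs privileges, so it is omitted here.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Allocate and immediately fault in every page, so an out-of-memory
 * failure surfaces at startup rather than mid-operation. */
void *alloc_committed(size_t size) {
    void *p = malloc(size);
    if (p == NULL) return NULL;
    memset(p, 0, size); /* writing touches each page, committing it */
    return p;
}
```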
_vvhw commented on A database without dynamic memory allocation   tigerbeetle.com/blog/a-da... · Posted by u/todsacerdoti
com2kid · 3 years ago
> Allocating all required memory during startup has been idiomatic for database architectures as long as I can remember. The obvious benefits to doing things this way are wide-ranging.

Allocating all required memory is, IMHO, a great practice for almost any type of program.

Heap memory allocators are a type of performance optimization: they allow re-use of existing memory, but only if you are careful not to run out of it. Heaps also allow for more efficient utilization of memory; if some software module isn't using a bit of RAM right now, another bit of code can use it.

But like all performance optimizations, they make code messier, less reliable, and harder to follow.

In the embedded world, code statically allocates all the memory it needs up front. This means that code reading from the external world has a fixed buffer size, and likely a related maximum throughput, regardless of system conditions (e.g. lots of free memory lying around).

But, this is also a good thing! It forces programmers to think about their limits up front, and it also forces thinking about error handling when those limits are broken!

Sure, it can require creative coding: loading large blobs of binary data takes more work when you stop assuming malloc can return infinite memory. But, hot take here, programs should stop assuming malloc can return infinite memory anyway.
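The embedded pattern described above can be sketched in a few lines of C (names and sizes are illustrative): the buffer size is a design decision made up front, and the over-limit case becomes an explicit, testable error path instead of an unbounded reallocation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define RX_BUFFER_SIZE 512 /* the limit is chosen at design time */

static unsigned char rx_buffer[RX_BUFFER_SIZE];

/* Returns false when the limit is broken; the caller is forced to
 * decide what happens then, rather than relying on malloc. */
bool rx_store(const unsigned char *data, size_t len) {
    if (len > RX_BUFFER_SIZE) return false;
    memcpy(rx_buffer, data, len);
    return true;
}
```

Note that the failure branch exists in the source and so must be handled and tested, which is exactly the "forces thinking about error handling" benefit described above.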

_vvhw · 3 years ago
Static allocation has also made TigerBeetle's code cleaner, by eliminating branching at call sites where before a message might not always have been available. With static allocation, there's no branch because a message is always guaranteed to be available.

It's also made TigerBeetle's code more reliable, because tests can assert that limits are never exceeded. This has detected rare leaks that might otherwise have only been detected in production.
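Both points above, no branch at the call site and leak detection via assertions, can be illustrated with a deliberately simplified LIFO pool in C (this is a sketch of the pattern, not TigerBeetle's actual message pool): because the pool is sized for the worst case, acquisition is guaranteed to succeed, and the assertions give tests a hard limit to check.

```c
#include <assert.h>
#include <stddef.h>

#define MESSAGE_COUNT_MAX 32 /* sized for the worst case up front */

typedef struct { unsigned char bytes[256]; } Message;

static Message pool[MESSAGE_COUNT_MAX];
static size_t acquired = 0;

/* Never returns NULL: the static limit guarantees availability, so
 * call sites need no "out of messages" branch. */
Message *message_acquire(void) {
    assert(acquired < MESSAGE_COUNT_MAX);
    return &pool[acquired++];
}

void message_release(Message *m) {
    assert(acquired > 0); /* a double release, or a leak elsewhere */
    (void)m;
    acquired--;
}
```

A test that runs a workload and then asserts `acquired == 0` catches exactly the rare leaks mentioned above, at test time rather than in production.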

_vvhw commented on A database without dynamic memory allocation   tigerbeetle.com/blog/a-da... · Posted by u/todsacerdoti
dktoao · 3 years ago
"Clever" developer limitations like this almost invariably cause developers to come up with even more convoluted solutions than needed. I don't disagree that long functions can be a sign that someone was copy-pasting a bunch of stuff around the codebase, but not always!
_vvhw · 3 years ago
Joran from the TigerBeetle team here.

The limit of 70 lines is actually a slight increase beyond the 60 line limit imposed by NASA's Power of Ten Rules for Safety Critical Software.

In my experience, almost every time we've refactored an overlong function, the result has been safer.

_vvhw commented on A database without dynamic memory allocation   tigerbeetle.com/blog/a-da... · Posted by u/todsacerdoti
the_mitsuhiko · 3 years ago
So you only reuse memory for objects with a checksum? Buffer bleed is scary if exploitable (see heartbleed) and I’m curious how you are protecting against it in practice.
_vvhw · 3 years ago
It's defense-in-depth.

We use what we have available, according to the context: checksums, assertions, hash chains. You can't always use every technique. But anything that can possibly be verified online, we do.

Buffer bleeds also terrify me. In fact, I worked on static analysis tooling to detect zero day buffer bleed exploits in the Zip file format [1].

However, to be clear, the heart of a bleed is a logic error, and therefore even memory-safe languages such as JavaScript can be vulnerable.

[1] https://news.ycombinator.com/item?id=31852389
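The "logic error" point is worth making concrete. Heartbleed's core mistake was trusting a length field claimed by the peer; the fix is a pure bounds check of the claim against the bytes actually received, which is why memory safety alone doesn't eliminate the class. A minimal sketch in C (names are illustrative, not from any real codebase):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Echo back a payload of claimed_len bytes from a received packet.
 * The bounds check on the *claimed* length is the whole defense:
 * without it, memcpy would read past the received data and bleed
 * whatever the reused buffer held before. */
bool echo_payload(const unsigned char *packet, size_t packet_len,
                  size_t claimed_len, unsigned char *out) {
    if (claimed_len > packet_len) return false; /* reject the lie */
    memcpy(out, packet, claimed_len);
    return true;
}
```

In a memory-safe language the same missing check typically turns into a crash or a slice of stale in-bounds buffer rather than a raw memory read, which is safer but can still leak data, hence defense-in-depth.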
