Zig's new LinkedList API (it's time to learn fieldParentPtr)

There are several advantages to intrusive node-based data structures that I haven't seen stated in the article or the comments:

- the same object can move between containers with no allocation and no need for a dedicated complex API

- the same object can be part of multiple containers at once; particularly useful for intrusive binary trees, for indexing data with different criteria.

- the container can be fully polymorphic, no need for all elements to be the same dynamic type.

- no need for complex allocators, you can just store the objects as you see fit.

suspended_state · 5 months ago

I think the main advantage of the intrusive definition is that you can use it to implement the non intrusive one easily, whereas the reverse isn't possible.

NobodyNada · 5 months ago

With a non-intrusive linked list type, can't you define the intrusive version as `Node<void>`?

boomlinde · 5 months ago

> the same object can move between containers with no allocation and no need for a dedicated complex API

This is true also of the previous design.

> no need for complex allocators, you can just store the objects as you see fit.

Same here; the previous design leaves it up to the user.

MrCroxx · 5 months ago

Agreed. Intrusive data structures are good for implementing multi-container data structures with lower allocation overhead. And they are widely used than expected. E.g. LRU is a classic multi-container problem.

paraboul · 5 months ago

> the same object can be part of multiple containers at once

I'm not sure I understand this one. Since the object contains the reference to where it belongs inside a container (e.g. object.node.next) how can it be re-used in multiple containers. Conversely, in a non-intrusive data structure, multiple containers can hold a ref to the same object through an intermediate node object

gnubison · 5 months ago

You add multiple next variables. buffer.next, buffer.next_displayed, etc

Zambyte · 5 months ago

The object can contain multiple intrusive node fields.

grandempire · 5 months ago

Is there any generic implementation which is not intrusive? I expect C++ forward_list to look like

struct Node<T> { Node<T> *next; T x; }

_huayra_ · 5 months ago

At least in C++ land, that is not quite what is referred to as intrusive lists. It's basically the opposite / inside-out of what you wrote:

```C++ struct MyType : Node<MyType> { Node<MyType> next, prev; // rest of data }; ```

I usually reach for Boost.Intrusive when I need this [0].

[0] https://www.boost.org/doc/libs/1_88_0/doc/html/intrusive/usa...

rowanG077 · 5 months ago

All of these sound like grievous sources of bugs to me. Sure it's nice that it's possible but please don't do those things unless there isn't a safer way to do it.

Diggsey · 5 months ago

They are definitely more error-prone usage patterns, and coming from Rust the amount of hidden unsafety is quite unsettling (in addition to the obvious potential for user errors, depending on the aliasing model of your language/compiler, which I suspect Zig hasn't figured out yet, this parent pointer trick could even lead to miscompilations...)

Having said that, intrusive linked lists are quite an important data structure in code which is required to be allocation-free. Kernel code, signal handlers and code which requires consistent performance (eg. audio processing) need to be able to move objects between containers or have objects in multiple containers without allocating.

tauoverpi · 5 months ago

You don't need everything to be generic to have safety as the API provided by the object using the linked list can maintain safety by making it difficult to do the wrong thing and being correct by construction. Of course that doesn't stop a user of a library ignoring the safe way to handle types which use intrusive linked list nodes but that same user is likely going to shoot themselves in the foot regardless even when given a "safer" interface.

Trying to write a safer version of the above without losing the performance / advantages takes you into the realm of ATS2 https://www.cs.bu.edu/~hwxi/atslangweb/ which quickly becomes more difficult to work with than practical (and no, rust can't solve this with it's implementation of resource types as far as I'm aware).

So yes, doing the above makes sense when the problem demands it as allocating memory you don't need when there's a clear way to represent a graph without it just wastes cycles on memory management you didn't even need to do.

immibis · 5 months ago

bugs* ironic

It depends on whether you approach programming from a perspective of writing piles of obviously correct abstraction layers (the total effect of which is to leak so much the solution ends up being wrong anyway), or from the direct unabstracted perspective of actually making the computer do what you want it to do (which does not preclude the solution from using abstractions).

I call them the Uncle Bob vs Casey Muratori school, after Casey's lectures such as "Clean Code, Horrible Performance": https://www.youtube.com/watch?v=tD5NrevFtbU

cheezebubba · 5 months ago

How can you have elements of different types? I don't understand how you would know which type a given node was in?

throwawaymaths · 5 months ago

it was maybe easier to understand when @fieldParentPtr took the type as an argument, but if you look at this:

   var x: *T = @fieldParentPtr("node", node_ptr);

@fieldParentPtr knows that its result must have type pointer T; and then the builtin function "looks" inside of T and notices there is a "node" field. Then it can backtrack the pointer of T given the pointer to the node field, since the layout of T is well-defined at compile time.

So you could have a mixed-type linked list, and at the callsite zig can at compile-time back out the parent type. There obviously needs to be outside logic (and an outside mechanism to track) this with branches for each type that you could be putting into the linked list. And yes, this is more than a little bit type-unsafe because of the potential for type erasure.

You could probably pretty trivially write a typesafe wrapper for this, as suggested in one of the comments here.

This feels like a net negative result. It removes some of the complexity of using generics, but it couples between the data type and the collections it can be indexed by.

What are the benefits of this approach? Is it limited to data alignment, or is it out of a greater desire to remove generics?

whizzter · 5 months ago

Intrusive linked lists as this is called removes unnecessary allocations and traversal (main reason why linked lists have such a horrible reputation), say a hypothetical example where you have a temporary listener object installed that listens to broadcasts on channel X and times out after 5 minutes.

Upon connecting the object is connected to the channel X linked list as well as some other list of objects that are killed at the timestamp 5 minutes in the future.

With an intrusive linked list the link-node resides within the object, the only thing needed when installing is linking it into the 2 lists (this is a few move-operations), an external linked list would _require 2 extra allocations_ for the linked-list nodes.

Broadcasting from X is almost the same since it's a simple traversal, albeit with double the cache pressure since the object and the linked-list node probably lives separately.

The real HORROR comes when disconnecting, if channel X has millions of listeners it could become a real bottleneck to traverse the linked list to find the link-node that connects the actual object since there could be tons of jumping around memory. An intrusive linked list would just disconnect itself if it's doubly-linked.

This is why hashsets/tables,vectors/arraylists,etc are often used in practice (because many "modern" OO languages never added good support for the machinery needed here) making linked lists look quite bad (there is other badness but using non-intrusive linked lists is almost always worse than using something else than a linked list altogether).

Deleted Comment

steventhedev · 5 months ago

The generic version in TFA puts the data type allocated alongside the next pointer - no additional allocation needed. The only functional difference is if the zig compiler is not sufficiently advanced to understand it can reorder the fields (hence the alignment question).

The removal scenario is merely specifying that you are passing in ConnectionListNode instead of a Connection. Although maybe it's a good idea to think about how they compose comparatively.

flohofwoe · 5 months ago

It's not at all unusual for intrusive linked lists though?

On AmigaOS (which is entirely built out of intrusive doubly linked list) the list node is placed at the start of a struct, so the pointer to the node is also the pointer to the linked item. There's also no 'coupling' because the list manipulation functions take pointers to the list node structs, but they don't need to know the 'outer' item struct.

Zig's @fieldParentPtr is more flexible since the node struct can be anywhere in the parent struct and you still can get from the node pointer to the item base pointer (and more importantly, it makes it trivial to link the same item into multiple linked lists).

Joker_vD · 5 months ago

It's not unusual at all, it's just... should it be exposed? I personally prefer having "Node struct with next pointer + inlined generic data" design when it can be encapsulated away since, well, it can be encapsulated away, and the result data layout is the same. And when you expose this design, well, you end with all sorts of disasters, including OOP inheritance (no, really: [0]).

[0] https://catern.com/inheritance.html

codr7 · 4 months ago

This approach is also common in C:

https://github.com/codr7/hacktical-c/tree/main/list

One pretty big advantage is you get some help from the compiler making sure you mean what you say. As opposed to just blindly casting items and assuming no one forgot to put the list in their struct. Also the possibility to be part of multiple lists.

GolDDranks · 5 months ago

I think it matches really well for the Zig ethos of "artisanal, minimal code". More than often, a type is strongly tied to how it is being used in the code base. In that sense, having it be tightly coupled to a data structure isn't much of a problem. The opposite isn't true, and the data structure is independent of the embedding parent data. The implementation of that data structure itself can still be presented as a library, and the "generic" parts carefully tested.

reissbaker · 5 months ago

Zig has no desire to remove comptime AFAIK (comptime being the larger Zig feature by which people implement generics in the language) — comptime is pretty much the reason to use Zig.

The benefits of intrusive linked lists are higher performance; you can use comptime to still have decoupled code.

surajrmal · 5 months ago

The generic approach should be the same performance. This approach just lets you place your data in multiple lists without needing multiple allocations.

kevin_thibedeau · 5 months ago

C++20 has consteval to do the same thing.

Someone · 5 months ago

The only logical explanation I can see is that these are two decisions:

- linked lists aren’t useful on modern systems because traversing them causes to many cache misses. Therefore, we shouldn’t provide such a simple implementation.

- but we need one in low level OS code, and there, intrusive lists are preferred. Their limitation that you cannot store an object in multiple lists isn’t a problem, and the extra speed and the fact that you can move items between lists without allocations is desired.

I don’t see why the intrusive implementation has to be so basic, though. Can’t you, in Zig, express that a generic type T used in a generic list class has to have a nullable next field that points to a T?

throwawaymaths · 4 months ago

> linked lists aren’t useful on modern systems because traversing them causes to many cache misses

Only if you "default to using" (and only use) malloc. Zig encourages you to use different allocators within the same program for different use cases, including, potentially allocators which will thrash the cache far less (for example thread local arenas).

Deleted Comment

yccs27 · 5 months ago

You could create a comptime function that adds the Node field to any type. I guess then you've arrived back at the previous generic version.

pub const SinglyLinkedList(comptime T: type) type { return struct { first: ?*Node = null, pub const Node = struct { next: ?*Node = null, }; const Self = @This(); pub fn data(self: Self) *T { return @fieldParentPtr("node", self); } };

mgaunard · 5 months ago

mastax · 5 months ago

> The new version isn't generic. Rather, you embed the linked list node with your data. This is known as an intrusive linked list and tends to perform better and require fewer allocations.

I don’t get this. In the generic version the data is embedded with the linked list node, so there’s only one allocation per node. ‘T’ isn’t a pointer (or doesn’t have to be) there’s no indirection between data and node.

mppm · 5 months ago

It would probably have been more correct to say "requires fewer allocations in some cases". As you point out, in terms of layout, the old generic version is just as intrusive as the new version, and requires just as many allocations (one). However, the new version gives extra flexibility for you to move a long-lived object in and out of lists without copying or allocating, at the cost of having the pointer field baked into it permanently, rather than on demand.

I think the reasoning behind this change is that (from Zig's perspective), if you are using linked lists you are probably doing something wrong, unless your application requires the above-mentioned kind of juggling, which favors explicitly intrusive LLs. In addition, this change de-generifys the lists's methods, like `prepend`, which reduces code size a little.

At least that's my understanding.

DarkWiiPlayer · 5 months ago

> However, the new version gives extra flexibility for you to move a long-lived object in and out of lists without copying or allocating

You could also do this with the entire node in the case of a generic implementation though, the API just needs to expose the node struct to you and allow you to detach it; but the same is true for this new implementation as well.

In terms of memory, a detached node that embeds the payload struct isn't different from an object with an embedded detached node.

What changes is that now, if you have an object class that you don't want to (or can't) extend to include a list node, you have to wrap it in a container struct that, again, looks the same in memory but now has a node and your object as its members. I'm not sure if this is really much of an improvement at the end of the day.

Also, correct me if I'm wrong (I don't do zig), but shouldn't it be possible to create a list of void elements from the generic implementation and embed its node type inside your object, then proceed from there as if it were the new implementation?

torginus · 5 months ago

I remember reading an article about this technique - it was used in the original Starcraft. The idea here, is that the object needs to be allocated anyway by someone else, so you get the linked list for free.

An object can also be part of multiple lists, each used by a different subsystem with its own concerns.

Deleting an item involves unlinking it from all lists (though that requires a doubly-linked list)

Edit: found the article

https://www.codeofhonor.com/blog/tough-times-on-the-road-to-...

jamiejquinn · 4 months ago

AFAIK it's also a structure heavily used by Nethack 3 to manage its astounding complexity. Modern, complex roguelikes seem to be using ECS as an alternative but I think they still have their uses.

ajross · 5 months ago

> I don’t get this. In the generic version the data is embedded with the linked list node, so there’s only one allocation per node.

The reason is that the object being placed "in" the list can have a longer lifetime than the list node, in fact that's generally the case. Like, I work on Zephyr and in Zephyr we have "threads", whose tracking structs are managed by the application code. But they can be in kernel-maintained lists like a scheduler queue, which is an ephemeral state that the essentially-perpetual thread will long outlive. There's no place to allocate a "generic" node at list insert time.

(Also, more importantly, Zephyr is an RTOS and disallows heap use in the core parts of the system to permit use in environments that need to disallow dynamic allocation. But that's not really relevant to a generics discussion.)

But you can trivially put a scheduler queue node right into the thread object, knowing that it's a behavior of the type. The semantics become different though: you only get the one node, it's not possible to have multiple "lists of threads" using this technique unless you know exactly what each list is for ahead of time.

CBLT · 5 months ago

> only one allocation per node

I believe the implication is there's fewer than one allocation per node with the new API. You allocate contiguous memory once, and use it to store n elements.

flowerthoughts · 5 months ago

No, that's not stated.

> The new version isn't generic. Rather, you embed the linked list node with your data. This is known as an intrusive linked list and tends to perform better and require fewer allocations. Except in trivial examples, the data that we store in a linked list is typically stored on the heap. Because an intrusive linked list has the linked list node embedded in the data, it doesn't need its own allocation.

Is simply a misunderstanding. The storage layout will be the same for the generic and the intrusive one.

The benefit of intrusive linked lists is that each node can be a member of several linked lists with a single allocation per node. This is used heavily e.g. in the Linux kernel.

The cost is that you need to keep track of the offset from object address to which linked list pointer you're currently dealing with. That's often a compile time constant, but makes the API more awkward. In this case it seems to be the string "node", which seems nice enough. C libraries often use the preprocessor to do something similar.

    new Node<TStruct>[16];

    new TStructContainingNode[16];

What’s the difference?

lightingthedark · 5 months ago

Can someone explain how the claim of higher performance works here? In C, which lacks generics, an intrusive list is preferred because otherwise you end up with each node having a void pointer to the data it holds. The previous Zig version was a generic, so the data type could easily be a value type. Given that the compiler is allowed to rearrange the layout of both structs unless data is packed or extern (in which case it almost certainly won't want to have a linked list node in the middle anyway) isn't the resulting type exactly the same in memory unless you intentionally made T a pointer type?

ashdnazg · 5 months ago

I don't understand the higher performance either. What I know as the significant advantage is that you can have one object in multiple lists.

messe · 5 months ago

> What I know as the significant advantage is that you can have one object in multiple lists.

Another advantage is smaller code size, as the compiler doesn't need to generate code for SinglyLinkedList(i32), SinglyLinkedList(User), and SinglyLinkedList(PointCloud). This could have a performance impact by making it more likely that code remains in the cache.

Technically that should be possible the other way by using a Node<T> so the type of the second list ends up being a Node<Node<T>> but I can see why an intrusive list would be preferred to that, and also the linked list API might prevent that pattern.

Usually if I have multiple lists holding something I have one that's the 'owner' and then the secondary data structures would have a non-owning pointer to it. Is that the case where the performance would be better with an intrusive list? My intuition would be that having multiple Node members would pollute the cache and not actually be a performance win but maybe it is still better off because it's all colocated? Seems like the kind of thing I'd have to benchmark to know since it would depend on the number of lists and the size of the actual data.

grayhatter · 5 months ago

> Can someone explain how the claim of higher performance works here? In C, which lacks generics, an intrus

I can only give a handwavey answer because I've yet to see any data, and if an engineer tells you something is better but doesn't provide any data, they're not a very good engineer. So grain of salt and all that. But the answer I got was because cache performance. Writing code this way your CPU will spend less time waiting for main memory, and the branch predictor will have better luck. The argument makes sense, but like I said,I've yet to see real world data.

> isn't the resulting type exactly the same in memory unless you intentionally made T a pointer type?

Yes and no. If I understand what you mean, the bit layout will be the 'same'. But I think your confusion is more about how what a compiler means by pointer type, and what a human means. If you pull away enough layers of abstraction, the compiler doesn't see *Type it'll only see *anyopaque, phrased completely differently; according to the compiler, all pointers are the same and are exactly memory_address_size() big. *Type doesn't really exist.

Writing it this way, imagine using just the LinkedList type, without a container of any kind. node to node to node, without any data. While it would be pointless, that would (might) be faster to walk that list, right? There's no extra data loads for the whole struct? That's what this is. Using it this way it gets complicated, but translating theory to asm is always messy. Even more messy when you try to account for speculative execution.

> But I think your confusion is more about how what a compiler means by pointer type, and what a human means. If you pull away enough layers of abstraction, the compiler doesn't see *Type it'll only see *anyopaque, phrased completely differently; according to the compiler, all pointers are the same and are exactly memory_address_size() big. *Type doesn't really exist.

Check out the implementation of SinglyLinkedList in the latest release (before the change in the post)[0]. You'll notice the argument for SinglyLinkedList is (comptime T: type), which is used in the internal Node struct for the data field, which is of type T. Notably, the data field is not a *T.

In Zig, when you call the function SinglyLinkedList with the argument i32 (like SinglyLinkedList(i32)) to return a type for a list of integers, the i32 is used in the place of T, and a Node struct that is unique for i32 is defined and used internally in the returned type. Similarly, if you had a struct like Person with fields like name and age, and you created a list of Persons like SinglyLinkedList(Person), a new Node type would be internally defined for the new struct type returned for Person lists. This internal Node struct would instead use Person in place of T. The memory for the Node type used internally in SinglyLinkedList(Person) actually embeds the memory for the Person type, rather than just containing a pointer to a Person.

These types are very much known to the compiler, as the layout of a Node for a SinglyLinkedList(i32) is not the same as the layout of a Node for a SinglyLinkedList(Person), because the argument T is not used as a pointer. Unless, as the gp mentioned, T is explicitly made to be a pointer (like SinglyLinkedList(*Person)).

[0] https://ziglang.org/documentation/0.14.0/std/#src/std/linked...

anarazel · 5 months ago

Intrusive lists are often used to enqueued pre-existing structures onto lists. And often the same object can be in different lists at different times.

That's not realistically dealt with by the compiler re-organizing the struct layout.

ralferoo · 5 months ago

I don't use Zig, but one advantage of having a genericised intrusive linked list where the next pointer doesn't have to be the first thing in the structure is when you want to use larger types, such as 128-bit fields. Sticking a pointer at the beginning would mean the compiler would have to insert alignment padding after the pointer or break the default alignment.

jwmerrill · 5 months ago

The next pointer doesn’t have to go first in the structure here. It can go anywhere, and you can use @fieldParentPtr to go back from a reference to the embedded node to a reference to the structure.

n42 · 5 months ago

I think that's the point the parent was making. I read it as them describing a benefit of Zig's approach.

spiffyk · 5 months ago

For better or worse, the Zig community has long been embracing the `@fieldParentPtr` builtin as the primary way to do things like OOP-ish inheritance, so this change feels pretty in-character for the language to me.

Really? I did a quick grep in ghostty (160k lines of Zig) and found four total references to @fieldParentPtr. I'm still a Zig baby, but I have not seen very many references to @fieldParentPtr at all.

I would expect the inheritance pattern to be quite rare in performance-aware software design overall, since such constructs tend to be rather cache-unfriendly. So I would attribute the low number of references to inheritance being rare, rather than @fieldParentPtr not being used for inheritance.

MawKKe · 5 months ago

Isn't the new one also horribly type-unsafe? Since you can put (parent) objects of any kind into the list without any mechanism to detect which is which?

kajika91 · 5 months ago

Couldn't we have a function to return the data like this?

nh1996 · 5 months ago

This would require the SinglyLinkedList to be generic and would require that the data struct use "node" as the field name. Also, as some comments have pointed out, this type of linked list can be useful when a data struct needs to be in multiple lists, in which case there is no single "node" field.

  const some_data = struct {
    // Some data fields
    // ...
    bar_node: SinglyLinkedList.Node,
    baz_node: SinglyLinkedList.Node,
  };

sure, just make T == void fall back to the existing implementation.