Stealing bits from pointers makes sense sometimes, but in most cases an easier and less error-prone way to increase memory efficiency is to just use indices or relative pointers. Instead of shaving individual bits of a pointer take advantage of the fact that most pointers in a program are almost exactly alike.
Instead of storing a 64 bit prev and next pointer for an item in a linked list just store e.g. two 16 bit offsets. Then the next_ptr = (this_ptr & base_mask) + offset * alignment.
Most of the time wasting memory by storing full 64bit pointers is no big deal. But when it is, you can go a lot further than just shaving off a few bits.
If you want to be extra clever you can encode values in the base address of your pointers by taking advantage of how virtual memory is allocated. Every process has its own virtual memory pages and you can request memory pages at specific address ranges. For instance if you align all your memory at a 32gb boundary then you can use 32bit pointers + a base pointer without even having to mask. You can also use virtual memory allocation tricks to convey information about alignment, typing, whether a pointer is allowed to be shared between threads and so on. And again, without having to butcher the pointer itself.
Considering how common it is to use indices in lieu of pointers in Rust, it would be nice if it had native NonMax integer types similar to the NonZero types it already has. NonZero gives you the niche-filling optimization (e.g. Option<NonZero> and NonZero are the same size) but with indices you want zero to be a valid value, so it would be natural to use the maximum as the illegal value instead. You can kind of do it by offsetting the index by 1 or inverting the bits then putting it in a NonZero but that's a small extra runtime cost on every use.
You can do a handy struct wrapper over a private Nonzero that xors the given value with the prohibited value (max in this case) at the new constructor. And like so for the get method. Xoring is very cheap. That's my favorite way of storing indices/links for nodes, since you can wrap them in Option for free.
That's really not the application for this, it's tags, which are endlessly useful or characterizing the pointer. For instance you can use the MSBs to establish the type of whats pointed to, or for reference counting. I use the LSB to indicate if the value is a pointer or an unsigned integer, whereby you can store both an (integer) key to a record in a database, or a pointer to the record data in memory. You could use another bit in the pointer to indicate that the record is dirty.
Using unused bits in pointers can make you code more efficient and simpler.
This requires using arenas extensively in your program or perhaps playing tricks with virtual memory, because your system allocator will give you arbitrary pointers by default and you'll often want to point to an object from a different allocation.
Right. Which is why I favor doing your own memory management entirely (which has some huge advantages) or not worrying about memory at all and trusting a garbage collector. I don't think there are many situations left where memory management is too important to leave to a gc but not important enough to do from scratch.
Definitely indices are a great option, but then you need a base pointer and a way to allocate off of it. That can add significant complexity. So it is all a tradeoff.
I was just thinking this. You'd either need to special-case your debuggers or be able to use something that provided bespoke hooks for pointer-following.
- The game used a tricky pointer hack, basically on the PSX accessing a pointer with the highest-24-bit set means read it from the CPU cache, otherwise not (or maybe the other way around). This was used for example to indicate whether the C4 bomb was planted on the ground, or on the wall instead of keeping a booblean/bit flag for it. Possibly was used for some more things. Since it was 24-bit, that meant 16mb.
Are there ways on Linux to make sure that all user-visible pointers in a process are allocated within some subset of the overall address space? For instance, imagine writing a 64-bit program such that all userspace pointers are allocated within the first 32 bits. This would allow for representing them in memory using four bytes, as is commonly done on 32-bit architectures--without adding more extensive or ad-hoc changes such as a new "x32" binary format. Except that you could also limit to 24-bit (giving 16M of address space), 16-bit (for a "tiny" code model) or anything else. Of course, this would require some amount of support in the linker/loader, userspace libraries and the kernel itself. But the effort might be worthwhile if it can allow for replacing more ad-hoc approaches.
Linux mmap (since 2.6) has a MAP_32BIT flag, but only for x86-64:
Put the mapping into the first 2 Gigabytes of the process address space. This flag is supported only on x86-64, for 64-bit programs. It was added to allow thread stacks to be allocated somewhere in the first 2 GB of memory, so as to improve context-switch performance on some early 64-bit processors. Modern x86-64 processors no longer have this performance problem, so use of this flag is not required on those systems. The MAP_32BIT flag is ignored when MAP_FIXED is set.
I never had the problem of too many pointers, but if I did I'd try the "arena"[0] approach.
Preallocate a huge area of memory and use smaller indexes relative to the start instead. Provided a pointer is bigger than a int. You might be able to use this in that approach and keep partial pointers instead.
> What do you call a pointer with the high bits stolen? An ointer!
What is the name a pointer with the low bits stolen? Machines with strict word addressing will ignore the bottom two bits so you can safely store what you want there.
Not sure I've used, much less programmed a machine with that strict constraint in a while though. These days CPUs are all designed to implement the C abstract machine. At least the GPU folks are not subject to this constraint.
Instead of storing a 64 bit prev and next pointer for an item in a linked list just store e.g. two 16 bit offsets. Then the next_ptr = (this_ptr & base_mask) + offset * alignment.
Most of the time wasting memory by storing full 64bit pointers is no big deal. But when it is, you can go a lot further than just shaving off a few bits.
If you want to be extra clever you can encode values in the base address of your pointers by taking advantage of how virtual memory is allocated. Every process has its own virtual memory pages and you can request memory pages at specific address ranges. For instance if you align all your memory at a 32gb boundary then you can use 32bit pointers + a base pointer without even having to mask. You can also use virtual memory allocation tricks to convey information about alignment, typing, whether a pointer is allowed to be shared between threads and so on. And again, without having to butcher the pointer itself.
Using unused bits in pointers can make you code more efficient and simpler.
https://news.ycombinator.com/item?id=9739731
to quote me:
- The game used a tricky pointer hack, basically on the PSX accessing a pointer with the highest-24-bit set means read it from the CPU cache, otherwise not (or maybe the other way around). This was used for example to indicate whether the C4 bomb was planted on the ground, or on the wall instead of keeping a booblean/bit flag for it. Possibly was used for some more things. Since it was 24-bit, that meant 16mb.
Put the mapping into the first 2 Gigabytes of the process address space. This flag is supported only on x86-64, for 64-bit programs. It was added to allow thread stacks to be allocated somewhere in the first 2 GB of memory, so as to improve context-switch performance on some early 64-bit processors. Modern x86-64 processors no longer have this performance problem, so use of this flag is not required on those systems. The MAP_32BIT flag is ignored when MAP_FIXED is set.
Realistically you should just mmap a huge aligned chunk in the first place, then allocate inside that.
Preallocate a huge area of memory and use smaller indexes relative to the start instead. Provided a pointer is bigger than a int. You might be able to use this in that approach and keep partial pointers instead.
[0]https://www.rfleury.com/p/untangling-lifetimes-the-arena-all...
What is the name a pointer with the low bits stolen? Machines with strict word addressing will ignore the bottom two bits so you can safely store what you want there.
Not sure I've used, much less programmed a machine with that strict constraint in a while though. These days CPUs are all designed to implement the C abstract machine. At least the GPU folks are not subject to this constraint.