Ointers: A library for representing pointers where bits have been stolen (2021)

Stealing bits from pointers makes sense sometimes, but in most cases an easier and less error-prone way to increase memory efficiency is to just use indices or relative pointers. Instead of shaving individual bits of a pointer take advantage of the fact that most pointers in a program are almost exactly alike.

Instead of storing a 64 bit prev and next pointer for an item in a linked list just store e.g. two 16 bit offsets. Then the next_ptr = (this_ptr & base_mask) + offset * alignment.

Most of the time wasting memory by storing full 64bit pointers is no big deal. But when it is, you can go a lot further than just shaving off a few bits.

If you want to be extra clever you can encode values in the base address of your pointers by taking advantage of how virtual memory is allocated. Every process has its own virtual memory pages and you can request memory pages at specific address ranges. For instance if you align all your memory at a 32gb boundary then you can use 32bit pointers + a base pointer without even having to mask. You can also use virtual memory allocation tricks to convey information about alignment, typing, whether a pointer is allowed to be shared between threads and so on. And again, without having to butcher the pointer itself.

jsheard · 2 years ago

Considering how common it is to use indices in lieu of pointers in Rust, it would be nice if it had native NonMax integer types similar to the NonZero types it already has. NonZero gives you the niche-filling optimization (e.g. Option<NonZero> and NonZero are the same size) but with indices you want zero to be a valid value, so it would be natural to use the maximum as the illegal value instead. You can kind of do it by offsetting the index by 1 or inverting the bits then putting it in a NonZero but that's a small extra runtime cost on every use.

joseluis · 2 years ago

You can do a handy struct wrapper over a private Nonzero that xors the given value with the prohibited value (max in this case) at the new constructor. And like so for the get method. Xoring is very cheap. That's my favorite way of storing indices/links for nodes, since you can wrap them in Option for free.

dkjaudyeqooe · 2 years ago

That's really not the application for this, it's tags, which are endlessly useful or characterizing the pointer. For instance you can use the MSBs to establish the type of whats pointed to, or for reference counting. I use the LSB to indicate if the value is a pointer or an unsigned integer, whereby you can store both an (integer) key to a record in a database, or a pointer to the record data in memory. You could use another bit in the pointer to indicate that the record is dirty.

Using unused bits in pointers can make you code more efficient and simpler.

westurner · 2 years ago

Can unused bits be used for signed pointers or CRCs?

zozbot234 · 2 years ago

This requires using arenas extensively in your program or perhaps playing tricks with virtual memory, because your system allocator will give you arbitrary pointers by default and you'll often want to point to an object from a different allocation.

gizmo · 2 years ago

Right. Which is why I favor doing your own memory management entirely (which has some huge advantages) or not worrying about memory at all and trusting a garbage collector. I don't think there are many situations left where memory management is too important to leave to a gc but not important enough to do from scratch.

gpderetta · 2 years ago

Definitely indices are a great option, but then you need a base pointer and a way to allocate off of it. That can add significant complexity. So it is all a tradeoff.

Are there ways on Linux to make sure that all user-visible pointers in a process are allocated within some subset of the overall address space? For instance, imagine writing a 64-bit program such that all userspace pointers are allocated within the first 32 bits. This would allow for representing them in memory using four bytes, as is commonly done on 32-bit architectures--without adding more extensive or ad-hoc changes such as a new "x32" binary format. Except that you could also limit to 24-bit (giving 16M of address space), 16-bit (for a "tiny" code model) or anything else. Of course, this would require some amount of support in the linker/loader, userspace libraries and the kernel itself. But the effort might be worthwhile if it can allow for replacing more ad-hoc approaches.

ksherlock · 2 years ago

Linux mmap (since 2.6) has a MAP_32BIT flag, but only for x86-64:

Put the mapping into the first 2 Gigabytes of the process address space. This flag is supported only on x86-64, for 64-bit programs. It was added to allow thread stacks to be allocated somewhere in the first 2 GB of memory, so as to improve context-switch performance on some early 64-bit processors. Modern x86-64 processors no longer have this performance problem, so use of this flag is not required on those systems. The MAP_32BIT flag is ignored when MAP_FIXED is set.

fathyb · 2 years ago

An allocator is free to only `mmap` addresses that are within a range. I think jemalloc could allow you to do that using custom arenas.

o11c · 2 years ago

ASAN relies on kernel internal details, but this has broken a few times.

Realistically you should just mmap a huge aligned chunk in the first place, then allocate inside that.

pavlov · 2 years ago

Technically it’s an ointer on PowerPC and SPARC, but a pointe everywhere else.

sojuz151 · 2 years ago

This look like something that debuggers would hate, because following pointers would be broken

jxf · 2 years ago

I was just thinking this. You'd either need to special-case your debuggers or be able to use something that provided bespoke hooks for pointer-following.

Wrap it in a class and teach the debugger to interpret it. It might be possible with gdb and python pretty-printers for example.

malkia · 2 years ago

I've posted about MGS PS1 port we did to PC and how some pointers to the C4 bomb planted posted meant it was on the wall or on the ground - here ->

https://news.ycombinator.com/item?id=9739731

to quote me:

- The game used a tricky pointer hack, basically on the PSX accessing a pointer with the highest-24-bit set means read it from the CPU cache, otherwise not (or maybe the other way around). This was used for example to indicate whether the C4 bomb was planted on the ground, or on the wall instead of keeping a booblean/bit flag for it. Possibly was used for some more things. Since it was 24-bit, that meant 16mb.

willcipriano · 2 years ago

I never had the problem of too many pointers, but if I did I'd try the "arena"[0] approach.

Preallocate a huge area of memory and use smaller indexes relative to the start instead. Provided a pointer is bigger than a int. You might be able to use this in that approach and keep partial pointers instead.

[0]https://www.rfleury.com/p/untangling-lifetimes-the-arena-all...

gumby · 2 years ago

> What do you call a pointer with the high bits stolen? An ointer!

What is the name a pointer with the low bits stolen? Machines with strict word addressing will ignore the bottom two bits so you can safely store what you want there.

Not sure I've used, much less programmed a machine with that strict constraint in a while though. These days CPUs are all designed to implement the C abstract machine. At least the GPU folks are not subject to this constraint.

asb · 2 years ago

I somehow hadn't come across this library, but have a whole blog post on the various ways people store data in pointers (and when+why it's safe) https://muxup.com/2023q4/storing-data-in-pointers

makz · 2 years ago

Awesome, been looking for this information for ages already.