VorpalWay (u/VorpalWay)

VorpalWay commented on Emacs internals: Tagged pointers vs. C++ std:variant and LLVM (Part 3) thecloudlet.github.io/blo... · Posted by u/thecloudlet

uecker · 2 days ago

No, angelic non-determinism is not related to the as-if rule. It essentially says that if there is a choice of which provenance to assign on backconversion from integers, the one which makes the program valid is assigned. This is basically the same as the explicit UDI rule in TS 6010, except that this rule is very clear. The problem with angelic non-determinism is two-fold: a) most people will not be able to reason about it at all, and b) not even formal semantics experts know what it means in complicated cases. Demonic non-determinism essentially means that all possible executions must be valid, while angelic non-determinism means that there must exist at least one valid execution. Formally, this translates to universal and existential quantifiers. But for quantifiers, you must know where and in which order to place them in a formula, which wasn't clear at all from the wording I have seen (a while ago). The interaction with concurrency is also a can of worms.
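As a rough illustration of the quantifier point, here is a toy Rust sketch over a finite set of choices. It is not taken from any of the proposals; `demonic_ok`, `angelic_ok` and the `is_valid` predicate are made-up names for this example only.

```rust
/// Demonic non-determinism: every possible choice must give a valid execution
/// (universal quantifier). Hypothetical helper, for illustration only.
fn demonic_ok<C>(choices: &[C], is_valid: impl Fn(&C) -> bool) -> bool {
    choices.iter().all(|c| is_valid(c))
}

/// Angelic non-determinism: it is enough that at least one choice gives a
/// valid execution (existential quantifier). Hypothetical helper as well.
fn angelic_ok<C>(choices: &[C], is_valid: impl Fn(&C) -> bool) -> bool {
    choices.iter().any(|c| is_valid(c))
}
```

With a single choice point this is simple. Once several choices are nested, the order in which `all` and `any` wrap each other changes the result, which is the ambiguity about quantifier placement described above.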

I don't think there is a fundamental advantage to Rust regarding provenance. Yes, we lack a way to do pointer tagging without exposing the provenance in C, but we could easily add this. But this is all moot as long as compilers are still not conforming to the provenance model with respect to integer and pointer casts anyway, and this breaks Rust too! Rust having decided something just means they live in a fairy tale world, while C/C++ not having decided means they acknowledge the reality that compilers haven't fixed their optimizers. (Even ignoring that "deciding" means entirely different things here anyway with C/C++ having ISO standards.)

VorpalWay · 2 days ago

> But this is all moot as long as compilers are still not conforming to the provenance model with respect to integer and pointer casts anyway, and this breaks Rust too! Rust having decided something just means they live in a fairy tale world, while C/C++ not having decided means they acknowledge the reality that compilers haven't fixed their optimizers.

I think this is a bit of a mischaracterization. While there can of course be bugs in LLVM (and rustc and clang), what sort of LLVM IR you generate matters. To be able to generate IR that conforms to the provenance model of the language you first need to have such a model.

As far as I know (and this matches what I found when searching the Rust issue tracker) there is currently one major known LLVM bug in this area (https://github.com/rust-lang/rust/issues/147538), with partial workarounds applied on the Rust side. There are also some issues with open questions still, such as how certain unstable features should interact with provenance.

I think calling the current situation "fairy tale world" is a gross exaggeration. Is it perfectly free of bugs? No, but if that is the criterion, then the entirety of any compiler is a fairy tale (possibly with the exception of some formally verified compiler).

VorpalWay commented on Hyperlinks in terminal emulators gist.github.com/egmontkob... · Posted by u/nvahalik

VorpalWay · 2 days ago

I have found this really useful together with file:// links. If properly set up, you can even use this to go to a specific file, line and column in your IDE/editor. Very useful with custom lint and debug tooling that I have written for my day job.
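For anyone who wants to try it from their own tooling, a minimal Rust sketch of emitting such a link could look like the following. The helper names `osc8_link` and `file_uri` are made up for this example. How line and column are encoded in the URI depends on your terminal and editor setup, so that part is deliberately left out.

```rust
/// Wraps `text` in an OSC 8 hyperlink escape sequence pointing at `uri`.
/// Layout: ESC ] 8 ; params ; URI ST  text  ESC ] 8 ; ; ST   (ST = ESC \)
/// Hypothetical helper name; the params field is left empty here.
fn osc8_link(uri: &str, text: &str) -> String {
    format!("\x1b]8;;{uri}\x1b\\{text}\x1b]8;;\x1b\\")
}

/// Builds a file:// URI from a host name and an absolute path.
/// Assumes the path needs no percent-encoding (no spaces etc.).
fn file_uri(host: &str, absolute_path: &str) -> String {
    format!("file://{host}{absolute_path}")
}
```

Printing `osc8_link(&file_uri("myhost", "/tmp/demo.rs"), "demo.rs")` in a terminal that supports the feature should show `demo.rs` as a clickable link. `myhost` is a placeholder for the real host name.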

VorpalWay commented on Emacs internals: Tagged pointers vs. C++ std:variant and LLVM (Part 3) thecloudlet.github.io/blo... · Posted by u/thecloudlet

jcranmer · 2 days ago

> But the likely destiny of C++ is to inherit the provenance rules that are an adjunct to C23, PNVI-ae-udi, Provenance Not Via Integers, Addresses Exposed, User Disambiguates

There's a competing proposal in C++ land to add provenance via angelic nondeterminism: if there's some provenance that makes the code non-UB, then use that provenance. (As you might imagine, I'm not a big fan of that proposal, but WG21 seems to love it a lot more than I do.)

VorpalWay · 2 days ago

Very interesting discussion. I hadn't realised that the final provenance model hadn't yet been decided for C and C++.

Angelic non-determinism seems difficult to use to determine if an optimisation is valid. If I understand this correctly, it is basically the as-if rule, but in this case applied to something that potentially needs global program analysis. Would that be an accurate understanding?

It sounds to me like both of these proposals will be strictly less able to optimize than strict provenance in Rust. In particular, Rust allows applying a closure/lambda to map a pointer's address while keeping the provenance. That avoids exposing the provenance as you add and remove tag bits, which should at least in theory allow LLVM to optimise better. (But this keeps the value as a pointer, and having a dangling pointer that you don't access is fine in Rust, probably not in C?)
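To make the closure-based mapping concrete, here is a minimal sketch using `map_addr` and `addr` from the strict provenance API (needs a reasonably recent stable Rust). The helper names `tag` and `untag` and the `TAG_MASK` constant are my own, and the mask assumes the low bits are free because of the alignment of `u64`.

```rust
/// Low address bits that are always zero for a properly aligned `u64`.
/// Hypothetical constant for this sketch.
const TAG_MASK: usize = std::mem::align_of::<u64>() - 1;

/// Stores `tag` in the low bits of the address. The provenance is kept,
/// because the pointer never becomes a plain integer.
fn tag(ptr: *mut u64, tag: usize) -> *mut u64 {
    ptr.map_addr(|addr| addr | (tag & TAG_MASK))
}

/// Splits a tagged pointer back into the usable pointer and its tag.
fn untag(ptr: *mut u64) -> (*mut u64, usize) {
    let tag = ptr.addr() & TAG_MASK;
    (ptr.map_addr(|addr| addr & !TAG_MASK), tag)
}
```

The tagged pointer must not be dereferenced while the tag bits are set; only the pointer returned by `untag` is valid for access.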

I'm not sure why I'm surprised actually, Rust can be a more sensible language in places thanks to hindsight. We see this in being able to use LLVM noalias (restrict basically) in more places thanks to the different aliasing model, while still not having the error prone TBAA of C and C++. And it doesn't need a model of memory consisting of typed objects (rather it is all just bytes and gets materialised into a value of a type on access).

VorpalWay commented on Emacs internals: Tagged pointers vs. C++ std:variant and LLVM (Part 3) thecloudlet.github.io/blo... · Posted by u/thecloudlet

trws · 3 days ago

Everything else in the siblings is true, but remember that the language and std types in rust all do this already. Most of the time it’s better to use a native enum or optional/result because they do this in the compiler/lib. It’s only really worth it if you need more than a few types or need precise control of the representation for C interop or something.

VorpalWay · 3 days ago

To expand on the sibling answer: sort of! Rust will do niche optimisation, but for references and NonNull pointers this is limited to "the value 0 is invalid and can thus be used as a niche". Rust does not (currently) take advantage of alignment niches in pointers. Nor does it use the high bits on architectures where you know your whole theoretical address space isn't actually in use.
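A small sketch of what that means in practice. The `TwoPointers` enum and the function names are made up for illustration, and the size of `TwoPointers` reflects current compiler behaviour rather than any guarantee.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// Two pointer-carrying variants: the null niche alone cannot tell them
/// apart, so a separate discriminant is needed. Hypothetical example type.
#[allow(dead_code)]
enum TwoPointers<'a> {
    Left(&'a u64),
    Right(&'a u64),
}

/// True if `None` is stored in the null niche, so `Option` adds no size.
fn null_niche_is_used() -> bool {
    size_of::<Option<&u64>>() == size_of::<&u64>()
        && size_of::<Option<NonNull<u64>>>() == size_of::<NonNull<u64>>()
}

/// Size of the two-variant enum. With alignment niches it could in theory
/// fit in one pointer, but currently it does not.
fn two_pointers_size() -> usize {
    size_of::<TwoPointers<'static>>()
}
```

On current compilers `two_pointers_size()` is larger than a single pointer, even though the low bits of an aligned `&u64` would have room for the discriminant.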

Is doing that manually worth it? Usually not, but for some core types (classical example is strings) or in language runtimes it can be.

Would it be awesome if this could be done automatically? Absolutely, but I understand it is a large change, and the plan is to later build upon the pattern types that are currently a work in progress (and would allow you to specify custom ranged integer types).

VorpalWay commented on Emacs internals: Tagged pointers vs. C++ std:variant and LLVM (Part 3) thecloudlet.github.io/blo... · Posted by u/thecloudlet

thecloudlet · 3 days ago

Doing bitwise operations directly on raw pointers is a fast track to Undefined Behavior in standard C/C++. Emacs gets away with it largely due to its age, its heavy reliance on specific GCC behaviors/extensions, and how its build system configures compiler optimizations.

In modern C++, the technically "correct" and safe way to spell this trick is exactly as you suggested: using uintptr_t (or intptr_t).

VorpalWay · 3 days ago

Do (u)intptr_t preserve provenance? Or does this count as exposed provenance when you convert back and forth?

Maybe that is not the correct C++ terminology; I'm more familiar with how provenance works in Rust, where large parts of it got stabilised a little over a year ago. (What was stabilised was "strict provenance", which is a set of rules that, if you abide by them, will definitely be correct, but it is possible the rules might be loosened in the future to be more lenient.)

https://doc.rust-lang.org/std/ptr/index.html#provenance
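For context, this is roughly what the distinction looks like on the Rust side (a sketch only; it says nothing about what C or C++ will end up specifying, and the function names are made up):

```rust
/// Round trip through an integer using the exposed provenance API.
/// The provenance is marked as exposed, and the cast back picks up
/// some previously exposed provenance.
fn roundtrip_exposed(ptr: *const u32) -> *const u32 {
    let addr: usize = ptr.expose_provenance();
    std::ptr::with_exposed_provenance::<u32>(addr)
}

/// Strict provenance style: the address is only inspected, and the new
/// pointer is derived from the original one, so nothing is exposed.
fn roundtrip_strict(ptr: *const u32) -> *const u32 {
    let addr: usize = ptr.addr();
    ptr.with_addr(addr)
}
```

Both return a usable pointer here. The difference is in how much the optimiser is allowed to assume about where the resulting pointer came from.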

VorpalWay commented on Tested: How Many Times Can a DVD±RW Be Rewritten? Methodology and Results goughlui.com/2026/03/07/t... · Posted by u/giuliomagnifico

parineum · 3 days ago

> You could also use an OS that doesn't tend to have dodgy updates that brick your system, such as most Linux distros.

I haven't done it recently but back when I was learning Linux, I definitely bricked my fair share of installations updating and installing things.

It was probably fixable to a more experienced person but it wasn't to me.

Linux is a lot of things but brick-proof for novice users isn't one of them.

VorpalWay · 3 days ago

It has gotten a lot better from what I can tell, though that is just based on what I see others struggle with (or not struggle with as may be the case).

I can't judge this directly (I'm in way too deep, running Arch etc). I first started using Linux seriously in 2004, stopped using Windows except for gaming by 2006, and touched it less and less over the years. I have not used Windows 11 at all.

VorpalWay commented on Tested: How Many Times Can a DVD±RW Be Rewritten? Methodology and Results goughlui.com/2026/03/07/t... · Posted by u/giuliomagnifico

fc417fc802 · 3 days ago

> if you are stuck with such software

kvm-qemu, windows image, block network access to the windows update servers, problem solved?

VorpalWay · 3 days ago

I never managed to get Fusion 360 running reasonably on Linux; in the end I switched CAD software. It really needs some sort of reasonable OpenGL support (or maybe DirectX, I forget which it was). And it doesn't work under Wine: it did at some point, but then it stopped. It is cloud-connected software, so you can't just run an old version.

Maybe if you had a second GPU and forwarded it to the VM? Not willing to spend that extra money, and it would only work on my desktop, not my laptop.

VorpalWay commented on Tested: How Many Times Can a DVD±RW Be Rewritten? Methodology and Results goughlui.com/2026/03/07/t... · Posted by u/giuliomagnifico

avidiax · 3 days ago

One hint for the wary: Don't delay feature updates for the maximum allowed in the group policy editor. I couldn't figure out why I was getting forced reboots for updates despite other policies requiring it to ask permission. Turns out that if the update hits the group policy maximum, it forces an update immediately, other policies be damned.

So set it to the max - 14 days if you want some time to apply updates at your leisure, and you are wary of non-critical updates.

VorpalWay · 3 days ago

You could also use an OS that doesn't tend to have dodgy updates that brick your system, such as most Linux distros, and that doesn't force you to update if you don't want to.

Funny how a large company like Microsoft can't figure out QA, but volunteer Linux distros with far fewer resources can.

(A lot of Windows-specific software works in Wine these days; Valve's investment into improving it for games has helped for applications too. Not everything, and if you are stuck with such software, yeah that sucks.)

VorpalWay commented on Standardizing source maps bloomberg.github.io/js-bl... · Posted by u/Timothee

tliltocatl · 3 days ago

> One of them is even Turing complete.

> figuring out where in memory or registers a given high level variable

Isn't the task itself Turing-hard? Or at least complex enough so that coming up with a non-Turing-complete solution would be impractical?

VorpalWay · 3 days ago

Good question, and one that I don't fully know the answer to. The rest of the byte code (apart from the primitives that enable looping) already allows expressing a lot. From memory (it has been almost half a year since I last worked on this), you can specify things like "for this 32 bit value, the first two bytes can be found in the middle of RAX, the third byte is found following this chain of pointers, and the final byte is on the stack" without even touching the TC parts.
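To give a feel for that kind of "value split into pieces" description, here is a toy Rust model. It is not the real byte code or its encoding: the `Piece` enum and `assemble` function are invented for this sketch, and pointer chasing and stack lookups are both collapsed into a plain index into a byte slice.

```rust
/// Where one piece of a variable lives. Hypothetical model only.
enum Piece {
    /// `len` bytes starting at byte `offset` of a register's value
    /// (little-endian byte order assumed).
    RegisterBytes { reg: u64, offset: usize, len: usize },
    /// One byte at `index` in a flat memory image (stands in for both
    /// "follow this chain of pointers" and "on the stack").
    MemoryByte { index: usize },
}

/// Concatenates the pieces, in order, into the bytes of the variable.
fn assemble(pieces: &[Piece], memory: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    for piece in pieces {
        match piece {
            Piece::RegisterBytes { reg, offset, len } => {
                let bytes = reg.to_le_bytes();
                out.extend_from_slice(&bytes[*offset..*offset + *len]);
            }
            Piece::MemoryByte { index } => out.push(memory[*index]),
        }
    }
    out
}
```

Nothing here needs loops or branches in the description itself, only a fixed list of pieces, which is the point about not touching the TC parts.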

Basically, my impression was that the format was flexible enough that I couldn't see why you would need the TC parts in practice. The compilers seemed to agree and not use them in practice (at least gcc and llvm).

This was of interest to me since I was generating BPF code from these (for user space trace points) and BPF is famously and intentionally not TC. I could translate many patterns that do show up in real world code, but not the general case.

VorpalWay commented on Temporal: The 9-year journey to fix time in JavaScript bloomberg.github.io/js-bl... · Posted by u/robpalmer

VorpalWay · 4 days ago

It is not just in timekeeping that mutable shared state is an issue. I have seen problems arising from it elsewhere as well, in Python especially, but also in C and C++. Python is probably more affected because it is implicitly pass by reference, while C and C++ make pointers/references more explicit, which reduces the risk of such errors in the code.

There are a few schools of thought about what should be done about it. One is to make (almost) everything immutable and hope it gets optimised away/is fast enough anyway. This is the approach taken by functional languages (and functional style programming in general).

Another approach is what Rust does: make state mutable xor shared. So you can either have mutable state that you own exclusively, or you can have read only state that is shared.
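A minimal sketch of what "mutable xor shared" looks like in code (the function names are made up for the example):

```rust
/// Read-only access: any number of shared borrows may exist at once.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// Mutation requires an exclusive borrow.
fn push_double_of_last(values: &mut Vec<i32>) {
    let last = *values.last().unwrap_or(&0);
    values.push(last * 2);
}

fn demo() -> (i32, Vec<i32>) {
    let mut data = vec![1, 2, 3];

    // Two shared borrows at the same time are fine.
    let a = &data;
    let b = &data;
    let total = sum(a) + sum(b);

    // The shared borrows are no longer used, so an exclusive one is allowed.
    push_double_of_last(&mut data);

    // Using `a` or `b` after this point would be rejected at compile time,
    // because they would overlap with the exclusive borrow above.
    (total, data)
}
```

The compiler enforces the rule statically, so there is no runtime cost for the check in this form.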

Both approaches are valid and helpful in my experience. As someone working with low level performance critical code, I personally prefer the Rust approach here.