I think this is an enormously bad idea. The whole point of using sun.misc.Unsafe is that you need to do something with essentially no overhead, regardless of safety. In my opinion, they should just modularize it so that users must enable access to it explicitly. The MemorySegment API is nice, but sometimes you really want to skip bounds checking in a cross-platform way.
The cost of bounds checks, which is already low, could be reduced further for the vast majority of uses with even more enhancements to the compiler's inlining decisions, so we're talking about an API whose value is already small and constantly declining. However, those who think they absolutely need to avoid bounds checks could, indeed, use a flag to access the JDK's internals. We don't think that such a small benefit (especially when integrated over all users) merits a supported API.
It will be the difference between using Java and using the better-suited low-level features of C#, Go, or Rust, assuming library parity for the task being solved.
If after this change, Java is no longer able to compete in "The One Billion Row Challenge", it is a loss for the ecosystem.
I'm all for trimming the stdlib and rectifying mistakes, but have you evaluated how much of the ecosystem, particularly data processing systems such as Kafka, Cassandra, Elasticsearch, and HBase, to name a few, depends on it? You say "integrated over all users": whom are you referring to?
You might be correct, but it's not obvious to me that Unsafe is really faster. I recall reading that using Unsafe defeats many of the JIT's other optimizations, because what you are doing is opaque to it. Also, if there are bounds-check-elimination (BCE) opportunities, I feel pretty confident OpenJDK will add them rather than being unsympathetic.
In terms of performance: I realize that this is a somewhat "toy" issue, and it's a sample size of 1, but in the currently ongoing "One Billion Row Challenge" [1] (a Java performance competition centered on parsing and aggregating a 13 GB file), all of the current top performers are using Unsafe. More specifically, the switch to Unsafe appears to be what allowed a few entries to get below the 3-second barrier in the test.
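To make that concrete, here is a rough sketch of the pattern those entries rely on (my own illustration, not code from any particular submission): memory-map the input, take the raw base address, and scan it with Unsafe reads so the hot loop carries no bounds or liveness checks. The SWAR newline count is just a stand-in for the real parsing work.

    import java.io.IOException;
    import java.lang.foreign.Arena;
    import java.lang.reflect.Field;
    import java.nio.channels.FileChannel;
    import java.nio.file.Path;
    import java.nio.file.StandardOpenOption;
    import sun.misc.Unsafe;

    public class ScanWithUnsafe {
        public static void main(String[] args) throws IOException, ReflectiveOperationException {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            Unsafe unsafe = (Unsafe) f.get(null);

            try (FileChannel ch = FileChannel.open(Path.of("measurements.txt"), StandardOpenOption.READ)) {
                long size = ch.size();
                // Map the whole file and keep only its raw base address.
                long base = ch.map(FileChannel.MapMode.READ_ONLY, 0, size, Arena.global()).address();

                long newlines = 0;
                for (long p = base; p + Long.BYTES <= base + size; p += Long.BYTES) {
                    long word = unsafe.getLong(p);               // 8 bytes at a time, no bounds check
                    newlines += Long.bitCount(newlineBits(word));
                }
                System.out.println("newlines (tail bytes ignored): " + newlines);
            }
        }

        // SWAR trick: returns a word with bit 7 set in exactly those bytes equal to '\n'.
        static long newlineBits(long word) {
            long x = word ^ 0x0A0A0A0A0A0A0A0AL;   // zero byte wherever the input byte was '\n'
            long m = 0x7F7F7F7F7F7F7F7FL;
            return ~(((x & m) + m) | x | m);
        }
    }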
> sometimes you really want to skip bounds checking in a cross platform way
What is a use case where this is crucial? Where the difference is not just about winning a benchmark pissing contest, but where the delta is so large that it decides whether the JVM is a viable platform at all, and where the JVM would still be the best overall solution?
Put differently: if you need extreme performance and full low-level control, what is a scenario where you still choose the JVM over, e.g., Rust, and where bounds-checking elimination would be the deciding factor?
If you really insist on shooting yourself in the foot, you can do MemorySegment.ofAddress(0).reinterpret(Long.MAX_VALUE) to obtain a MemorySegment for the lower half of the address space of your process. Obviously, you set the address to Long.MAX_VALUE for the upper half.
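A minimal sketch of what that looks like (my own example; reinterpret is a restricted method, so run with --enable-native-access=ALL-UNNAMED or expect a warning):

    import java.lang.foreign.Arena;
    import java.lang.foreign.MemorySegment;
    import java.lang.foreign.ValueLayout;

    public class EverythingSegment {
        public static void main(String[] args) {
            // A segment that claims to span the lower half of the address space;
            // accesses are still "bounds checked", but against Long.MAX_VALUE.
            MemorySegment all = MemorySegment.ofAddress(0).reinterpret(Long.MAX_VALUE);

            try (Arena arena = Arena.ofConfined()) {
                // Allocate 8 bytes the ordinary way, then poke the same memory
                // through the unbounded segment using its raw address.
                MemorySegment slot = arena.allocate(ValueLayout.JAVA_LONG);
                long addr = slot.address();

                all.set(ValueLayout.JAVA_LONG, addr, 42L);
                System.out.println(all.get(ValueLayout.JAVA_LONG, addr)); // 42
            }
        }
    }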
<<
Over the past several years, we have introduced two standard APIs that are safe and performant replacements for the memory-access methods in sun.misc.Unsafe:
java.lang.invoke.VarHandle, introduced in JDK 9, provides methods to safely and efficiently manipulate on-heap memory: fields of objects, static fields of classes, and elements of arrays.
java.lang.foreign.MemorySegment, introduced in JDK 22, provides methods to safely and efficiently access off-heap memory (sometimes in cooperation with VarHandle).
These standard APIs guarantee no undefined behavior, promise long-term stability, and have high-quality integration with the tooling and documentation of the Java Platform (examples of their use are given below). Given the availability of these APIs, it is now appropriate to deprecate and eventually remove the memory-access methods in sun.misc.Unsafe.
>>
As the comment you replied to indicates, both of those APIs perform bounds checking. In certain tight loops, this can add up to quite a bit of overhead [1]. It's not documented, but if you really know what you are doing you can convince the JIT to elide the bounds checks for MemorySegments [2].
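Roughly the kind of loop shape involved (a sketch of my own, not taken from the linked threads; whether the checks actually get hoisted or eliminated depends on the JIT's inlining decisions and the JDK version):

    import java.lang.foreign.Arena;
    import java.lang.foreign.MemorySegment;
    import static java.lang.foreign.ValueLayout.JAVA_LONG;

    public class SumSegment {
        // Express the loop limit through byteSize() so C2 can prove every access
        // in-bounds and, ideally, check once per loop instead of once per access.
        static long sum(MemorySegment seg) {
            long acc = 0;
            long limit = seg.byteSize();
            for (long off = 0; off + Long.BYTES <= limit; off += Long.BYTES) {
                acc += seg.get(JAVA_LONG, off);   // needs an 8-byte-aligned segment
            }
            return acc;
        }

        public static void main(String[] args) {
            try (Arena arena = Arena.ofConfined()) {
                MemorySegment seg = arena.allocate(1024, 8); // 8-byte aligned block
                System.out.println(sum(seg)); // 0
            }
        }
    }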
I once did some experiments programming in Java using only sun.misc.Unsafe for memory access: https://github.com/xonixx/gc_less. I was able to implement basic data structures this way (array, array list, hash table). I even explicitly set a very small heap and used Epsilon GC to make sure I wasn't allocating on the heap.
Just recently I decided to check whether it still works in the latest Java (23), and to my surprise it does. Now, apparently, this is going to change.
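For anyone curious, here is a minimal sketch in the same spirit (my own illustration, not code from the repo): all storage lives off-heap via Unsafe, so it runs fine under something like java -Xmx16m -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC.

    import java.lang.reflect.Field;
    import sun.misc.Unsafe;

    public final class OffHeapArray {
        private static final Unsafe U;
        static {
            try {
                Field f = Unsafe.class.getDeclaredField("theUnsafe");
                f.setAccessible(true);
                U = (Unsafe) f.get(null);
            } catch (ReflectiveOperationException e) {
                throw new ExceptionInInitializerError(e);
            }
        }

        private final long base;

        OffHeapArray(long length) {
            this.base = U.allocateMemory(length * Long.BYTES); // raw off-heap allocation
        }

        long get(long i)         { return U.getLong(base + i * Long.BYTES); } // no bounds check
        void set(long i, long v) { U.putLong(base + i * Long.BYTES, v); }
        void free()              { U.freeMemory(base); }

        public static void main(String[] args) {
            OffHeapArray a = new OffHeapArray(4);
            a.set(2, 42L);
            System.out.println(a.get(2)); // 42
            a.free();
        }
    }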
The main sting here seems to be that a MemorySegment allocated from Java code requires bounds checks on all accesses, which the JVM cannot optimise away for random access patterns.
However, MemorySegment itself already supports unbounded access when wrapping a native pointer. That's necessary for even basic interactions with native code (e.g. reading a C string).
It should be easy to “wash” an off-heap MemorySegment through native code to obtain an unbounded MemorySegment from it. Something like:
void *vanish_bounds(void *ptr) { return ptr; }
I think it would be nice if the FFI provided a way to get an unbounded MemorySegment without resorting to native-code hacks, though. It can obviously be made restricted, like unbounded MemorySegments already are.
That should take most of the sting out of this proposal (although it still sounds like a painful transition for a very modest gain).
First, please clarify whether you are asking about performance impacts vs. no bounds checking or vs. bounds checking in software.
The "vs. no bounds checking" case comes down to the fact that the bounds must be stored somewhere, so you get an additional memory access, and that's slow (in the best case, it puts some load on the caches).
The "vs. software" is more like RISC vs. CISC in general.
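For reference, the "in software" case being compared against is just this shape (illustrative Java only; the JVM emits an equivalent check implicitly for arrays and MemorySegments):

    public class BoundsCheckShape {
        // The limit has to live somewhere (a field or register, possibly an extra load),
        // and each access pays a compare-and-branch unless the JIT can hoist or
        // eliminate it. A hardware scheme would fold the check into the load itself.
        static long checkedGet(long[] data, int index) {
            if (index < 0 || index >= data.length) {
                throw new ArrayIndexOutOfBoundsException(index);
            }
            return data[index];
        }

        public static void main(String[] args) {
            System.out.println(checkedGet(new long[]{1, 2, 3}, 2)); // 3
        }
    }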
> First, please clarify whether you are asking about performance impacts vs. no bounds checking or vs. bounds checking in software
I was wondering whether it would be feasible to have hardware checks with practically no speed difference compared to performing no check at all.
Yes, I imagined it could be seen as RISC vs. CISC, but it's such a fundamental and frequent operation that it seems likely it would help overall!
Of course you'd need to use registers, which does mean a higher cost (unless you already have some that are guaranteed to be unused at bounds-checking time), but there seems to be a good chance it would be worthwhile?
I imagine that the impact of simply reading the additional instructions would be negligible, by the way.
Most of sun.misc.Unsafe forwards to jdk.internal.misc.Unsafe, which will still exist and be accessible via --add-exports on the command line. This is a good middle ground: strong encapsulation, while making sure users know they're using internal APIs that could change.
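A sketch of what that would look like, assuming jdk.internal.misc stays reachable this way (the flag has to be passed to both javac and java):

    // javac --add-exports java.base/jdk.internal.misc=ALL-UNNAMED RawAccess.java
    // java  --add-exports java.base/jdk.internal.misc=ALL-UNNAMED RawAccess
    import jdk.internal.misc.Unsafe;

    public class RawAccess {
        public static void main(String[] args) {
            Unsafe u = Unsafe.getUnsafe();       // the internal class sun.misc.Unsafe delegates to
            long addr = u.allocateMemory(8);     // raw, unchecked off-heap allocation
            u.putLong(addr, 42L);
            System.out.println(u.getLong(addr)); // 42
            u.freeMemory(addr);
        }
    }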
> If after this change, Java is no longer able to compete in "The One Billion Row Challenge", it is a loss for the ecosystem.
This is subjective. In my opinion, even ~10% is massive. Performance matters, and this would be a step backwards.
[1] https://github.com/gunnarmorling/1brc
Quick publication showing the difference (up to 125%!): https://www2.cs.arizona.edu/~dkl/Publications/Papers/ics.pdf
(and no, the story hasn't changed an awful lot since 2004. The gap still exists.)
[1] https://mail.openjdk.org/pipermail/panama-dev/2023-July/0193...
[2] https://mail.openjdk.org/pipermail/panama-dev/2023-July/0194...
Is it inherently difficult to do bounds checking in hardware without a performance impact? Or has nobody cared much, given the prevalence of trust-the-developer C thinking?
I found some information at https://stackoverflow.com/questions/40752436/do-any-cpus-hav...
Maybe, if the devs can replicate the performance with VarHandle and MemorySegment, it could go away eventually.