Chicory: A JVM native WebAssembly runtime

jmillikin · 6 months ago

Chicory seems like it'll be pretty useful. Java doesn't have easy access to the platform-specific security mechanisms (seccomp, etc) that are used by native tools to sandbox their plugins, so it's nice to have WebAssembly's well-designed security model in a pure-JVM library.

I've used it to experiment with using WebAssembly to extend the Bazel build system (which is written in Java). Currently there are several Bazel rulesets that need platform-specific helper binaries for things like parsing lock files or Cargo configs, and that's exactly the kind of logic that could happily move into a WebAssembly blob.

https://github.com/jmillikin/upstream__bazel/commits/repo-ru...

https://github.com/bazelbuild/bazel/discussions/23487

blacklion · 6 months ago

I don't understand the logic and layers of abstraction here.

Chicory runs on the JVM. Bazel runs on the JVM. How will inserting a WebAssembly layer help eliminate platform-specific helper binaries? These binaries, compiled to WebAssembly, will effectively run on the JVM (through one additional layer of APIs provided by Chicory), right? Why can't you write these helpers directly in a JVM language: Java, Kotlin, Clojure, anything? Why do you need the additional layer of Chicory?

andreaTP · 6 months ago

You don't just easily rewrite everything. Being able to simply reuse existing code is the trick!

gf000 · 6 months ago

I really don't want to sound flamewar-y, but how is WebAssembly's security model well-designed compared to a pure Java implementation of a brainfuck interpreter? Similarly, Java bytecode is 100% safe if you just don't plug in filesystem/OS capabilities.

It's trivial to be secure when you are completely sealed off from everything. The "art of the deal" is making it safe while having many capabilities. If you add WASI to the picture it doesn't look all that safe, but I might just not be too knowledgeable about it.

kannanvijayan · 6 months ago

It's really difficult to compare the JVM and wasm because they are such different beasts with such different use cases.

What wasm brings to the table is that the core tech focuses on one problem: abstract sandboxed computation. The main advantage it brings is that it _doesn't_ carry all the baggage of a full fledged runtime environment with lots of implicit plumbing that touches the system.

This makes it flexible and applicable to situations where java never could be - incorporating pluggable bits of logic into high-frequency glue code.

Wasm + some DB API is a pure stored procedure compute abstraction that's client-specifiable and safe.

Wasm + a simple file API that assumes a single underlying file + a stream API that assumes a single outgoing stream, that's a beautiful piece of plumbing for an S3 like service that lets you dynamically process files on the server before downloading the post-processed data.

There are a ton of use cases where "X + pluggable sandboxed compute" is power-multiplier for the underlying X.

I don't think the future of wasm is going to be in the use case where we plumb a very classical system API onto it (although that use case will exist). The real applicability and reach of wasm is the fact that entire software architectures can be built around the notion of mobile code where the signature (i.e. external API that it requires to run) of the mobile code can be allowed to vary on a use-case basis.
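To make the "signature of the mobile code" idea concrete, here is a minimal Java sketch. It is not Chicory's API: every name in it (`ImportSketch`, `Guest`, `instantiateAndRun`, the `db.lookup` import) is hypothetical. The guest declares the imports it needs and the host decides which ones to supply; instantiation fails if any are missing. Because the guest here is plain Java, nothing actually confines it. In a real Wasm runtime the guest bytecode has no other route to the host, which is what makes the import list a meaningful boundary.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.LongUnaryOperator;

// Illustrative only: hypothetical names, not taken from Chicory or any Wasm runtime.
final class ImportSketch {
    /** A stand-in for a guest module: the imports it declares, plus its logic. */
    interface Guest {
        List<String> requiredImports();
        long run(Map<String, LongUnaryOperator> imports, long input);
    }

    /** Resolves every declared import against what the host grants, then runs the guest. */
    static long instantiateAndRun(Guest guest,
                                  Map<String, LongUnaryOperator> hostFunctions,
                                  long input) {
        Map<String, LongUnaryOperator> resolved = new HashMap<>();
        for (String name : guest.requiredImports()) {
            LongUnaryOperator fn = hostFunctions.get(name);
            if (fn == null) {
                // The host did not grant this capability, so the guest never starts.
                throw new IllegalStateException("unresolved import: " + name);
            }
            resolved.put(name, fn);
        }
        return guest.run(resolved, input);
    }

    /** Example guest in the "Wasm + some DB API" style: it needs exactly one host function. */
    static final Guest STORED_PROCEDURE = new Guest() {
        @Override
        public List<String> requiredImports() {
            return List.of("db.lookup");
        }

        @Override
        public long run(Map<String, LongUnaryOperator> imports, long input) {
            return imports.get("db.lookup").applyAsLong(input) + 1;
        }
    };
}
```

With a host map that binds `db.lookup` to `k -> k * 10`, `instantiateAndRun(STORED_PROCEDURE, host, 4)` returns 41; with an empty host map it throws before any guest code runs.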

hinkley · 6 months ago

The bespoke capability model in Java has always been so fiddly it has made me question the concept of capability models. For a long time there was a constant stream of new privilege escalations mostly caused by new functions being added that didn’t necessarily break the model themselves, but they returned objects that contained references to objects that contained references to data that the code shouldn’t have been able to see. Nobody to my recollection ever made an obvious back door but nonobvious ones were fairly common.

I don’t know where things are today because I don’t use Java anymore, but if you want to give some code access to a single file then you’re in good hands. If you want to keep them from exfiltrating data you might find yourself in an Eternal Vigilance situation, in which case you’ll have to keep on top of security fixes.

We did a whole RBAC system as a thin layer on top of JAAS. Once I figured out a better way to organize the config it wasn’t half bad. I still got too many questions about it, which is usually a sign of ergonomic problems that people aren’t knowledgeable enough to call you out on. But it was a shorter conversation with fewer frowns than the PoC my coworker left for me to productize.

bhelx · 6 months ago

WASI does open up some holes you should be mindful of. But it's still much safer than other implementations. We don't allow you direct access to the FS; we use jimfs: https://github.com/google/jimfs

I typically recommend people don't allow wasm plugins to talk to the filesystem though, unless they really need to read some things from disk, as a Python interpreter does. You don't usually need to.
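For readers wondering what restricted filesystem access can look like, here is a small sketch of the general idea behind WASI-style preopened directories: guest paths are resolved under a host-chosen root and anything that escapes it is rejected. This is an assumption-laden illustration using only `java.nio.file`, with hypothetical names (`PreopenSketch`, `resolveInside`); it is not how Chicory or jimfs implement it, and it does not handle symlinks.

```java
import java.nio.file.Path;

// Illustrative only: hypothetical helper, not part of Chicory, WASI, or jimfs.
final class PreopenSketch {
    /**
     * Resolves a guest-supplied path under the sandbox root.
     * Rejects paths that normalize to a location outside the root
     * (for example "../etc/passwd" or an absolute path elsewhere).
     * Symlinks are NOT resolved here, so this alone is not a complete defense.
     */
    static Path resolveInside(Path root, String guestPath) {
        Path normalizedRoot = root.toAbsolutePath().normalize();
        Path candidate = normalizedRoot.resolve(guestPath).normalize();
        if (!candidate.startsWith(normalizedRoot)) {
            throw new SecurityException("path escapes sandbox root: " + guestPath);
        }
        return candidate;
    }
}
```

An in-memory filesystem goes one step further than this check, since the guest never touches the real disk at all.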

worthless-trash · 6 months ago

I wouldn't say 100% safe. I was able to abuse the JVM to use Spectre gadgets to find secret memory contents (aka private keys) on the JVM. It was tough, but let's not exaggerate JVM safety.

pjmlp · 6 months ago

Pssst, it is the usual WebAssembly sales pitch.

Linear memory accesses aren't bounds-checked inside the linear memory segment, thus data can still be corrupted, even if it doesn't leave the sandbox.
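A toy model of that point, as a hedged Java sketch: the names (`LinearMemorySketch`, `store`, `load`) and the guest layout (an 8-byte buffer at address 0 followed by a one-byte flag at address 8) are made up for illustration. The store is checked only against the size of the whole memory, so a guest-side overflow can clobber the neighbouring flag, while an access past the end of memory traps.

```java
// Illustrative only: a toy stand-in for Wasm linear memory, not a real runtime.
final class LinearMemorySketch {
    private final byte[] memory;

    LinearMemorySketch(int sizeBytes) {
        this.memory = new byte[sizeBytes];
    }

    /** Writes bytes at an address; the only check is against the whole memory. */
    void store(int address, byte[] data) {
        if (address < 0 || data.length > memory.length - address) {
            throw new IndexOutOfBoundsException("trap: out of bounds memory access");
        }
        System.arraycopy(data, 0, memory, address, data.length);
    }

    /** Reads one byte, again checked only against the whole memory. */
    byte load(int address) {
        if (address < 0 || address >= memory.length) {
            throw new IndexOutOfBoundsException("trap: out of bounds memory access");
        }
        return memory[address];
    }

    public static void main(String[] args) {
        LinearMemorySketch mem = new LinearMemorySketch(64);
        // Hypothetical guest layout: name buffer at [0, 8), isAdmin flag at 8.
        // Writing 9 bytes into the 8-byte buffer silently overwrites the flag.
        mem.store(0, new byte[] {65, 65, 65, 65, 65, 65, 65, 65, 1});
        System.out.println("isAdmin flag = " + mem.load(8));
        try {
            mem.store(60, new byte[9]); // would run past the 64-byte memory
        } catch (IndexOutOfBoundsException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The corruption stays inside the guest's own memory, which is the distinction the comment is drawing: the host is protected, the guest's data is not.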

Also, just like many other bytecode-based designs, it is only as safe as its implementations, which can be attacked just the same.

https://webassembly.org/docs/security/

https://www.usenix.org/conference/usenixsecurity20/presentat...

https://www.usenix.org/conference/usenixsecurity21/presentat...

https://www.usenix.org/conference/usenixsecurity22/presentat...

andreaTP · 6 months ago

Looking forward to seeing more Chicory in Bazel, it's a great use case! Thanks for spearheading it!

throwaway894345 · 6 months ago

> Java doesn't have easy access to the platform-specific security mechanisms (seccomp, etc) that are used by native tools to sandbox their plugins, so it's nice to have WebAssembly's well-designed security model in a pure-JVM library.

I thought Java had all of this sandboxing stuff baked in? Wasn't that a big selling point for the JVM once upon a time? Every other WASM thread has someone talking about how WASM is unnecessary because JVM exists, so the idea that JVM actually needs WASM to do sandboxing seems pretty surprising!

jmillikin · 6 months ago

The JVM was designed with the intention of being a secure sandbox, and a lot of its early adoption was as Java applets that ran untrusted code in a browser context. It was a serious attempt by smart people to achieve a goal very similar to that of WebAssembly.

Unfortunately Java was designed in the 1990s, when there was much less knowledge about software security -- especially sandboxing of untrusted code. So even though the goal was the same, Java's design had some flaws that made it difficult to write a secure JVM.

The biggest flaw (IMO) was that the sandbox layer was internal to the VM: in modern thought the VM is the security boundary, but the JVM allows trusted and untrusted code to execute in the same VM, with java.lang.SecurityManager[0] and friends as the security mechanism. So the attack surface isn't the bytecode interpreter or JIT, it's the entire Java standard library plus every third-party module that's linked in or loaded.

During the 2000s and 2010s there were a lot of Java sandbox escape CVEs. A representative example is <https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-0422>. Basically the Java security model was broken, but fixing it would break backwards compatibility in a major way.

--

Around the same time (early-mid 2010s) there was more thought being put into sandboxing native code, and the general consensus was:

- Sandboxing code within the same process space requires an extremely restricted API. The original seccomp only allowed read(), write(), exit(), and sigreturn() -- it could be used for distributed computation, but compiling existing libraries into a seccomp-compatible dylib was basically impossible.

- The newly-developed virtualization instructions in modern hardware made it practical to run a virtual x86 machine for each untrusted process. The security properties of VMs are great, but the x86 instruction set has some properties that make it difficult to verify and JIT-compile, so actually sitting down and writing a secure VM was still a major work of engineering (see: QEMU, VMWare, VirtualBox, and Firecracker).

Smartphones were the first widespread adoption of non-x86 architectures among consumers since PowerPC, and every smartphone had a modern web browser built in. There was increasing desire to have something better than JavaScript for writing complex web applications executing in a power-constrained device. Java would have been the obvious choice (this was pre-Oracle), except for the sandbox escape problem.

WebAssembly combines architecture-independent bytecode (like JVM) with the security model of VMs (flat memory space, all code in VM untrusted). So you can take a whole blob of legacy C code, compile it to WebAssembly, and run it in a VM that runs with reasonable performance on any architecture (x86, ARM, RISC-V, MIPS, ...).

andreaTP · 6 months ago

A few cool things based on Chicory:

OPA: https://github.com/StyraInc/opa-java-wasm

Integration with Debezium has been launched today too: https://debezium.io/blog/2025/02/24/go-smt/

And SQLite will come next: https://github.com/roastedroot/sqlite4j

bhelx · 6 months ago

Some more interesting use cases in production:

Running python UDFs in Trino: https://trino.io/docs/current/udf/python.html

Running the Ruby parser in Jruby: https://blog.enebo.com/2024/02/23/jruby-prism-parser.html

ncruces · 6 months ago

Looking forward to this reviving NestedVM's pure Java SQLite. It's only been (checks notes…) 20 years.

http://nestedvm.ibex.org/

https://benad.me/blog/2008/1/22/nestedvm-compile-almost-anyt...

To be clear: I'm fully supportive of this effort. NestedVM's SQLite is 100% my inspiration for my Wasm based Go SQLite driver.

evacchi · 6 months ago

also the chicory Extism SDK https://github.com/extism/chicory-sdk and the mcpx4j library used for mcp.run Java integration, see e.g. https://docs.mcp.run/tutorials/mcpx-spring-ai-java

...and Chicory works on Android too https://docs.mcp.run/tutorials/mcpx-gemini-android

vips7L · 6 months ago

How does it compare to graal wasm? https://github.com/oracle/graal/blob/master/wasm/README.md/

evacchi · 6 months ago

take a look at this blog post, these are early results but we collaborated with the Graal team for a fair comparison https://chicory.dev/blog/chicory-1.0.0#the-race-day

bhelx · 6 months ago

Also note, we have the AOT compiler which can target the JVM bytecode directly as well as Dalvik/Android which is experimental but nearly spec complete :)

titzer · 6 months ago

Wizard's slow interpreter also runs on the JVM, albeit very, very slowly. Have you done any benchmarking against Wizard?

remexre · 6 months ago

It'd be interesting to see a benchmark for what the total overhead is for Rust->WASM->Chicory AoT->native-image versus native Rust; I've been pleasantly surprised by the JVM in the past, so I'd hope it'd be a relatively small hit.

bhelx · 6 months ago

Even in interpreter mode, rust wasm programs seem very fast for me on Chicory. I'm not sure if we have any specific benchmarks but the graal team did some and i think it's based on a rust guest program https://chicory.dev/blog/chicory-1.0.0/#the-race-day

andreaTP · 6 months ago

ahaha, that's intriguing! I think there are still some gaps but we are comparing results (with GraalWasm) on Photon here: https://github.com/shaunsmith/wasm-bench Should be easy to build a native image and compare!

dang · 6 months ago

Related. Others?

Chicory 1.0.0-M1: First Milestone Release - https://news.ycombinator.com/item?id=42086590 - Nov 2024 (3 comments)

A Zero-Dependency WebAssembly Runtime for the JVM - https://news.ycombinator.com/item?id=38759030 - Dec 2023 (1 comment)

skyyler · 6 months ago

I'd like to take a moment to appreciate how cute the name is.

Love stuff like that.

bhelx · 6 months ago

Glad you appreciate it! On top of being a Java joke it’s an homage to my home, New Orleans. We still drink coffee with chicory here due to some events during the Civil War and then a changed cultural taste, though I think the history of this isn’t exactly clear: https://neworleanshistorical.org/items/show/1393

gregschlom · 6 months ago

Came here to say the same thing, excellent name.

For people who aren't aware, Chicory has long been used (e.g. in Europe during WW2) as a coffee substitute, and Java is another name for coffee, thus Chicory is a substitute for Java.

Edit: I originally thought Chicory was a JVM replacement using WebAssembly (e.g. to run Java applets in modern browsers, using WebAssembly). It appears that it's actually a WebAssembly runtime, to run WebAssembly code on the JVM. So the name is a lot less cool than I thought it was.

nilslice · 6 months ago

it really is a perfect name. credit to u/bhelx!

DrNosferatu · 6 months ago

For some reason, I think that instead a Java runtime written in WebAssembly would be more useful.

titzer · 6 months ago

https://cheerpj.com/

https://thenewstack.io/cheerpj-3-0-run-apps-in-the-browser-w...

DrNosferatu · 6 months ago

Thanks for explaining ;)

bhelx · 6 months ago

There are a few, and they are really interesting! The reason we wrote Chicory though is we're interested in extending the capabilities of existing Java applications through plugins. The intro of this talk explains some of this reasoning: https://www.youtube.com/watch?v=00LYdZS0YlI

giancarlostoro · 6 months ago

Not sure why you're being downvoted. One of the best tools Microsoft made regarding WebAssembly and C# is Blazor. Developers can focus on building web applications and use C# on both the front-end and back-end and drive the UI either server side or WASM without missing a beat. Essentially bypassing the need for JavaScript.

I can only imagine such a capability for Java or other languages would be infinitely useful.

slt2021 · 6 months ago

Google Web Toolkit was released 18 years ago and essentially allowed you to create early Web 2.0 apps (like Gmail) in Java. AJAX and a lot of Web 2.0 innovations essentially originated from GWT

https://en.wikipedia.org/wiki/Google_Web_Toolkit

benatkin · 6 months ago

TeaVM https://www.teavm.org/

breadwinner · 6 months ago

Then you'll be running Java in the browser! Wait, isn't that applets?

dingi · 6 months ago

Java runtime without multithreading? No thanks.

dpratt · 6 months ago

This looks very cool - I'm going to read into the implementation, there's something about producing JVM bytecode from WASM instructions and then having the JVM JIT compile it into native instructions that amuses me.
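Part of why that translation is natural is that both Wasm and JVM bytecode are stack machines. As a rough sketch only (hypothetical names, a four-instruction subset, and no relation to Chicory's actual interpreter or compiler), here is a tiny Java interpreter for Wasm-style `local.get`, `i32.const`, `i32.add`, and `i32.mul`. A compiler would emit the matching JVM instructions (such as `iadd` and `imul`) instead of executing each case in a loop.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: a toy subset of Wasm-like instructions, not Chicory's code.
final class StackMachineSketch {
    enum Op { I32_CONST, LOCAL_GET, I32_ADD, I32_MUL }

    static final class Instr {
        final Op op;
        final int operand; // constant value or local index; unused for ADD/MUL

        Instr(Op op, int operand) {
            this.op = op;
            this.operand = operand;
        }
    }

    /** Body of a hypothetical function computing (local0 + 3) * local1. */
    static final Instr[] ADD3_THEN_MUL = {
        new Instr(Op.LOCAL_GET, 0),
        new Instr(Op.I32_CONST, 3),
        new Instr(Op.I32_ADD, 0),
        new Instr(Op.LOCAL_GET, 1),
        new Instr(Op.I32_MUL, 0),
    };

    /** Executes the instructions on an operand stack and returns the top value. */
    static int run(Instr[] body, int... locals) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (Instr in : body) {
            switch (in.op) {
                case I32_CONST:
                    stack.push(in.operand);
                    break;
                case LOCAL_GET:
                    stack.push(locals[in.operand]);
                    break;
                case I32_ADD: {
                    int b = stack.pop();
                    int a = stack.pop();
                    stack.push(a + b); // Java int addition wraps like i32.add
                    break;
                }
                case I32_MUL: {
                    int b = stack.pop();
                    int a = stack.pop();
                    stack.push(a * b); // Java int multiplication wraps like i32.mul
                    break;
                }
            }
        }
        return stack.pop();
    }
}
```

`StackMachineSketch.run(StackMachineSketch.ADD3_THEN_MUL, 4, 5)` evaluates (4 + 3) * 5 and returns 35.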

bhelx · 6 months ago

It's very amusing to me as well. The first thing I did was run an SNES emulator, and it definitely made me chuckle https://x.com/bhelx/status/1809235314839281900

anentropic · 6 months ago

thinking about that makes me want to see a performance comparison of WASM code running in Chicory vs running on other non-Java WASM hosts