Show HN: Execute JavaScript in a WebAssembly QuickJS sandbox

jitl · a year ago

Hi, I’m the author of the underlying quickjs-emscripten runtime library. I like your ergonomic kind of “standard library” for quickjs-emscripten :)

Did you try running in the browser or with a bundler? I think accepting the variant name as a string you pass to import(variantName) dynamically may not play well with Webpack et al.

EDIT: SECURITY WARNING: this library exposes the ability for the guest (untrusted) code to `fetch` with the same cookies as the host `fetch` function. You must not run untrusted code if enabling `fetch`. The library should come with a big blinking warning about what is safe and unsafe to enable when running untrusted code. It’s not a “sandbox” if the sandboxed code can call arbitrary HTTP APIs authenticated as the host context!

The reason quickjs-emscripten is low-level and avoids magic is so I can confidently claim that the APIs it does provide are secure. I generally reject feature requests for magical serialization or easy network/filesystem access because that kind of code is a rich area for security mistakes. When you run untrusted code, you should carefully audit the sandbox itself, but also audit all the code you write to expose APIs to the sandbox.
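As an illustration of what auditing an exposed API can look like, here is a minimal sketch of a host-side policy layer that a host might expose to the guest instead of the raw `fetch`. The names `buildGuestRequest` and `makeGuestFetch` and the origin allowlist are hypothetical; none of this is quickjs-emscripten API or the API of the library being shown. It assumes the host's `fetch` honors the standard `credentials` and `redirect` options.

```javascript
// Hypothetical policy layer between guest code and the host's fetch.
// Plain host-side JavaScript; not part of any library mentioned in the thread.

function buildGuestRequest(rawUrl, guestInit, allowedOrigins) {
  const url = new URL(rawUrl); // throws on malformed or relative URLs
  if (url.protocol !== "https:") {
    throw new Error("guest fetch: only https is allowed");
  }
  if (!allowedOrigins.includes(url.origin)) {
    throw new Error("guest fetch: origin not allowed: " + url.origin);
  }
  const method = String((guestInit && guestInit.method) || "GET").toUpperCase();
  if (method !== "GET" && method !== "POST") {
    throw new Error("guest fetch: method not allowed: " + method);
  }
  // Build a fresh init object: never forward guest-supplied credentials,
  // headers, or redirect settings to the host's fetch.
  const init = { method, credentials: "omit", redirect: "error" };
  if (method === "POST" && guestInit && typeof guestInit.body === "string") {
    init.body = guestInit.body;
  }
  return { url: url.href, init };
}

function makeGuestFetch(hostFetch, allowedOrigins) {
  // The returned function is what would be exposed to the sandbox.
  return async function guestFetch(rawUrl, guestInit) {
    const { url, init } = buildGuestRequest(rawUrl, guestInit, allowedOrigins);
    const response = await hostFetch(url, init);
    // Hand back plain data only, not the host's Response object.
    return { status: response.status, body: await response.text() };
  };
}
```

Even with a wrapper like this, anything reachable on an allowlisted origin is reachable by the guest, so the allowlist itself needs the same audit.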

In this case, a comment from another HN user asking about fetch cookies tipped me off to the potential security issue.

More reading:

Figma blog posts on plugin sandbox security:

- https://www.figma.com/blog/how-we-built-the-figma-plugin-sys...

- https://www.figma.com/blog/an-update-on-plugin-security/

Quickjs-emscripten README: https://github.com/justjake/quickjs-emscripten

silenced_trope · a year ago

Would using an iframe (with/without this lib) prevent the fetch issue or is that still a problem there?

svieira · a year ago

A same-domain iframe would not, but a sandboxed one with the appropriate permissions locked down (or one on another domain) would.

AlexErrant · a year ago

There are many ways to sandbox Javascript, both serverside and browser-side.

Are there any ways to "sandbox" DOM access? I.e. give untrusted 3rd parties access to a DOM element in a predefined spot? AFAIK the only tech that allows for this is iframes, which are unfortunately heavy and slow. I'm writing an app that can host plugins, and unfortunately, I think giving plugins DOM access means they can now do literally _anything_.

spankalee · a year ago

Salesforce does this with a combination of web components, with a patched-up ShadowRoot so that code with a reference to the shadow root can't walk into the rest of the document, and a secure evaluator function related to SES (Secure EcmaScript) to limit the globals the untrusted script has access to.

The secure evaluator is wild. I think this is the heart of it: https://github.com/Agoric/realms-shim/blob/v1.1.0/src/evalua...

There's also an idea for isolated web components to solve this in the platform: https://github.com/WICG/webcomponents/issues/1002

m1el · a year ago

Salesforce sandboxing is too easy to escape. The last time I needed to implement a feature for Salesforce, I encountered 4 different escapes. It was also a horrible dev experience.

cxr · a year ago

You can also check out the discussion for Figma's earlier work on their plugin system, which is what inspired jitl (above) to create quickjs-emscripten. Previously:

How to build a plugin system on the web and also sleep well at night. <https://news.ycombinator.com/item?id=20770105> 2019 August 22. 89 comments.

cxr · a year ago

The closest thing I know of is Allen Wirfs-Brock's jsmirrors prototype, but he never got to speccing out anything for DOM (and never really intended to as far as I know). Just capabilities for JS-the-programming-system.

You could look at jsmirrors for inspiration and take a crack at some sort of "dommirrors" yourself, but it's a big undertaking. (There's a roundabout way to go about using jsmirrors as-is to kind of achieve what you want, but it's not ergonomic.)

That being said, giving access to the DOM, even mediated/simulated, is almost certainly not what you really want. Figure out what you _actually_ want to allow the other side to do, and then just give them a capability that lets them do it. (For example, to let them add a button somewhere, you might think you need to give them an anchor point (parent element) where they can insert it and let them use `document.createElement` to make the DOM node that they're going to put there. But you don't actually want that—for them to have access to `document.createElement`, etc. What you want is for them to have an add-button capability. So give them that—go implement `addButton`.)
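A minimal sketch of what such an add-button capability could look like on the host side. The names `createPluginApi`, `renderButton`, and `dispatchClick` are made up for illustration, and the limits (label length, button count) are arbitrary assumptions. The plugin only ever receives the frozen `api` object; it never sees `document`.

```javascript
// Hypothetical host-side capability: the plugin gets `addButton`, nothing else.
function createPluginApi(pluginId, renderButton, maxButtons = 5) {
  const handlers = new Map(); // buttonId -> plugin callback
  let nextId = 0;

  const api = Object.freeze({
    addButton(label, onClick) {
      if (typeof label !== "string" || label.length === 0 || label.length > 40) {
        throw new TypeError("label must be a string of 1 to 40 characters");
      }
      if (typeof onClick !== "function") {
        throw new TypeError("onClick must be a function");
      }
      if (handlers.size >= maxButtons) {
        throw new RangeError("too many buttons for plugin " + pluginId);
      }
      const buttonId = `${pluginId}:button-${nextId++}`;
      handlers.set(buttonId, onClick);
      // The host decides where and how the button is drawn. The plugin
      // supplies only a label, which the host should render as text.
      renderButton({ buttonId, label });
      return buttonId;
    },
  });

  // Called by the host's own click listener, never by the plugin.
  function dispatchClick(buttonId) {
    const handler = handlers.get(buttonId);
    if (handler === undefined) return false;
    handler();
    return true;
  }

  return { api, dispatchClick };
}
```

In a browser host, `renderButton` would presumably create the element itself and set the label via `textContent`, so the plugin's string is never parsed as HTML.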

Moar: <https://news.ycombinator.com/item?id=30703531#30706060>

PS: don't listen to anyone who comes along and says that this is what CSP is for. It's not. (If we're being accurate, even for what CSP really is for, it's poorly designed, user-hostile junk and should never have been implemented or extended as far as it has been.) It's dangerous to depend on it.

jitl · a year ago

Big plus-one to this:

> That being said, giving access to the DOM, even mediated/simulated, is almost certainly not what you really want. Figure out what you _actually_ want to allow the other side to do, and then just give them a capability that lets them do it. (For example, to let them add a button somewhere, you might think you need to give them an anchor point (parent element) where they can insert it and let them use `document.createElement` to make the DOM node that they're going to put there. But you don't actually want that—for them to have access to `document.createElement`, etc. What you want is for them to have an add-button capability. So give them that—go implement `addButton`.)

For a plugin model, I’d suggest providing a high-level UI library to add panels & actions rendered by first-party UI components in specific areas which communicate with plugin JS running in quickjs. Many plugins that integrate with the 3rd-party’s own service will also want an iframe for embedding 3rd-party content, so you can provide that as well since iframe is sandboxed and the use-case makes sense. But scripting/plugin code shouldn’t be reading or writing to the DOM, it should be making requests and responding to requests from the host application APIs synchronously in-process.

That’s the way I think about it anyways.

jitl · a year ago

The only really safe way to approach this would be to give the 3rd party code an off-domain iframe with the sandbox attributes configured. You can still measure the DOM content size from the parent page to resize the iframe to certain limits to integrate it more seamlessly into your app UI.

Depending on the level of exposure and trust between your users, you’ll need to watch out for impersonation/phishing and clickjacking attempts in the iframe. Ideally you can lock down the frame so it can’t make any web requests at all (which implies no image loading), which means there’s no way to exfiltrate data from the frame if, for example, they convinced the user to enter their password into a fake password form.

The main way to restrict what kinds of resources an iframe can request is via content-security-policy, which you can use to turn off all 3rd party images, scripts, etc.

https://developer.mozilla.org/en-US/docs/Web/API/HTMLIFrameE...

https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP

You should also enable these other sandbox attributes and disable access to privacy sensitive DOM APIs like the webcam etc:

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/if...

https://developer.mozilla.org/en-US/docs/Web/Security/IFrame...

https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Pe...
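A rough sketch of how those pieces can be combined when generating the frame markup. `sandboxedFrameHtml` and `escapeAttribute` are hypothetical helpers, and the specific policy strings are assumptions to adapt. This variant uses `srcdoc` with a `sandbox` attribute that omits `allow-same-origin` (which gives the frame an opaque origin) rather than a separate domain, so treat it as an approximation of the off-domain setup described above, not a replacement for it.

```javascript
// Escape a string for use inside a double-quoted HTML attribute.
function escapeAttribute(value) {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

// Build markup for a locked-down frame around untrusted plugin HTML.
function sandboxedFrameHtml(pluginHtml) {
  // No network loads of any kind (so no images either); inline script and
  // style only.
  const csp =
    "default-src 'none'; script-src 'unsafe-inline'; style-src 'unsafe-inline'";
  const doc =
    `<!doctype html><meta http-equiv="Content-Security-Policy" content="${csp}">` +
    pluginHtml;
  return (
    // allow-scripts only: no allow-same-origin, allow-forms, allow-popups,
    // or allow-top-navigation.
    '<iframe sandbox="allow-scripts" referrerpolicy="no-referrer" ' +
    // Permissions Policy: switch off privacy-sensitive features in the frame.
    `allow="camera 'none'; microphone 'none'; geolocation 'none'" ` +
    `srcdoc="${escapeAttribute(doc)}"></iframe>`
  );
}
```

Note that `default-src 'none'` does not by itself stop the frame from navigating itself to another URL, so this sketch does not close every exfiltration path.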

mattigames · a year ago

The only feasible way would be to add an API where they send you the HTML they want to render, as a string, and you parse it using one of the many libraries for that, then recreate the DOM from the parsed data. That way you can whitelist the HTML elements and attributes you want to allow. Allowing plugins to listen to native DOM events complicates things but is not impossible: you would need something like an API that accepts the name of the event (a string) and the id of the element that should receive it. You would then listen for that event in the real DOM and replicate it inside whatever JS sandbox you are using (where the plugin must have access to the aforementioned API).
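A minimal sketch of the whitelisting step, assuming the HTML string has already been parsed by some library into a plain tree of `{ tag, attrs, children }` objects and text strings. That tree shape, the `ALLOWED_TAGS` table, and `sanitizeNode` are all hypothetical; real parsers have their own node formats.

```javascript
// Allowlist: tag name -> attribute names permitted on that tag.
// A Map avoids accidental hits on Object.prototype keys like "constructor".
const ALLOWED_TAGS = new Map([
  ["div", ["title"]],
  ["span", ["title"]],
  ["p", ["title"]],
  ["button", ["title", "disabled"]],
]);

// Returns a cleaned copy of the node, or null if the node must be dropped.
function sanitizeNode(node) {
  // Text nodes pass through; the host should render them as text, not HTML.
  if (typeof node === "string") return node;
  if (node === null || typeof node !== "object") return null;

  const tag = String(node.tag).toLowerCase();
  const allowedAttrs = ALLOWED_TAGS.get(tag);
  // Unknown elements (script, iframe, a, ...) are dropped with their subtree.
  if (allowedAttrs === undefined) return null;

  const attrs = {};
  for (const [name, value] of Object.entries(node.attrs || {})) {
    const lower = name.toLowerCase();
    // Anything not listed (onclick, style, href, ...) is silently removed.
    if (allowedAttrs.includes(lower)) attrs[lower] = String(value);
  }

  const children = (node.children || [])
    .map(sanitizeNode)
    .filter((child) => child !== null);

  return { tag, attrs, children };
}
```

The host would then build real DOM nodes from the cleaned tree itself, so nothing the plugin sent is ever handed to `innerHTML`.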

austin-cheney · a year ago

Off the top of my head, the way I would do this:

* On the back end request the third party code and then associate that code with a hash sequence.

* On the back end dynamically modify the html such that there is a div tag with an id whose value is the hash sequence. Also modify the html such that there is a script tag that requests the third party code from your domain. For tracking purposes you add the hash value to a data attribute on that script tag.

* On the back end modify that third party code such that all instances of `document.` and `window.` are replaced by `document.getElementById(hash_value).` and all query selectors begin with `#hash_value`.

* You would need to replace `.parentNode` in the Element prototype with a custom property that checks for and blocks escapes from the provided container.

Then send the html document to the browser. If the third party code breaks that is ok. The constraints should be communicated to the third party and it’s up to them to test their own code before sending it to your server. All you care about is that their code does not escape the dynamically provided container. Test this regularly on your side to look for security violations.

Also, this may not work, but it would be fun to experiment with.

chmod775 · a year ago

In the context of this conversation, which is about running untrusted code, this has about a million holes.

The only way DOM access can become secure is if either browsers add support for sandboxing in such a way, or you have your own sandbox, like OP's, and provide DOM modification APIs within it that go through rigorous validation before you pass anything on to the browser.

Trying to sandbox with find/replace will never work (unless you replace the entire script with an empty string).
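To make that concrete, here is a small sketch. `naiveRewrite` is a hypothetical reduction of the find/replace step proposed upthread (not anyone's actual implementation), and `bypasses` lists a few guest snippets that pass through it completely unchanged while still reaching the real globals in a browser.

```javascript
// Hypothetical version of the rewrite: scope `document.` and `window.`
// to the plugin's container element.
function naiveRewrite(source, containerId) {
  const scoped = `document.getElementById(${JSON.stringify(containerId)}).`;
  return source.replace(/\b(?:document|window)\./g, scoped);
}

// None of these contain the literal text `document.` or `window.`.
const bypasses = [
  'globalThis["doc" + "ument"].cookie',       // name assembled at runtime
  'document["cookie"]',                       // bracket access, no dot to match
  'new Function("return this")().location',   // global object via Function
  'self.top.location',                        // other aliases of the global
];

// Every snippet survives the rewrite untouched.
const survivors = bypasses.filter((src) => naiveRewrite(src, "abc123") === src);
```

Patching the regex for each of these just moves the problem; there are always more spellings than a text filter can enumerate.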

dawnerd · a year ago

That seems like it’s pretty fragile though. I’d be really worried about all the weird edge cases.

frabjoused · a year ago

Coincidentally, I was trying out quickjs last week and ultimately settled on isolated-vm instead. Both met our security constraints, but isolated-vm ended up being far more performant in terms of setup, teardown and eval execution overhead.

emurlin · a year ago

Interesting approach! As an author of another JS sandbox library[1] that uses workers for isolation plus some JS environment sanitisation techniques, I think that interpreting JS (so, JS-in-JS, or as in this case, JS-in-WASM) gives you the highest level of isolation, and also doesn't directly expose you to bugs in the host JS virtual machine itself. Since you're targeting Node, this is perhaps even more important because (some newer developments notwithstanding) Node.js doesn't really seem to have been designed with isolation and sandboxing in mind (unlike, say, Deno).

From the API, I don't see if `createRuntime` allows you to define calls to the host environment (other than for `fetch`). This would be quite a useful feature, especially because you could use it to restrict communication with the outside world in a controlled way, without it being an all-or-nothing proposition.

Likewise, it doesn't seem to support the browser (at least, running a quick check with esm.sh). I think that that could be a useful feature too.

I'll run some tests as I'm curious what the overhead is in this case, but like I said, this sounds like a pretty solid approach.

[1] @exact-realty/lot

jitl · a year ago

I’m the author of the underlying quickjs-emscripten library. It supports the browser (specifically tested with ESM.sh), as well as Cloudflare Workers, NodeJS, Deno: https://github.com/justjake/quickjs-emscripten?tab=readme-ov...

It has APIs for exposing host functions, calling guest functions, custom module loaders, etc: https://github.com/justjake/quickjs-emscripten?tab=readme-ov...

API docs for newFunction: https://github.com/justjake/quickjs-emscripten/blob/main/doc...

brigadier132 · a year ago

Wow, Cloudflare Workers support is actually super cool. How does it limit memory usage?

FpUser · a year ago

CPU got too fast so let's run interpreter inside interpreter.

jitl · a year ago

I wouldn’t cite “performance” as an advantage of running JS in QuickJS. QuickJS isn’t competitive at all with the host JS VM, although I guess it’s faster than older C interpreters, or an interpreter implemented in JavaScript.

math_dandy · a year ago

I suppose you get performance benefits if the time it takes to start up a nodejs process dominates the execution time of the script. This is probably the case for a decent proportion of “serverless function” type scripts.

jitl · a year ago

This library expects to run inside a Javascript runtime like NodeJS, so you're always going to pay for the enclosing Javascript runtime to start.

throwitaway1123 · a year ago

Yup, AWS actually created a JS runtime called LLRT (Low Latency Runtime) based on QuickJS exactly for this purpose (reducing Lambda function cold start time). The Syntax podcast just released an episode with one of the developers behind LLRT.

leohart · a year ago

This is awesome. With this, I would be able to run JS code that my users provide. I have been looking for a way to bundle my users' TypeScript code using a bundler in a sandboxed environment. Any recommendations on ways to run a bundler (webpack/...) in QuickJS?

idle_zealot · a year ago

I don't know about using QJS, but if you want to run a bundler in the browser that sounds like the sort of thing that WebContainers[1] were built for.

[1]: https://webcontainers.io/

brigadier132 · a year ago

Very cool. Since this is compiled to wasm can this run in the browser? It would be interesting if it could and still make fetch requests without attaching cookies to the request.