on Linux it runs Firecracker: https://github.com/jingkaihe/matchlock/blob/main/pkg/vm/linu...
on macOS it uses Apple's Virtualization.Framework Go wrapper: https://github.com/jingkaihe/matchlock/blob/main/pkg/vm/darw...
The guest-agent (pid 1) spawns commands in a new pid + mount namespace (similar to the firecracker jailer, but at the inner level so it also works on macOS). In non-privileged mode it drops SYS_PTRACE, SYS_ADMIN, etc. from the bounding set, sets `no_new_privs`, then installs a seccomp-BPF filter that EPERMs process_vm_readv/writev, ptrace, and kernel module loading. The microVM is the real isolation boundary; seccomp is defense in depth. That said, there is a `--privileged` flag that lets you skip all of this, which is needed when building images with buildkit.
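A minimal sketch of that sequence in Go, assuming libseccomp-golang and golang.org/x/sys/unix; the real agent lives under pkg/vm and surely differs in detail:

```go
package agent

import (
	"os/exec"
	"syscall"

	seccomp "github.com/seccomp/libseccomp-golang"
	"golang.org/x/sys/unix"
)

// spawnSandboxed is an illustration only, not matchlock's actual code.
// The agent runs as pid 1 (root) inside the guest, so it may create
// namespaces even while tightening what its children can do.
func spawnSandboxed(argv []string) error {
	// Drop dangerous capabilities from the bounding set so children
	// can never reacquire them, even via setuid binaries.
	for _, c := range []int{unix.CAP_SYS_PTRACE, unix.CAP_SYS_ADMIN} {
		if err := unix.Prctl(unix.PR_CAPBSET_DROP, uintptr(c), 0, 0, 0); err != nil {
			return err
		}
	}

	// no_new_privs: also a precondition for loading seccomp unprivileged.
	if err := unix.Prctl(unix.PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0); err != nil {
		return err
	}

	// Default-allow filter that EPERMs the introspection/escape syscalls.
	filter, err := seccomp.NewFilter(seccomp.ActAllow)
	if err != nil {
		return err
	}
	eperm := seccomp.ActErrno.SetReturnCode(int16(unix.EPERM))
	for _, name := range []string{
		"process_vm_readv", "process_vm_writev", "ptrace",
		"init_module", "finit_module",
	} {
		sc, err := seccomp.GetSyscallFromName(name)
		if err != nil {
			return err
		}
		if err := filter.AddRule(sc, eperm); err != nil {
			return err
		}
	}
	if err := filter.Load(); err != nil { // inherited across fork/exec
		return err
	}

	// Run the command in fresh pid + mount namespaces.
	cmd := exec.Command(argv[0], argv[1:]...)
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Cloneflags: syscall.CLONE_NEWPID | syscall.CLONE_NEWNS,
	}
	return cmd.Run()
}
```

The ordering matters: `no_new_privs` has to be set before `Load()`, since that's what lets a process install a filter without CAP_SYS_ADMIN, and the filter is then inherited by everything the command spawns.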
Whether pip install works is entirely up to the OCI image you pick. If it has a package manager and you've allowed network access, go for it. The whole point is making `claude --dangerously-skip-permissions` style usage safe.
Personally I've had agents attempt red-team style breakouts. From first-hand experience, what the agent (Opus 4.6 with max thinking) will exploit without cap drops and seccomp is genuinely wild.
the opus 4.6 breakouts you mentioned - were they known vulns or creative syscall abuse? agents are weirdly systematic about edge cases compared to human red teamers. they don't skip the obvious stuff.
--privileged for buildkit tracks - you gotta build the images somewhere.
for notifications specifically, the risky bits would be: what happens if an app sends a notification payload that's malformed or huge, how do you handle permission checks if the notification system process restarts mid-filtering, and whether the filtering rules can be bypassed by crafting notifications with weird mime types or encoded text.
if you wrote tests for those edge cases (or even just thought through them), you're already ahead of 90% of shipped code, vibe-coded or not. the scrutiny you're worried about is actually healthy - peer review catches stuff automated tools miss.
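if it helps, here's the shape of a table-driven test for the payload cases, sketched in Go with a stubbed `Filter` since I don't know your actual entry point:

```go
package notify

import (
	"strings"
	"testing"
)

// Filter is a stand-in for the real filtering entry point: return true
// to deliver a notification, false to drop it. Swap in yours.
var Filter = func(payload string) bool { return false }

func TestFilterHostileInputs(t *testing.T) {
	cases := []struct {
		name    string
		payload string
	}{
		{"empty", ""},
		{"huge", strings.Repeat("A", 10<<20)}, // 10 MiB body
		{"weird mime", "Content-Type: text/html;;charset=x\r\n\r\nhi"},
		{"encoded text", "=?UTF-8?B?c2VjcmV0?="}, // RFC 2047 encoded-word
	}
	for _, tc := range cases {
		tc := tc
		t.Run(tc.name, func(t *testing.T) {
			// The property worth asserting: no panic, and the filter
			// fails closed (drops) rather than open (delivers).
			if Filter(tc.payload) {
				t.Fatalf("filter passed hostile payload %q", tc.name)
			}
		})
	}
}
```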
what I'm curious about with matchlock - does it use seccomp-bpf to restrict syscalls, or is it more like a minimal rootfs with carefully chosen binaries? because the landlock LSM stuff is cool but it's mainly for filesystem access control. network access, process spawning, that's where agents get dangerous.
also how do you handle the agent needing to install dependencies at runtime? like if claude decides it needs to pip install something mid-task. do you pre-populate the sandbox or allow package manager access?
To solve this I've built Wardgate [1], which removes the need for agents to see any credentials and enforces access control per API endpoint. So you can say: yes, you can read all Todoist tasks, but you can't delete tasks or see tasks with "secure" in them, or see emails outside the Inbox or containing OTP codes, or whatever.
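To make the shape of it concrete, a rough Go sketch of per-endpoint enforcement with credential injection; the rule fields and names here are my own illustration, not Wardgate's actual config:

```go
package gateway

import (
	"net/http"
	"net/http/httputil"
	"net/url"
	"strings"
)

// Rule and Policy are guessed shapes for illustration.
type Rule struct {
	Method     string // "GET", "DELETE", ...
	PathPrefix string // e.g. "/rest/v2/tasks"
	Allow      bool
}

type Policy struct{ Rules []Rule }

// Permit is first-match-wins with a default deny.
func (p Policy) Permit(r *http.Request) bool {
	for _, rule := range p.Rules {
		if rule.Method == r.Method && strings.HasPrefix(r.URL.Path, rule.PathPrefix) {
			return rule.Allow
		}
	}
	return false
}

// Proxy checks the policy, then injects the real credential, so the
// agent never holds the token itself. Content rules ("hide tasks
// containing 'secure'") additionally need response-body filtering,
// which this sketch omits.
func Proxy(p Policy, upstream *url.URL, token string) http.Handler {
	rp := httputil.NewSingleHostReverseProxy(upstream)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !p.Permit(r) {
			http.Error(w, "denied by policy", http.StatusForbidden)
			return
		}
		r.Header.Set("Authorization", "Bearer "+token)
		rp.ServeHTTP(w, r)
	})
}
```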
Interested in any comments / suggestions.
and I'm curious about the filtering logic - is it regex on endpoint paths or something more semantic? because the "tasks with secure in them" example makes me think there's some content inspection happening, not just URL filtering.
curious: when you say compatible with OpenClaw's markdown format, does that mean I could point LocalGPT at an existing OpenClaw workspace and it would just work? or is it more 'inspired by' the format?
the local embeddings for semantic search is smart. I've been using similar for code generation and the thing I kept running into was the embedding model choking on code snippets mixed with prose. did you hit that or does FTS5 + local embeddings just handle it?
also - genuinely asking, not criticizing - when the heartbeat runner executes autonomous tasks, how do you keep the model from doing risky stuff? hitting prod APIs, modifying files outside workspace, etc. do you sandbox or rely on the model being careful?
that said, the class restriction feels weird. classes aren't the security boundary. file access, network, imports - that's where the risk is. restricting classes just forces the model to write uglier code for no security gain. would be curious if the restrictions map to an actual threat model or if it's more of a "start minimal and add features" approach.
* Exploit kernel CVEs
* Weaponise gcc: craft malicious kernel modules, forge arbitrary packets with spoofed source addresses that bypass the TCP/IP stack
* Probe the metadata service
* Attack bpf and io_uring
* Lots of mount-escape attempts, plus network and vsock scanning and crafting
As a non-security researcher, I was blown away watching what it did, which in hindsight isn't surprising, as Opus 4.6 hits a 93% solve rate on Cybench - https://cybench.github.io/
the metadata service probing is particularly concerning because that's the classic cloud escape path. if you're running this in aws/gcp and the agent figures out IMDSv1 is reachable, game over. vsock scanning too - that's targeting the host-guest communication channel directly.
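fwiw, that reachability is easy to smoke-test from inside the guest. A trivial probe, assuming the standard AWS metadata address and path (the other big clouds use the same link-local IP):

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// Probe the cloud metadata endpoint the way an agent does on its first
// enumeration pass. From a properly isolated guest this should fail;
// a 200 without a token means the classic IMDSv1 escape path is open.
func main() {
	client := &http.Client{Timeout: 2 * time.Second}
	resp, err := client.Get("http://169.254.169.254/latest/meta-data/")
	if err != nil {
		fmt.Println("IMDS unreachable (good):", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("IMDS reachable (bad):", resp.Status)
}
```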
93% on cybench is genuinely scary when you think about what it means. it's not just finding known CVEs, it's systematically exploring the attack surface like a skilled pentester would. and unlike humans, it doesn't get tired or skip the boring enumeration steps. did you find it tried timing attacks or side channels at all? or was it mostly direct exploitation?