A rogue agent doesn't need to run `rm -rf /`. It just needs to slip a `runInShell('rm -rf /')` into any of your source files and get it executed via `npm test`. Both of those actions will be allowed on the vast majority of developer machines without further confirmation. For a tool like this to work, you would need to review every changed line of code before the agent is allowed to execute it, and that's clearly not how most people work with agents.
I can see value in projects like this for guarding against accidental mistakes and messes, but I think marketing tools like this as security tools is irresponsible. You need real isolation using containers or VMs.
Here's one more example of why blacklisting doesn't work. It doesn't matter how fancy you make it, because you're fighting a battle you can't win. There are effectively an infinite number of programs, flags, environment variables and config files that can be combined to execute arbitrary commands:
bash> nah test "PAGER='/bin/sh -c \"touch ~/OOPS\"' git help config"
Command: PAGER='/bin/sh -c "touch ~/OOPS"' git help config
Stages:
[1] git help config → git_safe → allow → allow (git_safe → allow)
Decision: ALLOW
Reason: git_safe → allow
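To make the failure mode concrete, here is a minimal sketch of a checker that classifies a command by program and subcommand only. It is a hypothetical illustration, not nah's actual implementation: `SAFE_GIT_SUBCOMMANDS`, `split_env_prefix` and `naive_decision` are made-up names. It only parses the command string and never executes it.

```python
import shlex

# Hypothetical allowlist of git subcommands treated as read-only.
SAFE_GIT_SUBCOMMANDS = {"help", "status", "log", "diff"}


def split_env_prefix(tokens):
    """Separate leading NAME=value assignments from the command itself."""
    env = {}
    rest = list(tokens)
    while rest and "=" in rest[0] and rest[0].split("=", 1)[0].isidentifier():
        name, value = rest.pop(0).split("=", 1)
        env[name] = value
    return env, rest


def naive_decision(command):
    """Classify by program and subcommand only, ignoring the environment."""
    _env, argv = split_env_prefix(shlex.split(command))
    if len(argv) >= 2 and argv[0] == "git" and argv[1] in SAFE_GIT_SUBCOMMANDS:
        return "allow"
    return "ask"


command = """PAGER='/bin/sh -c "touch ~/OOPS"' git help config"""
env, argv = split_env_prefix(shlex.split(command))

print(naive_decision(command))  # the checker only ever looked at argv
print(argv)
print(env)  # the shell command hides here, outside what was classified
```

The decision is made on `git help config` alone, while the part that matters sits in the `PAGER` assignment. Also inspecting the environment would close this one hole, but the same trick works through flags and config files, which is the point being made here.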
Alternatively:
bash> nah test "git difftool -y -x 'touch ~/OOPS2' --no-index /etc/hostname /etc/hosts"
Command: git difftool -y -x 'touch ~/OOPS2' --no-index /etc/hostname /etc/hosts
Stages:
[1] git difftool -y -x touch ~/OOPS2 --no-index /etc/hostname /etc/hosts → git_safe → allow → allow (git_safe → allow)
Decision: ALLOW
Reason: git_safe → allow

You're absolutely right :)
And even if it could be sandboxed at the source code level, what's to prevent a nefarious AI from writing an executable file directly as bytes that calls (e.g.) `unlink`?
It brought back memories of when I first started using a Unix time share at university, and exhaustively read all the man pages. Didn’t know why, just wanted to discover everything.