Readit News logoReadit News
veganmosfet commented on Comet AI browser can get prompt injected from any site, drain your bank account   twitter.com/zack_overflow... · Posted by u/helloplanets
cma · 8 days ago
I think if you let claude code go wild with auto approval something similar could happen, since it can search the web and has the potential for prompt injection in what it reads there. Even without auto approval on reading and modifying files, if you aren't running it in a sandbox it could write code that then modifies your browser files the next time you do something like run your unit tests that it made, if you aren't reviewing every change carefully.
veganmosfet · 8 days ago
I tried this on Gemini CLI and it worked, just add some magic vibes ;-)
veganmosfet commented on Agentic Browser Security: Indirect Prompt Injection in Perplexity Comet   brave.com/blog/comet-prom... · Posted by u/drak0n1c
Esophagus4 · 8 days ago
> The only reliable countermeasures are outside the LLMs but they restrain agent autonomy.

Do those countermeasures mean human-in-the-loop approving actions manually like users can do with Claude Code, for example?

veganmosfet · 8 days ago
Yes, adding manual checkpoints between the LLM and the tools can help. But then users get UI fatigue and click 'allow always'.
veganmosfet commented on Agentic Browser Security: Indirect Prompt Injection in Perplexity Comet   brave.com/blog/comet-prom... · Posted by u/drak0n1c
danielbln · 8 days ago
veganmosfet · 8 days ago
Is the CaMel paper's idea implemented in some available agents?
veganmosfet commented on Agentic Browser Security: Indirect Prompt Injection in Perplexity Comet   brave.com/blog/comet-prom... · Posted by u/drak0n1c
veganmosfet · 8 days ago
As possible mitigation, they mention "The browser should distinguish between user instructions and website content". I don't see how this can be achieved in a reliable way with LLMs tbh. You can add fancy instructions (e.g., "You MUST NOT...") and delimiters (e.g., "<non_trusted>") and fine-tune the LLM but this is not reliable, since instructions and data are processed in the same context and in the same way. There are 100s of examples out there. The only reliable countermeasures are outside the LLMs but they restrain agent autonomy.
veganmosfet commented on Code execution through email: How I used Claude to hack itself   pynt.io/blog/llm-security... · Posted by u/nonvibecoding
veganmosfet · a month ago
I experimented with MCP and was surprised how simple 'indirect prompt injection' is (and I don't want to sell any countermeasures). People are now creating MCP servers for OT (factories); combined with untrusted input processing (common with LLMs), this may be problematic. https://veganmosfet.github.io/2025/07/14/prompt_injection_OT...

Deleted Comment

Deleted Comment

u/veganmosfet

KarmaCake day15August 19, 2024View Original