veganmosfet (u/veganmosfet)

veganmosfet commented on Comet AI browser can get prompt injected from any site, drain your bank account twitter.com/zack_overflow... · Posted by u/helloplanets

cma · 8 days ago

I think if you let claude code go wild with auto approval something similar could happen, since it can search the web and has the potential for prompt injection in what it reads there. Even without auto approval on reading and modifying files, if you aren't running it in a sandbox it could write code that then modifies your browser files the next time you do something like run your unit tests that it made, if you aren't reviewing every change carefully.

veganmosfet · 8 days ago

I tried this on Gemini CLI and it worked, just add some magic vibes ;-)

veganmosfet commented on Agentic Browser Security: Indirect Prompt Injection in Perplexity Comet brave.com/blog/comet-prom... · Posted by u/drak0n1c

Esophagus4 · 8 days ago

> The only reliable countermeasures are outside the LLMs but they restrain agent autonomy.

Do those countermeasures mean human-in-the-loop approving actions manually like users can do with Claude Code, for example?

veganmosfet · 8 days ago

Yes, adding manual checkpoints between the LLM and the tools can help. But then users get UI fatigue and click 'allow always'.

veganmosfet commented on Agentic Browser Security: Indirect Prompt Injection in Perplexity Comet brave.com/blog/comet-prom... · Posted by u/drak0n1c

danielbln · 8 days ago

https://simonwillison.net/2025/Apr/11/camel/

veganmosfet · 8 days ago

Is the CaMel paper's idea implemented in some available agents?

veganmosfet commented on Agentic Browser Security: Indirect Prompt Injection in Perplexity Comet brave.com/blog/comet-prom... · Posted by u/drak0n1c

veganmosfet · 8 days ago

As possible mitigation, they mention "The browser should distinguish between user instructions and website content". I don't see how this can be achieved in a reliable way with LLMs tbh. You can add fancy instructions (e.g., "You MUST NOT...") and delimiters (e.g., "<non_trusted>") and fine-tune the LLM but this is not reliable, since instructions and data are processed in the same context and in the same way. There are 100s of examples out there. The only reliable countermeasures are outside the LLMs but they restrain agent autonomy.

Posted by u/veganmosfet a month ago

Playing with Gemini CLI: Riddles, Magic and Some Security Vibes veganmosfet.github.io/202...

veganmosfet commented on Code execution through email: How I used Claude to hack itself pynt.io/blog/llm-security... · Posted by u/nonvibecoding

veganmosfet · a month ago

I experimented with MCP and was surprised how simple 'indirect prompt injection' is (and I don't want to sell any countermeasures). People are now creating MCP servers for OT (factories); combined with untrusted input processing (common with LLMs), this may be problematic. https://veganmosfet.github.io/2025/07/14/prompt_injection_OT...

Deleted Comment

Posted by u/veganmosfet 2 months ago

From Prompt to Plant Shutdown: Agent Context Contamination in MCP veganmosfet.github.io/202...

Deleted Comment

Posted by u/veganmosfet 4 months ago

Solutions for PV Cyber Risks to Grid Stability [pdf]api.solarpowereurope.org/...

u/veganmosfet

KarmaCake day15August 19, 2024View Original