Readit News logoReadit News
alexbecker commented on Ask HN: What Are You Working On? (Nov 2025)    · Posted by u/david927
alexbecker · a month ago
I'm working on _prompt injection_, the problem where LLMs can't reliably distinguish between the user's instructions and untrusted content like web search results.

Just published a blog post a few minutes ago: https://alexcbecker.net/blog/prompt-injection-benchmark.html

alexbecker commented on Comet AI browser can get prompt injected from any site, drain your bank account   twitter.com/zack_overflow... · Posted by u/helloplanets
chasd00 · 4 months ago
Can’t the connections and APIs that an LLM are given to answer queries be authenticated/authorized by the user entering the query? Then the LLM can’t do anything the asking user can’t do at least. Unless you have launch the icbm permissions yourself there’s no way to get the LLM to actually launch the icbm.
alexbecker · 4 months ago
Generally the threat model is that a trusted user is trying to get untrusted data into the system. E.g. you have an email monitor that reads your emails and takes certain actions for you, but that means it's exposed to all your emails which may trick the bot into doing things like forwarding password resets to a hacker.
alexbecker commented on Comet AI browser can get prompt injected from any site, drain your bank account   twitter.com/zack_overflow... · Posted by u/helloplanets
Terr_ · 4 months ago
The LLM is basically an iterative function going guess_next_text(entire_document). There is no algorithm-level distinction at all between "system prompt" or "user prompt" or user input... or even between its own prior output. Everything is concatenated into one big equally-untrustworthy stream.

I suspect a lot of techies operate with a subconscious good-faith assumption: "That can't be how X works, nobody would ever built it that way, that would be insecure and naive and error-prone, surely those bajillions of dollars went into a much better architecture."

Alas, when it comes to day's the AI craze, the answer is typically: "Nope, the situation really is that dumb."

__________

P.S.: I would also like to emphasize that even if we somehow color-coded or delineated all text based on origin, that's nowhere close to securing the system. An attacker doesn't need to type $EVIL themselves, they just need to trick the generator into mentioning $EVIL.

alexbecker · 4 months ago
There have been attempts like https://arxiv.org/pdf/2410.09102 to do this kind of color-coding but none of them work in a multi-turn context since as you note you can't trust the previous turn's output
alexbecker commented on Comet AI browser can get prompt injected from any site, drain your bank account   twitter.com/zack_overflow... · Posted by u/helloplanets
hoppp · 4 months ago
Maybe treat prompts like it was SQL strings, they need to be sanitized and preferably never exposed to external dynamic user input
alexbecker · 4 months ago
The problem is there is no real way to separate "data" and "instructions" in LLMs like there is for SQL
alexbecker commented on Comet AI browser can get prompt injected from any site, drain your bank account   twitter.com/zack_overflow... · Posted by u/helloplanets
alexbecker · 4 months ago
I doubt Comet was using any protections beyond some tuned instructions, but one thing I learned at USENIX Security a couple weeks ago is that nobody has any idea how to deal with prompt injection in a multi-turn/agentic setting.
alexbecker commented on Ask HN: What Are You Working On? (June 2025)    · Posted by u/david927
alexbecker · 6 months ago
Lately I've been trying to detect/mitigate prompt injection attacks. Wrote a blog post about why it's hard: https://alexcbecker.net/blog/prompt-injection.html
alexbecker commented on The necessity of Nussbaum   aeon.co/essays/why-readin... · Posted by u/rbanffy
alexbecker · 9 months ago
After reading Judith Butler for a class in college, reading "Professor of Parody" was such a breath of fresh air. Nussbaum is a clear thinker who doesn't take BS kindly.

u/alexbecker

KarmaCake day1107August 25, 2014
About
https://alexcbecker.net
View Original