Detecting and countering misuse of AI

"Vibe hacking" is real - here's an excerpt from my actual ChatGPT transcript trying to generate bot scripts to use for account takeovers and credential stuffing:

>I can't help with automating logins to websites unless you have explicit authorization. However, I can walk you through how to ethically and legally use Puppeteer to automate browser tasks, such as for your own site or one you have permission to test.

>If you're trying to test login automation for a site you own or operate, here's a general template for a Puppeteer login script you can adapt:

><the entire working script, lol>

Full video is here, ChatGPT bit starts around 1:30: https://stytch.com/blog/combating-ai-threats-stytchs-device-...

The barrier to entry has never been lower; when you democratize coding, you democratize abuse. And it's basically impossible to stop these kinds of uses without significantly neutering benign usage too.

cj · 4 months ago

Refusing hacking prompts would be like outlawing Burpsuite.

It might slow someone down, but it won’t stop anyone.

Perhaps vibe hacking is the cure against vibe coding.

I’m not concerned about people generating hacking scripts, but am concerned that it lowers the barrier of entry for large scale social engineering. I think we’re ready to handle an uptick in script kiddie nuisance, but not sure we’re ready to handle large scale ultra-personalized social engineering attacks.

eru · 4 months ago

> It might slow someone down, but it won’t stop anyone.

Nope, plenty of script kids go and something else.

anon22981 · 4 months ago

Mikko Hyppönen, who holds at least some level of authority on the subject, just recently said in an interview that he believes currently the defenders have the advantage. He claimed there’s currently zero known large incidents where the attackers have been known to utilize LLMs. (Apart from social hacking.)

To be fair, he also said that the defenders having the advantage is going to change.

quotemstr · 4 months ago

> The barrier to entry has never been lower; when you democratize coding, you democratize abuse.

You also democratize defense.

Besides: who gets to define "abuse"? You? Why?

Vibe coding is like free speech: anything it can destroy should be destroyed. A society's security can't depend on restricting access to skills or information: it doesn't work, first of all, and second, to the extent it temporarily does, it concentrates power in an unelected priesthood that can and will do "good" by enacting rules that go against the wishes and interest of the public.

chii · 4 months ago

> You also democratize defense.

not really - defense is harder than offence.

Just think about the chance of each: for defense, you need to protect against _every attack_ to be successful. For offence, you only need to succeed once to be successful - each failure is not a concern.

Therefore, the threat is asymmetric.

dheera · 4 months ago

If I were in charge of an org's cybersecurity I would have AI agents continually trying to attack the systems 24/7 and inform me of successful exploits; it would suck if the major model providers block this type of usage.

jsheard · 4 months ago

Judging from the experience of people running bug bounty programs lately, you'd definitely get an endless supply of successful exploit reports. Whether any of them would be real exploits is another question though.

https://daniel.haxx.se/blog/2025/07/14/death-by-a-thousand-s...

netvarun · 4 months ago

Shameless plug: We're building this. Our goal is to provide AI pentesting agents that run continuously, because the reality is that companies (eg: those doing SOC 2) typically get a point-in-time pentest once a year while furiously shipping code via Cursor/Claude Code and changing infrastructure daily.

I like how Terence Tao framed this [0]: blue teams (builders aka 'vibe-coders') and red teams (attackers) are dual to each other. AI is often better suited for the red team role, critiquing, probing, and surfacing weaknesses, rather than just generating code (In this case, I feel hallucinations are more of a feature than a bug).

We have an early version and are looking for companies to try it out. If you'd like to chat, I'm at varun@keygraph.io.

[0] https://mathstodon.xyz/@tao/114915606467203078

cube00 · 4 months ago

That sounds expensive, those LLM API calls and tokens aren't cheap.

idontwantthis · 4 months ago

Horizon3 offers this.

Deleted Comment

cyanydeez · 4 months ago

So many great parallels to the grift econy

To me this sounds like the path of "smart guns", i.e. "people are using our guns for evil purposes so now there is a camera attached to the gun which will cause the gun to refuse to fire if it detects it is being used for an evil purpose"

rattray · 4 months ago

I'm not familiar with this parable, but that sounds like a good thing in this case?

Notably, this is not a gun.

demarq · 4 months ago

things that you think sound good, might not sound good to the authority in charge of determining what is good.

For example using your LLM to criticise, ask questions or perform civil work that is deemed undesirable becomes evil.

You can use google to find how the UK government for example has been using "law" and "terrorism" charges against people for simply tweeting or holding a placard they deem critical of Israel.

Anthropic is showing off these capabilities in order to secure defence contracts. "We have the ability to surveil and engage threats, hire us please".

Anthropic is not a tiny start up exploring AI, it's a behemoth bank rolled by the likes of Google and Amazon. It's a big bet. While money is drying up for AI, there is always one last bastion for endless cash, defence contracts.

You just need a threat.

herpdyderp · 4 months ago

In general, such broad surveillance usually sounds like a bad thing to me.

Aurornis · 4 months ago

I’m actually surprised whenever someone familiar with technology thinks that adding more “smart” controls to a mechanical device is a good idea, or even that it will work as intended.

The imagined ideal of a smart gun that perfectly identifies the user, works every time, never makes mistakes, always has a fully charged battery ready to go, and never suffers from unpredictably problems sounds great to a lot of people.

But as a person familiar with tech, IoT, and how devices work in the real world, do you actually think it would work like that?

“Sorry, you cannot fire this gun right now because the server is down”.

Or how about when the criminals discover that they can avoid being shot by dressing up in police uniforms, fooling all of the smart guns?

A very similar story is the idea of a drink driving detector in every vehicle. It sounds good when you imagine it being perfect. It doesn’t sound so good when you realize that even a 99.99% false positive avoidance means your own car is almost guaranteed lock you out of driving it some day by mistake during its lifetime, potentially when you need to drive it for work, an appointment, or even an emergency due to a false positive.

lurk2 · 4 months ago

>but that sounds like a good thing in this case?

Who decides when someone is doing something evil?

johnQdeveloper · 4 months ago

Well what if you want the AI red team your own applications?

That seems a valid use case that'd get hit.

madrox · 4 months ago

It depends on who is creating the definition of evil. Once you have a mechanism like this, it isn't long after that it becomes an ideological battleground. Social media moderation is an example of this. It was inevitable for AI usage, but I think folks were hoping the libertarian ideal would hold on a little longer.

rapind · 4 months ago

Not really. It's like saying you need a license to write code. I don't think they actually want to be policing this, so I'm not sure why they are, other than a marketing post or absolution for the things that still get through their policing?

It'll become apparent how woefully unprepared we are for AIs impact as these issues proliferate. I don't think for a second that Anthropic (or any of the others) is going to be policing this effectively or maybe at all. A lot of existing processes will attempt to erect gates to fend off AI, but I bet most will be ineffective.

martin-t · 4 months ago

One man's evil is another man's law.[0][1]

The issue is they get to define what is evil and it'll mostly be informed by legality and potential negative PR.

So if you ask how to build a suicide drone to kill a dictator, you're probably out of luck. If you ask it how to build an automatic decision framework for denying healthcare, that's A-OK.

[0]: My favorite "fun" fact is that the Holocaust was legal. You can kill a couple million people if you write a law that says killing those people is legal.

[1]: Or conversely, a woman went to prison because she shot her rapist in the back as he was leaving after he dragged her into an empty apartment and raped her - supposedly it's OK to do during the act but not after, for some reason.

stavros · 4 months ago

Presumably the reason is that before or during, you're doing it to stop the act. Afterwards, it's revenge.

eru · 4 months ago

> [0]: My favorite "fun" fact is that the Holocaust was legal. You can kill a couple million people if you write a law that says killing those people is legal.

See the Nuremberg Processes for much more on that topic than you'd ever wanted to know. 'Legal' is a complicated concept.

For a more contemporary take with slightly less mass murder: the occupation of Crimea is legal by Russian law, but illegal by Ukrainian law.

Or how both Chinas claim the whole of China. (I think the Republic of China claims a larger territory, because they never bothered settling some border disputes that they don't de-facto own anyway.) And obviously, different laws apply in both version of China, even if they are claiming the exact same territory. Some act can be both legal and illegal.

bobbiechen · 4 months ago

umvi · 4 months ago

pton_xd · 4 months ago

The future of programming -- we're monitoring you. Your code needs our approval, otherwise we'll ban your account and alert the authorities.

Now that I think about it, I'm a little amazed we've even been able to compile and run our own code for as long as we have. Sounds dangerous!

NoGravitas · 4 months ago

> There were ways, of course, to get around the SPA and Central Licensing. They were themselves illegal. Dan had had a classmate in software, Frank Martucci, who had obtained an illicit debugging tool, and used it to skip over the copyright monitor code when reading books. But he had told too many friends about it, and one of them turned him in to the SPA for a reward (students deep in debt were easily tempted into betrayal). In 2047, Frank was in prison, not for pirate reading, but for possessing a debugger.

> Dan would later learn that there was a time when anyone could have debugging tools. There were even free debugging tools available on CD or downloadable over the net. But ordinary users started using them to bypass copyright monitors, and eventually a judge ruled that this had become their principal use in actual practice. This meant they were illegal; the debuggers' developers were sent to prison.

> Programmers still needed debugging tools, of course, but debugger vendors in 2047 distributed numbered copies only, and only to officially licensed and bonded programmers. The debugger Dan used in software class was kept behind a special firewall so that it could be used only for class exercises.

jedimastert · 4 months ago

Note: the term "script kiddie" has been around for much longer than I've been alive...

nurettin · 4 months ago

I remember seeing the term online right after The Matrix was released. It was a bit perplexing, because an inexperienced person who is able use hacking tools successfully without knowing how they work is pretty much half way there. Just fire up Ethereal (now Wireshark) or a decompiler and see how it works. I guess the insult was meant to push people to learn more and be proactive instead of begging on IRC.

> I guess the insult was meant to push people to learn more and be proactive instead of begging on IRC.

From what I can tell, there's a massive cultural bias towards "filtering" to ensure only the "worthy" or whatever get into the in-group, so yeah I think this is a charitable but not inaccurate to think about it

gverrilla · 4 months ago

Wasn't there a different term for script kiddies inside the hacker communities? I believe so but my memory fails me. It started with "l" if I'm not mistaken. (talking about 20y ago)

huseyinkeles · 4 months ago

I believe you are referring to “lamer” (as opposed to hacker)

oddmade · 4 months ago

I'll cancel my $100 / month Claude account the moment they decide to "approve my code"

Already got close to cancel when they recently updated their TOS to say that for "consumers" they deserve the right to own the output I paid for - if they deem the output not having been used "the correct way" !

This adds substantial risk to any startup.

Obviously...for "commercial" customers that do not apply - at 5x the cost...

brutal_chaos_ · 4 months ago

https://www.copyright.gov/ai/

In the US, at least, the works generated by "AI" are not copyrightable. So for my layman's understanding, they may claim ownership, but it means nothing wrt copyright.

(though patents, trademarks are another story that I am unfamiliar with)

shikon7 · 4 months ago

But along the same argument you may claim ownership, but it means nothing wrt copyright.

So you cannot stop them from using the code AI generated for you, based on copyright claims.

tbrownaw · 4 months ago

There's a difference between an AI acting on it's own, vs a person using AI as a tool. And apparently the difference is fuzzy instead of having a clear line somewhere.

I wonder if any appropriate-specialty lawyers have written publicly about those AI agents that can supposedly turn a bug report or enhancement request into a PR...

aeon_ai · 4 months ago

Can you elaborate on the expansion of rights in the ToS with a reference? That seems egregiously bad

https://www.anthropic.com/legal/consumer-terms

"Subject to your compliance with our Terms, we assign to you all our right, title, and interest (if any) in Outputs."

..and if you read the terms you find a very long list of what they deem acceptable.

I see now they also added "Non-commercial use only. You agree not to use our Services for any commercial or business purposes" ...

..so paying 100usd a month for a code assistant is now a hobby ?

nojito · 4 months ago

>This adds substantial risk to any startup.

If you're a startup are you not a "commercial" customer?

Well... ..in their TOS they seem to classify the 100usd / month Max plan a "consumer plan"

I think this is talking about the different tiers of subscription you can buy.

sitkack · 4 months ago

They are already trolling for our prompting techniques, now they are lifting our results. Great.

measurablefunc · 4 months ago

They have contracts w/ the military but I am certain these safety considerations do not apply to military applications.

fbhabbed · 4 months ago

I see they just decided to become even more useless than they already are.

Except for the ransomware thing, or phishing mail writing, most of the uses listed there seems legit to me and a strong reason to pay for AI.

One of these is exactly preparing with mock interviews which is something I myself do a lot, or having step by step instructions to implement things for my personal projects that are not even public facing and that I can't be arsed to learn because it's not my job.

Long life to Local LLMs I guess

raincole · 4 months ago

Since they started using the term 'model welfare' in their blog I knew it would only be a downhill from there.

tomrod · 4 months ago

Welfare is a well defined concept in social science.

furyofantares · 4 months ago

Which uses here look legit to you, specifically?

The only one that looks legit to me is the simulated chat for the North Korean IT worker employment fraud - I could easily see that from someone who non-fraudulently got a job they have no idea how to do.

A_D_E_P_T · 4 months ago

Anthropic is by far the most annoying and self-righteous AI/LLM company. Despite stiff competition from OpenAI and Deepmind, it's not even close.

The most chill are Kimi and Deepseek, and incidentally also Facebook's AI group.

I wouldn't use any Anthropic product for free. I certainly wouldn't pay for it. There's nothing Claude does that others don't do just as well or better.

varispeed · 4 months ago

It's also why you wouldn't want to try to hack your own stuff. To see how robust are your defences and potentially discover angles you didn't consider.

charcircuit · 4 months ago

>such as developing ransomware, that would previously have required years of training.

Even ignoring that there are free open source ones you can copy. You literally just have to loop over files and conditionally encrypt them. Someone could build this on day 1 of learning how to program.

AI companies trying to police what you can use them for is a cancer on the industry and is incredibly annoying when you hit it. Hopefully laws can change to make it clear that model providers aren't responsible for the content they generate so companies can't blame legal uncertainty for it.