jmull · 2 months ago
> Over the past three months... [we] have been building something really cool

The claim is that a fast-moving, high-performing team has become a 10x fast-moving, high-performing team. Over three months, that's equivalent to about 2.5 years of development across a team (3 months × 10 = 30 months).

Shall we expect the tangible results soon?

I'm perfectly willing to accept that AI coding will make us all a lot more productive, but I need to see the results.

dgemm · 2 months ago
> AI coding will make us all

I'm willing to believe it will make high-judgement autonomous people more productive, I'm less sure it will scale to everyone. The author is one of the senior-most technical staff at AWS.

ghm2180 · 2 months ago
And at that rate we should see a FAANGMULA-like company launched in a tenth, or at least a fifth, of the usual time. Right?
oblio · 2 months ago
Programming was rarely the barrier to building these types of companies.

I know software people don't want to accept that, but it's almost always something on the business or administrative/management side of things.

Even for the programming bits, if your initial programmers suck (for some reason) but you have money, a great management team would just replace them with better programmers and fix the code mess with their help. So even that isn't a programming problem, it's a management problem.

And let's look at Twitter, which had atrocious code early on (fail whale galore), yet managed to build a profitable business on amazing product-market fit, despite management incompetence.

Companies just need to pass a code quality bar which is much, much, much lower than the bar programmers set.

nguyendat · 2 months ago
Could you share a bit more detail about your team's experimental results? E.g. was it a new or legacy codebase, and how does your team measure/control quality? Thanks
ang_cire · 2 months ago
As a security researcher, I am both salivating at the potential that the proliferation of TDD and other AI-centric "development" brings for me, and scared for IT at the same time.

Before we just had code that devs don't know how to build securely.

Now we'll have code that the devs don't even know what it's doing internally.

Someone found a critical RCE in your code? Good luck learning your own codebase starting now!

"Oh, but we'll just ask AI to write it again, and the code will (maybe) be different enough that the exact same vuln won't work anymore!" <- some person who is going to be updating their resume soon.

I'm going to repurpose the term, and start calling AI-coding "de-dev".

nosianu · 2 months ago
> Now we'll have code that the devs don't even know what it's doing internally.

I think that has already been true for some time for large projects that are continuously updated over many years, with lots of developers entering and leaving the project along the way, because nobody who has a choice wants to do that demoralizing job for long. (I was one of them in the 1990s; the job was later given to an Indian H1B who could not easily switch to something better, not before putting in a few years of torture to build a better resume, and possibly get a green card.)

The most famous post on this here, but I would like to see what e.g. Microsoft's devs would have to say, or Adobe's:

https://news.ycombinator.com/item?id=18442941

Such code has long been held together by the extensive test suites rather than intimate knowledge of how it all works.

The task of the individual developer is to close bug tickets and add features, not to produce an optimal solution, or even to refactor. They long ago gave up on that as taking too long.

Cthulhu_ · 2 months ago
That's the reality of software development at scale: pretty soon no individual knows how everything works, and you need high-level architecture overviews on the one hand, and strict procedures, standards, tools, test suites etc. on the other to make sure things keep working.

But the reality is that most of us will never work on anything that big. I think the biggest thing I've worked on was in the 500K LOC range, tops.

deaux · 2 months ago
First time seeing that post, oh my, I suggest everyone read it. And this is what half the world runs on.
Humorist2290 · 2 months ago
In my opinion, AI coding is basically gambling. The odds of getting a usable output are way better than piping from /dev/urandom, but ultimately it's still probabilistic whether what you want is in fact what you get. Pay for some tokens, pull the slots, and hopefully your RCE goes away.
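
To make the slot-machine framing concrete, here's a toy sketch in Python (the completions and weights are made up; a real model samples from a distribution over tens of thousands of tokens at every step):

  import random

  # Toy model of generation-as-gambling: the same "prompt" yields
  # different completions, weighted by how plausible the model finds them.
  completions = ["patches the RCE", "half-patches the RCE", "adds a new RCE"]
  weights = [0.7, 0.2, 0.1]  # hypothetical probabilities

  print(random.choices(completions, weights=weights, k=1)[0])  # pull the lever

Run it a few times and you get different answers from the same inputs, which is the whole point.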
baq · 2 months ago
Replace 'AI' with 'intern' for literally the same result.
phito · 2 months ago
> Now we'll have code that the devs don't even know what it's doing internally.

Haha, that already happens in almost any project after 2-3 years.

halfcat · 2 months ago
> that already happens in almost any project after 2-3 years.

Now with AI you’ll be able to not understand your code in only 2-3 days.

The next release will reduce the time to confusion to 2-3 hours.

Imagine a future where you’ll be able to generate a million lines of code per second, and not understand any of it.

zeroq · 2 months ago
Just a few days ago I spoke with a security guy who was telling me how frustrating it is to validate AI code.

The problem is marketing.

The cycling industry is akin to audiophiles and will swear on its life that a $15,000 bicycle is the pinnacle of human engineering, and that this year's bike goes 11% faster than the previous model. But if you read the last 10 years of marketing materials and do the math, it should basically ride itself by now.
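
A back-of-the-envelope sketch of that compounding, assuming the 11% figure applied year over year:

  # Compound a decade of "this year's bike is 11% faster"
  yearly_gain = 1.11
  total = yearly_gain ** 10
  print(f"claimed cumulative speedup: {total:.2f}x")  # ~2.84x

Nearly 3x faster over ten years, which no race result anywhere reflects.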

There's so much money in AI right now that you can't really expect anyone to say "well, we had hopes, but it doesn't really work the way we expected". Instead you have pitch after pitch, masses parroting CEOs, and everyone wants to get a seat on the hype train.

It's easy to debunk the audiophiles or the carbon enthusiasts, but it's not so easy with AI, because no one really knows how it works. OpenAI released a paper in which they stated, sorry for paraphrasing, "we did this, we did that, and we don't know why the results were different".

andai · 2 months ago
>Now we'll have code that the devs don't even know what it's doing internally.

I am working on a legacy project. This is already the case!

skywhopper · 2 months ago
Is that a reason to start every project in the same state?
Zanfa · 2 months ago
IMO the biggest issue with AI code is that writing code is the easiest part of software development. Reviewing code is so much more difficult than writing it, even more so if you're not already intimately familiar with it in the first place.

It's like with AI images, where they look plausible at first, but then you start noticing all the little things that are off in the sidelines.

Dilettante_ · 2 months ago

  writing code is the easiest part of software development. Reviewing code is so much more difficult than writing it
A lot of people say this, and I do not doubt that it is fully true in their real experience. But it is not necessarily the only way for things to be.

If more time and effort were put into writing code which is easier to review, the difficulty of writing it would increase and the difficulty of reading it would decrease, flipping that equation. The incentives just aren't like that. It doesn't pay to maximize readability against time spent writing: Not every line will have to be reviewed, and not every line that has to be reviewed will be so complex that readability needs to be perfect to be maintainable.

Zanfa · 2 months ago
It's not the code itself that makes review difficult. Even the best written code can be difficult to review. The complexity of effective code review arises from the fact that you need to understand the domain to evaluate correctness of both the code itself and the tests covering it.
oblio · 2 months ago
The problem with AI is that those wrong incentives are taken to 10,000x.

And regarding "not every line will have to be reviewed, and not every line that has to be reviewed will be so complex that readability needs to be perfect to be maintainable.", the problem with AI is that code becomes basically unknowable.

Which is fine if everything being built is slop, but many things aren't slop. Stuff that touches money, healthcare, personal relationships, etc., you know, the things that matter in life, risks all turning into slop, which *will* have real-life consequences.

We'll start seeing this in a few years.

Animats · 2 months ago
> Instead, we use an approach where a human and AI agent collaborate to produce the code changes. For our team, every commit has an engineer's name attached to it, and that engineer ultimately needs to review and stand behind the code. We use steering rules to setup constraints for how the AI agent should operate within our codebase,

This sounds a lot like Tesla's Fake Self Driving. It self drives right up to the crash, then the user is blamed.

groby_b · 2 months ago
Except here it's made abundantly clear, up front, who has responsibility. There's no pretense that it's fully self driving. And the engineer has the power to modify every bit of that decision.

Part of being a mature engineer is knowing when to use which tools, and accepting responsibility for your decisions.

It's not that different from collaborating with a junior engineer. This one can just churn out a lot more code, and has occasional flashes of brilliance, and occasional flashes of inanity.

Animats · 2 months ago
> Except here it's made abundantly clear, up front, who has responsibility.

By the people who are disclaiming it, yes.

happyPersonR · 2 months ago
Idk, it's hard to say that when it's called "Full Self Driving" and the CEO says as much.
zeroq · 2 months ago
When Karpathy wrote Software 2.0 I was super excited.

I naively believed that we'd start building black boxes from requirements and sets of inputs and outputs, and that the sudden changes of heart from stakeholders, which for many of us happen on a daily basis and mandate an almost complete reimagining of the project architecture, would simply require another pass of training with new parameters.

Instead, the mainstream is pushing hard for a reality where we mass-produce a ton of code until it starts to work within guard rails.

  Does it really work? Is it maintainable?
  Get out of here. We're moving at 200mph.

rob_c · 2 months ago
How tf else did you honestly expect black-boxes to get built, by self-mangling machine code spit out by a sentient AI god?

Karpathy is bullish on everything bleeding edge, and unfortunately it kinda shows when you know the material better than he does (source: I've been lecturing on all of it for a few years now). I'm not saying this is bad. It's great to see people who are engaging and bullish; it's better than most futurists waving their hands and going "something, something warp drive".

But when you take a step back and really ask what is going on behind the scenes, all we have is massive statistical tools performing neat tricks to predict patterns. There's no greater understanding or ability to learn or mimic. YET. The transformer, for instance, can't easily learn complex mathematical operations. There's a Google paper on "learning" multiplication, and I know people working on building networks to "learn" sin/cos from scratch. Given these basic limitations, and pretty much every single paper out of Apple "intelligence" crapping on the buzz, we've pretty much hit a limit, beyond being the first company to allow for multi-trillion-token parsing (or basic, limited token-parsing memory) for companies to capture and retrieve information.
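
A toy illustration of that extrapolation point (my own sketch in Python/NumPy, not from the Google paper): fit sin(x) with a tiny network on [-pi, pi] and it interpolates fine, but outside the training range it falls apart, because it has fit the pattern, not the function:

  import numpy as np

  # Tiny MLP trained on sin(x) over [-pi, pi] with plain gradient descent.
  rng = np.random.default_rng(0)
  X = rng.uniform(-np.pi, np.pi, (256, 1))
  y = np.sin(X)
  W1, b1 = rng.normal(0, 1, (1, 32)), np.zeros(32)
  W2, b2 = rng.normal(0, 0.1, (32, 1)), np.zeros(1)
  for _ in range(5000):
      h = np.tanh(X @ W1 + b1)                   # hidden layer
      grad = 2 * ((h @ W2 + b2) - y) / len(X)    # dMSE/dprediction
      gW2, gb2 = h.T @ grad, grad.sum(0)
      gh = (grad @ W2.T) * (1 - h ** 2)          # backprop through tanh
      gW1, gb1 = X.T @ gh, gh.sum(0)
      for p, g in ((W1, gW1), (b1, gb1), (W2, gW2), (b2, gb2)):
          p -= 0.1 * g                           # gradient step
  for x in (1.0, 3.0, 6.0, 12.0):                # 6.0 and 12.0 lie outside the training range
      h = np.tanh(np.array([[x]]) @ W1 + b1)
      print(x, (h @ W2 + b2).item(), np.sin(x))

Inside [-pi, pi] the predictions track sin(x); outside, the tanh units saturate and the output flatlines instead of oscillating.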

swiftcoder · 2 months ago
> How tf else did you honestly expect black-boxes to get built, by self-mangling machine code spit out by a sentient AI god?

I'm not quite sure why everyone seems to want the AIs to be writing TypeScript - that's a language designed for human capabilities, with all the associated downsides.

Why not Prolog? APL? Something with richer primitives and tighter guardrails that is intrinsically hard for humans to wrangle with.

kaonwarb · 2 months ago
> unfortunately it kinda shows when you know the material better than he does. (source, I've been lecturing on all of it for a few years now)

That source is bearing a lot of weight.

adammarples · 2 months ago
Do you really think he knows that little? I mean, fair enough, you've been lecturing on it, but he was lecturing a decade ago, at Stanford. Then he took a little break to, you know, run AI at Tesla...
brazukadev · 2 months ago
But here's the critical part: the quality of what you are creating is way lower than you think, just like AI-written blog posts.
collingreen · 2 months ago
Upvoted for a dig that is also an accurate and insightful metaphor.
skinnymuch · 2 months ago
Interesting enough to me, though I only skimmed.

I switched back to Rails for my side project a month ago, and AI coding has been great for not-too-complex work, while the old NextJS codebase was in shambles.

Before, I was still doing a good chunk of the NextJS coding. From here on out, I'm probably going to be directly coding less than 10% of the codebase. I'm now spending time trying to automate things as much as possible, make my workflow better, and see what can be coded without me in the loop. The stuff I'm talking about is basic CRUD and scraping/crawling.

For serious coding, I'd think coding yourself and having AI as your pair programmer is still the way to go.

highfrequency · 2 months ago
> When your throughput increases by an order of magnitude, you're not just writing more code - you're making more decisions.

> These aren't just implementation details - they're architectural choices that ripple through the codebase.

> The gains are real - our team's 10x throughput increase isn't theoretical, it's measurable.

Enjoyed the article and the points it brought up. I do find it uncanny that this article about the merits and challenges of AI coding was likely written by ChatGPT.