mrkeen · 5 months ago
This was a pre-existing problem, even if reliance on LLMs is making it worse.

Naur (https://gwern.net/doc/cs/algorithm/1985-naur.pdf) called it "theory building":

> The death of a program happens when the programmer team possessing its theory is dissolved. A dead program may continue to be used for execution in a computer and to produce useful results. The actual state of death becomes visible when demands for modifications of the program cannot be intelligently answered. Revival of a program is the rebuilding of its theory by a new programmer team.

Lamport calls it "programming ≠ coding", where programming is "what you want to achieve and how" and coding is telling the computer how to do it.

I strongly agree with all of this. Even if your dev team skipped any kind of theory-building or modelling phase, they'd still passively absorb some of the model while typing the code into the computer. I think that it's this last resort of incidental model building that the LLM replaces.

I suspect that there is a strong correlation between programmers who don't think that there needs to be a model/theory, and those who are reporting that LLMs are speeding them up.

bob1029 · 5 months ago
> I suspect that there is a strong correlation between programmers who don't think that there needs to be a model/theory, and those who are reporting that LLMs are speeding them up.

I have some anecdotal evidence that suggests that we can accomplish far more value-add on software projects when completely away from the computer and any related technology.

It's amazing how fast the code goes when you know exactly what you want. At this point the LLM can become very useful because its hallucinations instantly flag in your perspective. If you don't know what you want, I don't see how this works.

I really never understood the rationale of staring at the technological equivalent of a blank canvas for hours a day. The LLM might shake you loose and get you going in the right direction, but I find it much more likely to draw me into a wild goose chase.

The last 10/10 difficulty problem I solved probably happened in my kitchen while I was chopping some onions.

notpachet · 5 months ago
> It's amazing how fast the code goes when you know exactly what you want.

To quote Russ Ackoff[1]:

> Improving a system requires knowing what you could do if you could do whatever you wanted to. Because if you don't know what you would do if you could do whatever you wanted to, how on earth are you going to know what you can do under constraints?

[1] https://www.youtube.com/watch?v=OqEeIG8aPPk

hvb2 · 5 months ago
> The last 10/10 difficulty problem I solved probably happened in my kitchen while I was chopping some onions.

Let me guess, you had tears in your eyes when you found the solution?

rhetocj23 · 5 months ago
"I have some anecdotal evidence that suggests that we can accomplish far more value-add on software projects when completely away from the computer and any related technology.

It's amazing how fast the code goes when you know exactly what you want"

Yeah, it's the same reason why demand for pen and paper still exists. It's the absolute best way for one to think and get their thoughts out. I can personally attest to this - no digital whiteboard can ever compete with just pen and paper. My best and most original ideas come from a blank page and a pen.

Solutions can emerge from anywhere, but they're most likely to happen when the mind is focused in a calm state - that's why walking, for instance, is great.

lelanthran · 5 months ago
> The last 10/10 difficulty problem I solved probably happened in my kitchen while I was chopping some onions.

Were you the one who developed TOR?

cantor_S_drug · 5 months ago
Another aspect is that test cases constrain the domain of possible correct code by a lot (a randomly picked number will never solve a quadratic equation; by imposing many such quadratic equations [test cases] simultaneously, we put a lot of constraints on the solution space). Say I want an LLM to write a regex: by running it against test cases, I can gain confidence. This is Simon Willison's thesis. Once LLMs continuously learn what code "means" in a tight internal REPL loop, they will start to gain better understanding.
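A minimal sketch of that idea in Python (the regex and the test cases here are illustrative, not from any real LLM session):

```python
import re

# Hypothetical LLM-proposed regex for ISO-style dates.
candidate = re.compile(r"^\d{4}-\d{2}-\d{2}$")

# Each test case removes a slice of the solution space, the way each
# added quadratic equation constrains which numbers can be a root.
test_cases = [
    ("2024-01-31", True),
    ("2024-1-31", False),   # single-digit month should be rejected
    ("not a date", False),
]

# Confidence comes from the checks passing, not from trusting the LLM.
assert all(bool(candidate.match(s)) == ok for s, ok in test_cases)
```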
tom_m · 5 months ago
100% - but this has always been true. Some people have always dived into the code before understanding it. Now it's probably an even slipperier slope.

It's like you don't know how to ski and you're going down a really steep hill...now with AI, imagine that really steep hill is iced over.

cozzyd · 5 months ago
The risk there is I chop my fingers too
dleeftink · 5 months ago
The elusive onion flow state!
jacobr1 · 5 months ago
I think this is part of the reason why I've had a bit more success with AI coding than some of my colleagues. My pre-LLM workflow was to rapidly build a crappy version of something so that I could better understand it, then rework it (or even throw away the prototype) to build something I now knew how I wanted to handle. I've found that even as plenty of thought leaders talk about this general approach (rapid prototyping, continuous refactoring, etc.), many engineers are resistant and want to think through the approach and then build it "right." Or alternatively they whip something out and don't throw it away, but rather toil on fixes to their crappy first pass.

With AI this loop is much easier. It is cheap to build even 3 parallel implementations of something, and maybe another where you let the system add whatever capability it thinks would be interesting. You can compare them and use that to build a much stronger "theory of the program": the requirements, where the separations of concern are, how to integrate with the larger system. Then having the AI build that, with close review of the output (which takes much less time if you know roughly what should be built), works really well.

HarHarVeryFunny · 5 months ago
> My pre-llm workflow was to rapidly build a crappy version of something so that I could better understand it, then rework it (even throw away to the prototype)

That only works for certain type of simpler products (mostly one-man projects, things like web apps) - you're not going to be building a throw-away prototype, either by hand or using AI, of something more complex like your company's core operating systems, or an industrial control system.

RHSeeger · 5 months ago
> to rapidly build a crappy version of something so that I could better understand it, then rework it

I do this, too. And it makes me awful at generating "preliminary LOEs" (levels of effort), because I can't tell how long something will take until I get in there and experiment a little.

nyrikki · 5 months ago
A formalized form of this is the red-green-refactor pattern common in TDD.
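For anyone unfamiliar, the loop in miniature (a contrived Python sketch, not from any particular codebase):

```python
import re

# Green: the simplest implementation that makes the test pass.
def slugify(text: str) -> str:
    # Lowercase, collapse runs of non-alphanumerics into single hyphens.
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

# Red: this test is written first and fails until slugify() exists.
def test_slugify():
    assert slugify("Hello, World!") == "hello-world"

# Refactor: with the test green, restructure freely and rerun it.
test_slugify()
```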

Self created or formalized methods work, but they have to have habits or practices in place that prevent disengagement and complacency.

With LLMs there is the problem of humans and automation bias, which affects almost all human endeavors.

Unfortunately that will become more problematic as tools improve, so make sure to stay engaged and skeptical, which is the only successful strategy I have found with support from fields like human factors research.

NASA and the FAA are good sources for information if you want to develop your own.

beder · 5 months ago
I agree, and even more so, it's easy to see the (low!) cost of throwing away an implementation. I've had the AI coder produce something that works and technically meets the spec, but I don't like it for some reason and it's not really workable for me to massage it into a better style.

So I look up at the token usage, see that it cost 47 cents, and just `git reset --hard`, and try again with an improved prompt. If I had hand-written that code, it would have been much harder to do.

bluefirebrand · 5 months ago
> My pre-llm workflow was to rapidly build a crappy version of something so that I could better understand it, then rework it (even throw away to the prototype) to build something I now know how I want to handle.

In my experience this is a bad workflow. "Build it crappy and fast" is how you wind up with crappy code in production because your manager sees you have something working fast and thinks it is good enough

lioeters · 5 months ago
> theory building

That's an insightful way to connect the "comprehension debt" of LLM-generated code with the idea of programming as theory building.

I think this goes deeper than the activity of programming, and applies in general to the process of thinking and understanding.

LLM-generated content - writing and visual art also - is equivalent to the code; it's what people see on the surface as the end result. But unless a person is engaged in the production, to build the theory of what it means and how it works, to go through the details and bring it all into a whole, there is only superficial understanding.

Even when LLMs evolve to become more sophisticated, so that they can perform this "theory building" by themselves, what use is such artificial understanding without a human being in the loop? Well, it could be very useful and valuable, but eventually people may start losing the skill of understanding when it's more convenient to let the machine do the thinking.

tsunamifury · 5 months ago
What if the LLM can just understand the theory or read the code and derive it?
wiremine · 5 months ago
> I suspect that there is a strong correlation between programmers who don't think that there needs to be a model/theory, and those who are reporting that LLMs are speeding them up.

I also strongly agree with Lamport, but I'm curious why you don't think AI can help in the "theory building" process, both for the original team and for a team taking over a project - i.e., understanding a code base, the algorithms, etc. I agree this doesn't replace all the knowledge, but it can bridge a gap.

wholinator2 · 5 months ago
I agree; the LLM _vastly_ speeds up the process of "rebuilding the theory" of dead code, even faster than the person who wrote it 3 years ago can. I've had to work on old Fortran codebases before and recently had the pleasure of including AI in my method, and my god, it's so much easier! I can just copy and paste every relevant function into a single prompt, say "explain this to me," and it will not only comment each line with its details but also elucidate the deeper meaning behind the set of actions. It can tell exactly which kind of theoretical simulation the code is performing without any kind of prompting on my part, even when the functions are named things like "a" or "sv3d2". Then I can request derivations and explanations of all relevant theory to connect to the code and come away, in about one day's worth of work, with a pretty good idea of the complete execution of a couple thousand lines of detailed mathematical simulation in a language I'm no expert in. The LLM's contribution to building theory has actually been more useful to me than its contribution to writing code!
bunderbunder · 5 months ago
From what I've seen they're great at identifying trees and bad at mapping the forest.

In other words, they can help you identify what fairly isolated pieces of code are doing. That's helpful, but it's also the single easiest part of understanding legacy code. The real challenges are things like identifying and mapping out any instances of temporal coupling, understanding implicit business rules, and inferring undocumented contracts and invariants. And LLM coding assistants are still pretty shit at those tasks.
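Temporal coupling is a good example of why: the constraint lives between functions, not inside any one of them, so summarizing each function in isolation misses it (a contrived Python sketch):

```python
class Exporter:
    """connect() must run before export(), but no single function's
    body states the ordering rule; an assistant explaining either
    method in isolation can easily miss it."""

    def __init__(self):
        self._conn = None

    def connect(self):
        # Stand-in for acquiring a real connection.
        self._conn = object()

    def export(self, rows):
        # The hidden contract surfaces only at runtime.
        if self._conn is None:
            raise RuntimeError("export() called before connect()")
        return len(rows)
```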

prmph · 5 months ago
Indeed, I once worked with a developer on a contract team who was only concerned with runtime execution, no concern whatever for architecture or code clarity or whatever at all.

The client loved him, for obvious reasons, but it's hard to wrap my head around such an approach to software construction.

Another time, I almost took on a gig, but when I took one look at the code I was supposed to take over, I bailed. Probably a decade would still not be sufficient for untangling and cleaning up the code.

True vibe coding is the worst thing. It may be suitable for one-off shell scripts, utilities of < 100 lines, and such; anything more than that and you are simply asking for trouble.

N70Phone · 5 months ago
> I.e., understanding a code base, the algorithms, etc.?

The big problem is that LLMs do not *understand* the code you tell them to "explain". They just make probabilistic guesses about both function and design.

Even if "that's how humans do it too", this is only the first part of building an understanding of the code. You still need to verify the guess.

There are a few limitations to using LLMs for such first-guessing. In humans, the built-up understanding feeds back into the guessing: as you understand the codebase more, you can intuit function and design better. You start to know the patterns and conventions. The LLM will always guess from zero understanding, relying only on averaged-out training data.

A following effect is that which bunderbunder points out in their reply: while LLMs are good at identifying algorithms, mere pattern recognition, they are exceptionally bad at world-modelling the surrounding environment the program was written in and the high level goals it was meant to accomplish. Especially for any information obtained outside the code. A human can run a git-blame and ask what team the original author was on, an LLM cannot and will not.

This makes them less useful for the task, especially in any case where you intend to write new code. Sure, it's great that the LLM can give basic explanations about a programming language or framework you don't know, but if you're going to be writing code in it, you'd be better off taking the opportunity to learn it.

netghost · 5 months ago
Perhaps it's the difference between watching a video of someone cooking a meal and cooking it for yourself.
jquaint · 5 months ago
I agree with this sentiment. Perhaps this is why there is such a senior/junior divide in LLM use: seniors have already built their theories; juniors haven't developed that skill yet.
BobbyTables2 · 5 months ago
Fully agree.

I was once on a project where all the original developers suddenly disappeared and it was taken over by a new team. All institutional knowledge had been lost.

We spent a ridiculous amount of time trying to figure out the original design. We introduced quite a few bugs until it was better understood, but we also fixed a lot of design issues after much head-bashing.

By the end, it had been mostly rewritten and extended to do things not originally planned.

But the process was painful.

leptons · 5 months ago
I once took over a project that was built by someone in Mexico and all the function names and variables were kind of obscure Mexican slang words, and I don't know any Spanish. That was probably the most frustrating project I've ever worked on.
the_af · 5 months ago
> The death of a program happens when the programmer team possessing its theory is dissolved. A dead program may continue to be used for execution in a computer and to produce useful results. The actual state of death becomes visible when demands for modifications of the program cannot be intelligently answered. Revival of a program is the rebuilding of its theory by a new programmer team.

I really like this definition of "life" and "death" of programs, quite elegant!

I've noticed that I struggle the most when I'm not sure what the program is supposed to do; if I understand this, the details of how it does it become more tractable.

The worry is that LLMs make it easier to just write and modify code without truly "reviving" the program... And even worse, they can create programs that are born dead.

kossTKR · 5 months ago
While interesting, this is not the point of the article.

The point is that LLMs make this problem 1000 times worse, so it really is a ticking time bomb that's totally new. Most people, most programmers, most day-to-day work will not include some head-in-the-clouds abstract metaprogramming, but now LLMs both force programmers to "do more" and constantly destroy anyone's flow state, memory, and the 99% of talent and skill that comes from actually writing good code for hours a day.

LLMs are amazing, but they also totally suck, because they essentially steal learning potential and focus while increasing work pressure and complexity. And this really is new, because senior programmers are affected too, and you really will feel this at some point after using these systems for a while.

They make you kind of demented, and no, you can't fight this with personal development and forced book reading after getting up at 4 a.m. - just as with scrolling and the decrease in everyone's focus, bibliophiles included.

lxgr · 5 months ago
I'd actually argue that developers being actually sped up by LLMs (i.e. in terms of increasing their output of maintainable artifacts and not just lines of code) are those that have a good theory of the system they're working on.

At least at this point, LLMs are great at the "how", but are often missing context for the "what" and "why" (whether that's because it's often not written down or not as prevalent in their training data).

827a · 5 months ago
I've used the word "coherence" to describe this state: only when an individual or a team has adequately grokked the system and its historical context to achieve a level of productivity in maintenance and extension is the system coherent.

Additionally, and IMO critically to this discussion: it's easy for products or features to "die" not only when the engineers associated with them lose coherence on how they are implemented from a technical perspective, but also when the product people associated with them lose coherence on why they exist or who they exist for. The product can die even if one party (e.g. engineers) still maintains coherence while the other party (e.g. product/business) does not. At that point you've hit a state where the system cannot be maintained or worked on because everyone is too afraid of breaking an existing workflow.

LLMs are, like, barely 3% of the way toward solving the hardest problems I and my coworkers deal with day-to-day. But the bigger problem is that I don't yet know which 3% it is. Actually, the biggest problem is maybe that it's a different, dynamic 3% of every new problem.

w10-1 · 5 months ago
In my observation, this "coherence" is a matter not only of understanding, but of accepting, particularly certain trade-offs. Often this acceptance exists because people don't want to upset the person who insisted on them.

Once they're gone or no longer applying pressure, the strain is relieved, and we can shift to a more natural operation, application, or programming model.

For this reason, it helps to set expectations that people are cycled through teams at slow intervals - stable enough to build rapport, expertise, and goodwill, but transient enough to avoid stalls based on shared assumptions.

danmaz74 · 5 months ago
Having worked on quite a few legacy applications in my career, I would say that, as for so many other issues in programming, the most important solution to this issue is good modularization of your code. That allows a new team to understand the application at high level in terms of modules interacting with each other, and when you need to make some changes, you only need to understand the modules involved, and ideally one at a time. So you don't need to form a detailed theory of the whole application all at the same time.

What I'm finding with LLMs is that, if you follow good modularization principles and practices, then LLMs actually make it easier to start working on a codebase you don't know very well yet, because they can help you a lot in navigating, understanding "as much as you need", and do specific changes. But that's not something that LLMs do on their own, at least from my own experience - you still need a human to enforce good, consistent modularization.

FrustratedMonky · 5 months ago
Debt has always existed, and "LLMs is making it worse"

Yes, I think the point is, LLMs are making it a lot worse.

And then compounding that, in 10 years no senior devs will have been created, so nobody will be around to fix it. Extreme, of course - there will be devs, they'll just be underwater, piled on with trying to debug the LLM stuff.

pixl97 · 5 months ago
>they'll just be under-water, piled on with trying to debug the LLM stuff.

So in that theory, the senior devs of those days will still be able to command large salaries if they know their stuff - specifically, how to untangle the mess of LLM code.

rapind · 5 months ago
I could also argue that 20 years ago EJBs made it a lot worse, ORMs made it massively worse, heck, Rails made it worse, and don't even get me started on JavaScript frameworks, which are the epitome of dead programs and technical debt. I guarantee there were assembly programmers shouting about Visual Basic back in the day. These are all just abstractions, as is AI IMO, and some are worse than others.

If and when technical debt becomes a paralyzing problem, we'll come up with solutions. Probably agents with far better refactoring skills than we currently have (most are kind of bad at refactoring right now). What's crazy to me is how tolerant the consumer has become. We barely even blink when a program crashes. A successful AAA game these days is one that only crashes every couple hours.

I could show you a Java project from 20+ years ago and you'd have no idea wtf is going on, let alone why every object has 6 interfaces. Hey, why write SQL (a declarative, somewhat functional language, which you'd think would be in fashion today!), when you could instead write reams of Hibernate XML?! We've set the bar pretty low for AI slop.

tom_m · 5 months ago
I love how AI is surfacing problems that have been present all along. People are beginning to spend more time thinking about what's actually important when building a software product.

My hope is that people keep the dialogue going because you may be right about the feeling of LLMs speeding things up. It could likely be because people are not going through the proper processes including planning and review. That will create mountains of future work; bugs, tech debt, and simply learning. All of which still could benefit from AI tools of course. AI is a very helpful tool, but it does require responsibility.

jrochkind1 · 5 months ago
Yep. I think the industry discounted the need for domain-specific and code-specific knowledge, the value of having developers stick around, and of having developers spend time on "theory building" and sharing.

You can't just replace your whole coding team and think you can proceed at the same development pace. Even if the code is relatively good and the new developers relatively skilled. Especially if you lack "architecture model" level docs.

But yeah, LLMs push it to, like, an absurd level. What if all your coders were autistic savant toddlers who get changed out for a new team of toddlers every month?

loudmax · 5 months ago
LLMs can be used to vibe-code with limited or superficial understanding, but they can also be extremely helpful parsing and explaining code for a programmer who wants to understand what the program is doing. Well-managed forward thinking organizations will take advantage of the latter approach. The overwhelming majority of organizations will slide into the former without realizing it.

In the medium to longer term, we might be in a situation where only the most powerful next-generation AI models are able to make sense of giant vibe-coded balls of spaghetti and mud we're about to saddle ourselves with.

HPsquared · 5 months ago
Kind of like a dead (natural) language.
diob · 5 months ago
Funny enough I find LLMs useful for fixing the "death of a program" issue. I was consulting on a project built offshore where all the knowledge / context was gone, and it basically allowed me to have an AI version of the previous team that I could ask questions of.

I could ask questions about how things were done, have it theorize about why, etc.

Obviously it's not perfect, but that's fine, humans aren't perfect either.

pjc50 · 5 months ago
Ah, an undead programmer. Reminds me of "Dixie Flatline" from Neuromancer (1984), a simulation of a famous dead hacker trapped in a cartridge to help the protagonist.
BenoitP · 5 months ago
> "theory building"

Strongly agree with your comment. I wonder now if this "theory building" can have a grammar, and be expressed in code; be versioned, etc. Sort of like a 5th-generation language (the 4th-generation being the SQL-likes where you let the execution plan be chosen by the runtime).

The closest I can think of:

* UML

* Functional analysis (ie structured text about various stakeholders)

* Database schemas

* Diagrams

CaptainOfCoit · 5 months ago
Prolog/Datalog with some nice primitives for how to interact with the program in various ways? Would essentially be something like "acceptance tests" but expressed in some logic programming language.
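In lieu of real Datalog, the flavor can be sketched in plain Python — facts, one rule, and acceptance checks over the derived relation (all names and facts invented for illustration):

```python
# Facts about the system, in the spirit of Datalog ground atoms.
parent = {("alice", "bob"), ("bob", "carol")}

# Rule: grandparent(X, Z) :- parent(X, Y), parent(Y, Z).
def grandparent(x, z):
    return any((x, y) in parent and (y, z) in parent
               for (_, y) in parent)

# "Acceptance tests" expressed as required consequences of the rules:
# any conforming implementation must derive exactly these conclusions.
assert grandparent("alice", "carol")
assert not grandparent("bob", "alice")
```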
adw · 5 months ago
LLMs speed you up more if you have an appropriate theory in greenfield tasks (and if you do the work of writing your scaffold yourself).

Brownfield tasks are harder for the LLM at least in part because it’s harder to retroactively explain regular structure in a way the LLM understands and can serialize into eg CLAUDE.md.

zitterbewegung · 5 months ago
You could add a step to your workflow asking it to explain what it did. I've also asked it to check its own output, and it even fixes itself - sort of counterintuitive, but it makes more sense if you think of it as someone being asked to fix their own code.
ants_everywhere · 5 months ago
> programming is "what you want to achieve and how"

As in linear programming or dynamic programming.

> I suspect that there is a strong correlation between programmers who don't think that there needs to be a model/theory, and those who are reporting that LLMs are speeding them up.

This is an interesting prediction. I think you'll get a correlation regardless of the underlying cause because most programmers don't think there needs to be a model/theory and most programmers report LLMs speeding them up.

But if you control for that, there are also some reasons you might expect the opposite to be true. It could be that programmers who feel the least sped up by LLMs are the ones who feel their primary contribution is in writing code rather than having the correct model. And people who view their job as finding the right model are more sped up because the busywork of getting the code in the right order is taken off their plate.

DrNosferatu · 5 months ago
Then separate model from code, and leverage LLMs to that effect.
rusk · 5 months ago
> programmers who don't think that there needs to be a model/theory

Ah rationalism vs empiricism again

Kant up in heaven laughing his ass off

bodhi_mind · 5 months ago
At least an LLM will delete the code it replaces instead of commenting out every piece of old functionality.
matt_heimer · 5 months ago
LLMs have made it better for us. The quality of code committed by the junior developers has improved.
amelius · 5 months ago
> The actual state of death becomes visible when demands for modifications of the program cannot be intelligently answered.

Yeah but we can ask an LLM to read the code and write documentation, if that happens.

WJW · 5 months ago
Good documentation also contains the "why" of the code, i.e. why it is the way it is and not one of the other possible ways to write the same code. That is information inherently not present in the code, and there would be no way for an LLM to figure it out after the fact.

Also, no "small" program is ever at risk of dying in the sense that Naur describes it. Worst case, you can re-read the code. The problem lies with the giant enterprise code bases of the 60s and 70s where thousands of people have worked on it over the years. Even if you did have good documentation, it would be hundreds of pages and reading it might be more work than just reading the code.

Marazan · 5 months ago
I'm currently involved in a project where we are getting the LLM to do exactly that. As someone who _does_ have a working theory of the software (I was involved in designing and writing it), my current assessment is that the LLM-generated docs are pure line noise at the moment and have basically no value in imparting knowledge.

Hopefully we can iterate and get the system producing useful documents automagically, but my worry is that it will not generalise across different systems, and as a result we will have invested a huge amount of effort into creating "AI"-generated docs for our system that could have been better spent just having humans write the docs.

sfn42 · 5 months ago
It's insane to me how you people are so confident in the LLM's abilities. Have you not tried them? They fuck things up all the time. Basic things. You can't trust them to do anything right.

But sure let's just have it generate docs, that's gonna work great.

ljm · 5 months ago
The problem will always remain that it cannot answer 'why', only 'what'. And oftentimes you need things like intent and purpose and not just a lossy translation from programming instructions to prose.

I'd see it like transcribing a piece of music, where an LLM, or an uninformed human, would write down "this is a sequence of notes that follows a repetitive pattern across multiple distinct blocks; the first block has the lyrics X, Y...", but a human who knows the song would say "this is a pop song about Z; you might listen to it when you're feeling upset."

ModernMech · 5 months ago
It would be nice if LLMs could do that without being wrong about what the code does and doesn't do.
meindnoch · 5 months ago
Magical thinking.
mixedbit · 5 months ago
My experience is that LLM too often finds solutions that work, but are way more complex than necessary. It is easiest to recognize and remove such complexity when the code is originally created, because at this time the author should have the best understanding of the problem being solved, but this requires extra time and effort. Once the overly complex code is committed, it is much harder to recognize the complexity is not needed. Readers/maintainers of code usually assume that the existing code solves real world problem, they do not have enough context to recognize that much simpler solution could work as well.
jf22 · 5 months ago
It's easy to avoid overly complex solutions with LLMs.

First, your prompts should be direct enough that the LLM doesn't wander around producing complexity for no reason.

Second, you should add rules/learning/context to always solve problems in the simplest way possible.

Lastly, after generation, you can prompt the LLM to reduce the complexity of the solution.

justsocrateasin · 5 months ago
Okay how about this situation that one of my junior devs hit recently:

Coding in an obj oriented language in an enormous code base (big tech). Junior dev is making a new class and they start it off with LLM generation. LLM adds in three separate abstract classes to the inheritance structure, for a total of seven inherited classes. Each of these inherited classes ultimately comes with several required classes that are trivial to add but end up requiring another hundred lines of code, mostly boilerplate.

Tell me how you, without knowing the code base, get the LLM to not add these classes? Our language model is already trained on our code base, and it just so happens that these are the most common classes a new class tends to inherit. Junior dev doesn't know that the classes should only be used in specific instances.

Sure, you could go line by line and ask "what does this inherited class do, do I need it?" - and actually, the dev did that. It cut down the inherited classes from three to two, but missed two of them because it didn't understand, on the product side, why they weren't needed.

Fast forward a year: these abstract classes are still inherited, and no one knows why or how, because the comprehension was never there, but now we want to refactor the model.

trjordan · 5 months ago
LLMs absolutely produce reams of hard-to-debug code. It's a real problem.

But "Teams that care about quality will take the time to review and understand LLM-generated code" is already failing. Sounds nice to say, but you can't review code being generated faster than you can read it. You either become a bottleneck (defeats the point) or you rubber-stamp it (creates the debt). Pick your poison.

Everyone's trying to bolt review processes onto this. That's the wrong layer. That's how you'd coach a junior dev, who learns. AI doesn't learn. You'll be arguing about the same 7 issues forever.

These things are context-hungry but most people give them nothing. "Write a function that fixes my problem" doesn't work, surprise surprise.

We need different primitives. Not "read everything the LLM wrote very carefully," but ways to feed it the why: the motivation, the discussion, the prior art. Otherwise yeah, we're building a mountain of code nobody understands.

mattlondon · 5 months ago
We use the various instruction .md files for the agents and update them with common issues and pitfalls to avoid, as well as pointers to the coding standards doc.

Gemini and Claude at least seem to work well with it, but they sometimes still make mistakes (e.g. not using C++ auto is a recurrent thing, even though the context markdown file clearly states not to). I think this will get better as the models improve at instruction following.
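For illustration, a sketch of the kind of agent instructions file meant here (the specific rules and the standards-doc path are hypothetical, not from any real project):

```markdown
# Agent instructions

Read the coding standards doc before making changes: docs/cpp-style.md

Common pitfalls to avoid:
- Use `auto` for local variable declarations when the type is obvious
  from the initializer, per the C++ style guide.
- Do not introduce new third-party dependencies without asking first.
- Prefer extending an existing class over adding a new abstract base.
- Run clang-format on every file you touch before finishing.
```

The value is less in any single rule and more in accumulating the team's recurring corrections in one place the agent always sees.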

Not saying this is "the solution" but it gets some of the way.

I think we need to move away from "vibe coding", to more caring about the general structure and interaction of units of code ourselves, and leave the AI to just handle filling in the raw syntax and typing the characters for us. This is still a HUGE productivity uplift, but as an engineer you are still calling the shots on a function by function, unit by unit level of detail. Feels like a happy medium.

int_19h · 5 months ago
It does rather invite the question of whether the most popular programming languages today are conducive to "more caring about the general structure and interaction of units of code" in the first place. Intuitively it feels that something more like, say, Ada SPARK, with its explicit module interfaces and features like design-by-contract, would be better suited to this.

Same thing with syntax: so far we've been optimizing for humans, and humans work best at a certain level of terseness and context-dependent implicitness (when things get too verbose, they become visually difficult to parse), even at the cost of some ambiguity. But for LLMs verbosity may well be a good thing that keeps the model grounded, so perhaps features like type inference, even for locals, are a misfeature in this context. In fact, I wonder if we'd get better results if we forced the models to spell out the type of each expression in full, and maybe even outlawed things like method chains, requiring each call result to be bound to some variable (thus forcing the LLM to give it a name, effectively making a note on what it thinks it's doing).
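As a rough illustration of that verbose "grounding" style, using Python type hints as a stand-in for the hypothetical LLM-oriented language (the functions and the `Order` type are made up for the example):

```python
from dataclasses import dataclass

@dataclass
class Order:
    customer: str
    total_cents: int

# Terse, human-idiomatic style: one chained expression, types inferred.
def top_spender_terse(orders):
    return max(orders, key=lambda o: o.total_cents).customer

# Verbose "grounding" style: every intermediate result gets a spelled-out
# type and a name, forcing a note on what each step is supposed to mean.
def top_spender_verbose(orders: list[Order]) -> str:
    largest_order: Order = max(orders, key=lambda o: o.total_cents)
    largest_order_customer: str = largest_order.customer
    return largest_order_customer
```

Both behave identically; the second form is tedious for a human to type but costs an LLM nothing, and every named intermediate is a claim a reviewer can check.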

Literate programming also feels like it should fit in here somewhere...

So, basically, a language that would be optimized specifically for LLMs to write, and for humans to read and correct.

Going beyond the language itself, there's also a question of ecosystem stability. Things that work today should continue to work tomorrow. This includes not just the language, but all the popular libraries.

And what are we doing instead? We're having them write Python and JavaScript, of all things. One language famous for its extreme dynamism, with a poorly bolted on static type system; another also like that, but also notorious for its footguns and package churn.

trjordan · 5 months ago
100% agree. If you care about API design, data flow, and data storage schemas, you're already halfway there.

I think there's more juice to squeeze there. A lot of what we're going to learn is how to pick the right altitude of engagement with AI, I think.

sbene970 · 5 months ago
> even though the context markdown file clearly states not to

You might know this, but telling the LLM what to do instead of what not to do generally works better, or so I heard.

Herring · 5 months ago
> You … become a bottleneck (defeats the point)

It's better if the bottleneck is just reviewing, instead of both coding and reviewing, right?

We've developed plenty of tools for this (linting, fuzzing, testing, etc). I think what's going on is people who are bad at architecting entire projects and quickly reading/analyzing code are having to get much better at that and they're complaining. I personally enjoy that kind of work. They'll adapt, it's not that hard.

trjordan · 5 months ago
There are plenty of changes that don't require deep review, though. If you've written a script that's, say, a couple of fancy find/replaces, you probably don't need to review every usage. Check 10 of 500, make sure it passes lint/tests/typecheck, and it's likely fine.

The problem is that LLM-driven changes require this adversarial review on every line, because you don't know the intent. Human changes have a coherence to them that speeds up review.

(And if your company culture is line-by-line review of every PR, regardless of complexity ... congratulations, I think? But that's wildly out of the norm.)

wtetzner · 5 months ago
> It's better if the bottleneck is just reviewing, instead of both coding and reviewing, right?

Not really. There's something very "generic" about LLM-generated code that makes you just want to gloss over it, no matter how hard you try not to.

acedTrex · 5 months ago
The bottleneck has never been coding lol
ModernMech · 5 months ago
Yes, "just take the time to review and understand LLM-generated code" is the new "just don't write bad code and you won't have any bugs". As an industry, we all know from years of writing bugs despite not wanting to that this task is impossible at scale. Just reviewing all the AI code to make sure it is good code likewise does not scale in the same way. Will not work, and it will take 5-10 years for the industry to figure it out.
shinecantbeseen · 5 months ago
I've had some (anecdotal) success reframing how I think about my prompts and the context I give the LLM. Once I started thinking of it as reducing the probability space of the output through priming via context + prompting, I feel like my intuition for it has built up. It also becomes a good way to inject the "theory" of the program in a reusable way.

It still takes a lot of thought and effort up front to put that together, and I'm not quite sure where the break-even line between easier-to-do-it-myself and hand-off-to-LLM is.

solatic · 5 months ago
> We need different primitives

The correct primitives are the tests. Ensure your model is writing tests as you go, and make sure you review the tests, which should be pretty readable. Don't merge until both old and new tests pass. Invest in your test infrastructure so that your test suite doesn't get too slow, as it will be in the hot path of your model checking future work.
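A hypothetical example of what a readable, reviewable test looks like in practice: plain asserts with descriptive names, no fixtures or mocks hiding the behavior being pinned down (the `apply_discount` function is invented for illustration):

```python
# The unit under test: its intended contract should be legible from the
# tests alone, even if you never read the implementation.
def apply_discount(total_cents: int, percent: int) -> int:
    """Return the total after an integer percentage discount, never negative."""
    discounted = total_cents - (total_cents * percent) // 100
    return max(discounted, 0)

def test_discount_reduces_total():
    assert apply_discount(1000, 10) == 900

def test_full_discount_is_free():
    assert apply_discount(1000, 100) == 0

def test_discount_never_goes_negative():
    assert apply_discount(1000, 150) == 0
```

Reviewing those three test names already tells you what the model was asked to guarantee; that is the artifact worth scrutinizing.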

Legacy code is that which lacks tests. Still true in the LLM age.

raincole · 5 months ago
> You either become a bottleneck (defeats the point)

How...?

When I find code snippets on StackOverflow, I read them before pasting them into my IDE. I'm the bottleneck. Therefore there is no point in using StackOverflow...?

alexpotato · 5 months ago
Was listening to the Dwarkesh Patel podcast recently and the guest (Agustin Lebron) [0] mentioned the book "A Deepness In The Sky" by Vernor Vinge [1].

I started reading it and a key plot point is that there is a computer system that is thousands of years old. One of the main characters has "cold sleeped" for so long that he's the only one who knows some of the hidden backdoors. That legacy knowledge is then used to great effect.

Highly recommend it for a great fictional use of institutional knowledge on a legacy codebase (and a great story overall).

0 - https://www.youtube.com/watch?v=3BBNG0TlVwM

1 - https://amzn.to/42Fki8n

mock-possum · 5 months ago
His description of learning to be a programmer in that far future era was fun too, iirc there was just so much ‘legacy code’, like practically infinite libraries and packages to perform practically any function - that ‘coding’ was mostly a matter of finding the right pieces and wiring them together. Knowing the nuances of these existing pieces and the subtlety of their interpretation was the skill.
alexpotato · 5 months ago
100%

Another great example:

In Fire Upon the Deep, due to the delay in communications between star systems, everyone use a descendant of Usenet.

AlwaysRock · 5 months ago
RIP Vernor Vinge. Somehow, his ideas seem more and more relevant.
alexpotato · 5 months ago
Especially since he coined the term "technological singularity"

https://en.wikipedia.org/wiki/Technological_singularity

blackhaj7 · 5 months ago
Sounds great - thanks for the recommendation.

Looks like it is the second in a trilogy. Can you just dive in or did you read the first book before?

int_19h · 5 months ago
The first two books can be treated largely as standalone works. They do technically take place in the same broad universe, but said universe is basically divided into FTL and non-FTL zones with vastly different societies in each (for obvious reasons), and the non-FTL societies aren't even aware of this boundary. "Fire upon the Deep" is set mostly in the FTL zone, with the boundary itself being a major plot point. "Deepness in the Sky" is set entirely in the non-FTL zone, and the lack of FTL is a major plot point there.

Chronologically, DitS takes place before FotD. But there is exactly one character in common between the two books, and while he is a major character in both, none of the events of DitS are relevant to the story in FotD (which makes sense since FotD was written first).

So it's really largely a matter of preference as to which one to read first. I would say that FotD has more action and, for the lack of better term, "weirdness" in the setting; while DitS is more slow-paced, with more character development and generally more fleshed-out characters, and explores its themes deeper. But both books have plenty for your mind to chew on.

All in all I think FotD is an easier read, and DitS is a more rewarding one, but this is all very subjective.

One upside to the books being decoupled as much as they are is that whichever one you start with, you get a complete story, so even if you're a completionist you can disregard the other book if you don't like the first one.

octoberfranklin · 5 months ago
You can start with the second, but the first book is better at grabbing the attention of a new reader with wild ideas (broadband audio hive-minds, variable speed-of-light). If you make it through the first chapter you won't be able to put it down.

The second book is just as good, but doesn't try as hard to get you addicted early on. The assumption is that you already know how good Vinge's work is.

I recommend starting with Fire Upon the Deep.

duskwuff · 5 months ago
A Fire upon the Deep and A Deepness in the Sky are loosely connected; you can read them in either order. Both novels reveal some details which explain bits of the other.

However, I would recommend skipping Children of the Sky. It's not as good, and was clearly intended as the first installment of a series which Vinge was unable to complete. :(

alexpotato · 5 months ago
I read Fire Upon the Deep first and liked both books.

General recommendation is to read them in order (Fire first, Deepness second) but I don't really think it matters.

donatj · 5 months ago
A friend was recently telling me about an LLM'd PR he was reviewing, submitted by a largely non-technical manager. From the outside the feature appeared to work entirely, but on actually investigating the thousands of lines of generated code, it turned out to be hacking their response-cache system to appear to work, without actually updating anything on the backend.

It took a ton of effort on his part to convince his manager that this wasn't ready to be merged.

I wonder how much vibe coded software is out there in the wild that just appears to work?

rAum · 5 months ago
lol you should absolutely merge it and go along with it in such cases; just collect evidence first so you have enough deniability, and enjoy the show. You can tell a child not to do the thing over and over, or just accept that it will learn very quickly, for life, that touching a hot oven is not a smart thing to do. With so much AI-hype-induced brainrot, for certain individuals the only antidote is to make them feel the direct consequences of their false beliefs. Without a feedback loop, no learning occurs at all.

The more dangerous thing is that such idiot managers can judge you through their lens of shipping LLM garbage they never applied in reality to see the consequences, living in a fantasy due to lack of technical knowledge. Of course it leads directly to firing people and ballooning expectations on the leftover team, who are trapped into burning out and being replaced as trash, because that makes total sense in their worldview and "evidence".

ModernMech · 5 months ago
That's only an option when it's not you who will have to clean up the mess.
iamleppert · 5 months ago
Have you tried Lovable or seen any of their marketing? They are innovating a new category of software that is passable in all the ways a typical user can examine, but none of the ways of traditional software.

And why should they? Most people will pay them, churn out whatever code, and it will likely never be deployed or used by anyone (this is true of most code created by a real engineer too). By the time the user has figured out that what they have "created" isn't real, Lovable is on to the next mark/user.

OutOfHere · 5 months ago
I would report the manager to the CTO or CEO or business owners/investors.
int_19h · 5 months ago
You mean, the very people who keep doubling down on investments into AI combined with layoffs? You'd go and tell them that this thing that they signed off on, pitched to others, and thus are ultimately responsible for if it fails in a way that cannot be denied or covered up, is not working.
ebiester · 5 months ago
Where are these non-technical engineering managers and how did they stay in the business?

I haven't seen a truly non-technical manager in over 15 years.

low_tech_punk · 5 months ago
Most programmers don't understand the low level assembly or machine code. High level language becomes the layer where human comprehension and collaboration happens.

LLM is pushing that layer towards natural language and spec-driven development. The only *big* difference is that high level programming languages are still deterministic but natural language is not.

I'm guessing we've reached an irreducible point where the amount of information needed to specify the behavior of a program is nearly optimally represented in programming languages, after decades of evolution. More abstraction into the natural-language realm makes it lossy, and less abstraction down to low-level code makes it verbose.

adamddev1 · 5 months ago
The difference is not just a jump to a higher abstraction with natural language. It's something fundamentally different.

The previous tools (assemblers, compilers, frameworks) were built on hard-coded logic that can be checked and even mathematically verified. So you could trust what you're standing on. But with LLMs we jump off the safely-built tower into a world of uncertainty, guesses, and hallucinations.

mym1990 · 5 months ago
If LLMs still produce code that is eventually compiled down to a very low level...that would mean it can be checked and verified, the process just has additional steps.

JavaScript has a ton of behavior that is uncertain at times, and I'm sure many JS developers would agree that trusting what you're standing on can be difficult. There is also a large percentage of developers who don't mathematically verify their code, so the verification is moot in those cases; hence bugs.

The current world of LLM code generation lacks the verification you are looking for; however, I am guessing that these tools will soon emerge in the market. For now, building as incrementally as possible and having good tests seems a decent path forward.

austin-cheney · 5 months ago
> Most programmers don't understand the low level assembly or machine code.

Most programmers that write JavaScript for a living don't really understand how to scale applications in JavaScript, which includes data structures in JavaScript. There is a very real dependence on layers of abstractions to enable features that can scale. They don't understand the primary API to the browser, the DOM, at all and many don't understand the Node API outside the browser.

For an outside observer it really begs the Office Space question: what would you say you do here? It's weird trying to explain it to people completely outside software. For the rest of us in software, we are so used to this that we take the insanity for granted as an inescapable reality.

Ironically, at least in terms of your comment, when you confront JavaScript developers about this lack of fundamental knowledge, comparisons to assembly frequently come up. As though writing JavaScript directly were somehow equivalent to writing machine code; but for many people in that line of work, both are equally distant realities.

The introduction of LLMs makes complete sense. When nobody knows how any of this code works, there isn't much harm in letting a machine write it for you, because there is no difference in the underlying awareness.

rmunn · 5 months ago
> Most programmers that write JavaScript for a living don't really understand how to scale applications in JavaScript, which includes data structures in JavaScript. There is a very real dependence on layers of abstractions to enable features that can scale.

Although I'm sure you are correct, I would also want to mention that most programmers that write JavaScript for a living aren't working for Meta or Alphabet or other companies that need to scale to billions, or even millions, of users. Most people writing JavaScript code are, realistically, going to have fewer than ten thousand users for their apps. Either because those apps are for internal use at their company (such as my current project, where at most the app is going to be used by 200-250 people, so although I do understand data structures I'm allowing myself to do O(N^2) business logic if it simplifies the code, because at most I need to handle 5-6 requests per minute), or else because their apps are never going to take off and get the millions of hits that they're hoping for.
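To make that trade-off concrete, a hypothetical sketch: matching up teammates among 250 users with a nested O(N^2) scan versus an O(N) grouping pass. Both produce identical results, and at this scale both finish effectively instantly, so the simpler-to-read version is a legitimate choice:

```python
users = [{"id": i, "team": i % 20} for i in range(250)]

# O(N^2): for each user, scan every user to find teammates. Simple to
# read and obviously correct, which matters more than speed at N=250.
def teammates_quadratic(users):
    return {u["id"]: [v["id"] for v in users
                      if v["team"] == u["team"] and v["id"] != u["id"]]
            for u in users}

# O(N): group users by team first, then look each team up once.
def teammates_linear(users):
    by_team = {}
    for u in users:
        by_team.setdefault(u["team"], []).append(u["id"])
    return {u["id"]: [i for i in by_team[u["team"]] if i != u["id"]]
            for u in users}
```

The linear version can be swapped in later if profiling ever says so; until then the quadratic one is one obvious loop instead of two passes and an index.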

If you don't need to scale, optimizing for programmer convenience is actually a good bet early on, as it tends to reduce the number of bugs. Scaling can be done later. Now, I don't mean that you should never even consider scaling: design your architecture so that it doesn't completely prevent you from scaling later on, for example. But thinking about scale should be done second. Fix bugs first, scale once you know you need to. Because a lot of the time, You Ain't Gonna Need It.

foo42 · 5 months ago
A side effect of the non-deterministic behaviour is that, unlike previous increases in abstraction, the high level prompts are not checked in to the code base and available to recreate their low level output on demand. Instead we commit the lower level output (ie code) and future revisions must operate on this output without the ability to modify the original high level instructions.
the_duke · 5 months ago
I feel like natural language specs can play a role, but there should be an intermediate description layer with strict semantics.

Case in point: I'm seeing much more success with LLM-driven coding in Rust, because the strong type system prevents many invalid states that can occur in more loosely typed or untyped languages.

It takes longer, and often the LLM has to iterate through `cargo check` cycles to get to a state that compiles, but once it does the changes are very often correct.

The Rust community has the saying "if it compiles, it probably works". You can still have plenty of logic bugs, of course, but the domain of possible mistakes is smaller.

What would be ideal is a very strict (logical) definition of application semantics that LLMs have to implement, and that ideally can be checked against the implementation. As in: a very strict programming language with dependent types, littered with pre/post conditions, etc.

LLMs can still help to transform natural language descriptions into a formal specification, but that specification should be what drives the implementation.
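Python obviously isn't that language, but as a rough sketch of the pre/postcondition idea (a hypothetical `contract` decorator; a real system would make these statically checkable rather than runtime asserts):

```python
import functools

def contract(pre=None, post=None):
    """Attach machine-checked pre/postconditions to a function (sketch)."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if pre is not None:
                assert pre(*args, **kwargs), f"precondition of {fn.__name__} failed"
            result = fn(*args, **kwargs)
            if post is not None:
                assert post(result, *args, **kwargs), f"postcondition of {fn.__name__} failed"
            return result
        return wrapper
    return decorate

# The contract is the reviewable spec; the body is what the LLM fills in.
@contract(pre=lambda xs: len(xs) > 0,
          post=lambda r, xs: r in xs and all(r <= x for x in xs))
def smallest(xs):
    return min(xs)
```

Under this division of labor the human reviews the contracts, and any implementation the model produces is checked against them on every call.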

redsymbol · 5 months ago
There is another big difference: natural languages have ambiguity baked in. If a programming language has any ambiguity in how it can be parsed, that is rightly considered a major bug. But it's almost a feature of natural languages, allowing poetry, innuendo, and other nuanced forms of communication.
int_19h · 5 months ago
There are constructed languages that preserve the expressivity of natural human languages but without the implicit ambiguity, though; most notably, Loglan and its successor Lojban. If you read Golden Age sci-fi, Loglan sometimes shows up there specifically in this role - e.g. "Moon is a Harsh Mistress":

> By then Mike had voder-vocoder circuits supplementing his read-outs, print-outs, and decision-action boxes, and could understand not only classic programming but also Loglan and English, and could accept other languages and was doing technical translating—and reading endlessly. But in giving him instructions was safer to use Loglan. If you spoke English, results might be whimsical; multi-valued nature of English gave option circuits too much leeway.

For those unfamiliar with it, it's not that Lojban is perfectly unambiguous. It's that its design strives to ensure that ambiguity is always deliberate by making it explicit.

The obvious problem with all this is that Lojban is a very niche language with a fairly small corpus, so training AI on it is a challenge (although it's interesting to note that existing SOTA models can read and write it even so, better than many obscure human languages). However, Lojban has the nice property of being fully machine parseable - it has a PEG grammar. And, once you parse it, you can use dictionaries to construct a semantic tree of any Lojban snippet.

When it comes to LLMs, this property can be used in two ways. First, you can use structured output driven by the grammar to constrain the model to output only syntactically valid Lojban at any point. Second, you can parse the fully constructed text once it has been generated, add semantic annotations, and feed the tree back into the model to have it double-check that what it ended up writing means exactly what it wanted to mean.

With SOTA models, in fact, you don't even need the structured output - you can just give them parser as a tool and have them iterate. I did that with Claude and had it produce Lojban translations that, while not perfect, were very good. So I think that it might be possible, in principle, to generate Lojban training data out of other languages, and I can't help but wonder what would happen if you trained a model primarily on that; I suspect it would reduce hallucinations and generally improve metrics, but this is just a gut feel. Unfortunately this is a hypothesis that requires a lot of $$$ to properly test...

low_tech_punk · 5 months ago
I had a similar thought, feature not bug.

The nature of programming might have to shift to embrace the material properties of LLMs. It could become a more interpretative, social, and discovery-based activity. Maybe that's what "vibe coding" would eventually become.

archy_ · 5 months ago
C has a lot of ambiguity in how it is interpreted ("undefined behavior"), but people usually view that as a benefit because it allows compilers more freedom in the implementation.
lxgr · 5 months ago
> The only big difference is that high level programming languages are still deterministic but natural language is not.

Arguably, determinism isn't everything in programming: It's very possible to have perfectly deterministic, yet highly surprising (in terms of actual vs. implied semantics to a human reader) code.

In other words, the axis "high/low level of abstraction" is orthogonal to the "deterministic/probabilistic" one.

raincole · 5 months ago
Yes, but determinism is still very important in this case. It means you only need to memorize the surprising behavior once (like literally every senior programmer has memorized their programming language's quirks, even if they didn't want to).

Without determinism, learning becomes less rewarding.

tossandthrow · 5 months ago
A program with ambiguities will not work; a spec with ambiguities is, on the other hand, incredibly common.

Specs are not more abstract but more ambiguous, which is not the same thing.

drdrek · 5 months ago
Somehow many very smart AI entrepreneurs do not understand the concept of limits to lossless data compression. If an idea cannot be reduced further without losing information, no amount of AI is going to be able to compress it.

This is why you see so many failed startups around Slack/email/Jira efficiency. Half the time you don't know whether you missed critical information, so you need to go to the source anyway, negating whatever gains you got from the information that was successfully summarized.

dorkrawk · 5 months ago
Downloading music off the internet is just the next logical step after taping songs off the radio. Cassette tapes didn't really affect the music industry, so I wouldn't worry about this whole Napster thing.
wkirby · 5 months ago
I see this as the next great wave of work for me and my team. We sustained our business for a good 5–8 years on rescuing legacy code from offshore teams as small-to-medium-sized companies re-shored their contract devs. We're currently in a demand lull as these same companies have started relying heavily on LLMs to "write" "code" --- but as long as we survive the next 18 months, I see a large opportunity as these businesses start to feel the weight of the tech debt they accrued by trusting Claude when it says "your code is now production ready."
meander_water · 5 months ago
I've done my share of vibe coding, and I completely agree with OP.

You just don't build up the necessary mental model of what the code does when vibing, and so although you saved time generating the code, you lose all that anyway when you hit a tricky bug and have to spend time building up the mental model to figure out what's wrong.

And saying "oh just do all the planning up front" just doesn't work in the real world where requirements change every minute.

And if you ever see anyone using "accepted lines" as a metric for developer productivity/hours saved, take it with a grain of salt.

miguelacevedo · 5 months ago
Agree, Peter Naur famously said programming is theory building. Code you do not understand can be considered dead code.
CaptainOfCoit · 5 months ago
> I've done my share of vibe coding

Why? It was almost meant in jest; no one seriously believes you don't need to review code. You end up in spaghetti land so quickly that I can't believe anyone tried "vibe coding" for more than a couple of hours without giving up on something so obviously infeasible.

Now, reviewing whatever the LLM gives back, carefully massaging it into the right shape, then moving on definitely helps my programming a lot, but careful review is needed to make sure the LLM had the right context and the result is actually correct. At that point we're in "pair programming" territory rather than blindly accepting whatever the LLM hands you, AKA "vibe coding".

meander_water · 5 months ago
Vibe coding has its place. I've mainly used it to create personalised ui's for tasks very specific to me. I don't write tests, I may throw it away next week but it's served its purpose at least once. Is this grossly inefficient? Probably.