That, and we also don't focus only on the textual description of a problem when we encounter one. We don't see the debugger output and go "how do I make this bad output go away?!?". Oh, I am getting an authentication error. Well, maybe I should just delete the token check for that code path...problem solved?!
No. Problem very much not-solved. In fact, problem very much very bigger big problem now, and [Grug][1] find himself reaching for club again.
Software engineers are able to step back, think about the whole thing, and determine the root cause of a problem. I am getting an auth error...ok, what happens when the token is verified...oh, look, the problem is not the authentication at all...in fact there is no error! The test was simply bad and tried to call a higher-privilege function as a lower-privilege user. So, the test needs to be fixed. And also, even though it isn't per se an error, the response for that function should maybe differentiate between "401 because you didn't authenticate" and "401 because your privileges are too low".
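To make that last point concrete, here is a minimal sketch of one way to separate the two cases. It assumes the common HTTP convention of 401 for "not authenticated" and 403 for "authenticated, but not allowed"; the token table, role ranking, and `call_privileged` function are all hypothetical names invented for illustration, not anything from a real codebase.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical token store and role ranking, invented for illustration.
TOKENS = {"tok-admin": "admin", "tok-user": "user"}
ROLE_RANK = {"user": 1, "admin": 2}


@dataclass
class Response:
    status: int
    reason: str


def call_privileged(token: Optional[str], required_role: str = "admin") -> Response:
    """Check authentication first, then authorization, and report them separately."""
    role = TOKENS.get(token) if token is not None else None
    if role is None:
        # No token, or a token we do not recognize: the caller is not authenticated.
        return Response(401, "not authenticated")
    if ROLE_RANK[role] < ROLE_RANK[required_role]:
        # The token is valid, but the caller's role is too low for this function.
        return Response(403, "insufficient privileges")
    return Response(200, "ok")
```

With a split like this, the bad test from the story would get "insufficient privileges" instead of a generic auth failure, which points at the test's choice of user and not at the token check.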
If this is how you think LLMs and Coding Agents are going about writing code, you haven't been using the right tools. These things happen, sure, but mostly they don't. Nobody is arguing that LLM-written code should be pushed directly into production, or that LLMs will solve every task.
LLMs are tools, and everyone eventually figures out a process that works best for them. For me, it was strong specs/docs, strict types, and lots of tests. And then, of course, reviews if it's serious work.
LLMs are really good at template tasks, writing tests, boilerplate etc. But most times I'm not doing "implement this button". I'm doing "there's a logic mismatch in my expectation".
There's a large variance in outcomes depending on the prompt and the process. I've gotten it to do things that are harder than a filescan with a skipped directory without too much trouble.
Add:
> LLMs are really good at template tasks, writing tests, boilerplate etc.
If I have to stretch the definition of boilerplate to what's at the edge of a modern LLM's comprehension, I would say that 50% of software is some sort of boilerplate.