So much dead and useless code generated by these tools... and tens of thousands of lines of worthless tests...
honestly I don't mind it that much... my changed-lines count is through the roof relative to my peers, and now that stack ranking is back...
If an AI generates a process more quickly than a human, and the process can be run deterministically, and the outputs are testable, then the process can run without direct human supervision after initial testing - which is how most automated processes work.
The testing should happen anyway, so any speed increase in process generation is a productivity gain.
Human monitoring only matters if the AI is continually improvising new solutions to dynamic problems and the solutions are significantly wrong/unreliable.
Which is a management/analysis problem, and no different in principle to managing a team.
The key difference in practice is that you can hire and fire people on a team, you can intervene to change goals and culture, and you can rearrange roles.
With an agentic workflow you can change the prompts, use different models, and redesign the flow. But your choices are more constrained.
LLMs are probabilistic by design, which means that, with the current technology, there can never be a truly deterministic agent.
Now obviously, humans aren't deterministic either, but their error bars are a lot tighter than an LLM's these days.
An easy example to point at is the coding agent that deleted someone's home directory, which was circulating around recently. I'm not saying a human has never done that, but it's far less likely, because it's so far outside the realm of normal operations.
So as of today, we need humans in the loop. And this is understood by the people making these products. That's why they have all these permission prompts for you to accept before commands run, and all of that.
And forget scripting languages: take a C program that writes a string to disk and reads it back.
How many times longer does it get the moment we have to ensure the string was actually committed to non-volatile NAND and actually read back? 5x? 10x?
Is it even doable if we have to support arbitrary consumer hardware?
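For a rough sense of the gap, here's a sketch of just the write half (POSIX-style; the function names and error-handling choices are mine, and even the "careful" version still skips EINTR handling and partial-write retries):

    #include <fcntl.h>
    #include <unistd.h>
    #include <cstdio>
    #include <cstring>

    // Naive version: a few lines, but nothing guarantees the bytes ever
    // reached stable storage before we report success.
    bool write_naive(const char* path, const char* s) {
        std::FILE* f = std::fopen(path, "w");
        if (!f) return false;
        std::fputs(s, f);
        return std::fclose(f) == 0;
    }

    // "Careful" version: check every return value, then fsync both the file
    // and its directory so the data and the directory entry survive a crash.
    bool write_durable(const char* dir, const char* path, const char* s) {
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) return false;
        ssize_t len = (ssize_t)std::strlen(s);
        if (write(fd, s, (size_t)len) != len) { close(fd); return false; }
        if (fsync(fd) != 0)                   { close(fd); return false; }
        if (close(fd) != 0) return false;
        int dfd = open(dir, O_RDONLY | O_DIRECTORY);
        if (dfd < 0) return false;
        if (fsync(dfd) != 0)                  { close(dfd); return false; }
        return close(dfd) == 0;
    }

And even then, a consumer drive with a volatile write cache can acknowledge the flush before the data actually hits NAND, which is exactly the arbitrary-consumer-hardware problem.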
First of all, I pick the hardware I support and the operating systems. I can make those things requirements when they are required.
But when you boil your argument down, it's that because one thing may introduce non-determinism, any degree of non-determinism is acceptable.
At that point we don't even need LLMs. We can just have the computer do random things.
It's just a rehash of infinite monkeys with infinite typewriters, which is ridiculous.
I think a lot of prompt engineering is voodoo, but it's not all baseless: a more formal way to look at it is aligning your task with the pre-training and post-training of the model.
The whole "it's a bad language" refrain feels half-baked when most of us use relatively high level languages on non-realtime OSes that obfuscate so much that they might as well be well worded prompts compared to how deterministic the underlying primitives they were built on are... at least until you zoom in too far.
I actually think it's great for giving non-programmers the ability to program to solve basic problems. That's really cool and it's pretty darn good at it.
I would dispute that you get SOTA results.
That has never been my personal experience. Given that we don't see a large increase in innovative companies spinning up now that this technology is a few years old, I doubt it's the experience of most users.
> The whole "it's a bad language" refrain feels half-baked when most of us use relatively high-level languages on non-realtime OSes that obfuscate so much that they might as well be well-worded prompts, compared to how deterministic the underlying primitives they were built on are... at least until you zoom in too far.
Obfuscation and abstraction are not the same thing. The other core difference is precision and determinism, both of which are lacking with LLMs.
This is the pattern I settled on about a year ago. I use it as a rubber duck / conversation partner for bigger-picture issues. I'll run my code through it as a sanity "pre-check" before a PR review. And I mapped autocomplete to ctrl-; in vim so I only bring it up when I need it.
Otherwise, I write everything myself. AI-written code never felt safe. It adds velocity, but velocity early on always steals speed from the future. That's been the case for languages, for frameworks, for libraries; it's no different for AI.
In other words, you get better at using AI for programming by recognizing where its strengths lie and going all in on those strengths. Don't twist yourself up in knots trying to get it to do decently what you can already do well yourself.
“Prompt engineering” just seems dumb as hell. It’s literally just an imprecise, nondeterministic programming language.
A couple of years ago, we all would have said that was a bad language and moved on.
That said, the water issue is overblown. Most of the water calculation comes from power generation (which uses a ton) and is non-potable water.
The potable water consumed is not zero, but it’s like 15% or something.
The big issue is power and the fact that most of it comes from fossil fuels.
Modern Java GCs typically offer a boost over more manual memory management. And on latency, even if virtual threads were very inefficient and you added a GC pause from Java's new GCs, you'd still be well below 1ms, i.e. not a dominant factor in a networked program.
(Yes, there's still one cause for potential lower throughput in Java, which is the lack of inlined objects in arrays, but that will be addressed soon, and isn't a big factor in most server applications anyway or related to IO)
BTW, writing a program in C++ has always been more or less as easy as writing it in Java/C#, etc. The big cost of C++ is in evolution and refactoring over many years, because in low-level languages local changes to code have a much more global impact. That has nothing to do with the design of the language; it's an essential property of tracking memory management at the code level (unless you use smart pointers, i.e. a refcounting GC for everything, but then things will be really slow, as refcounting sacrifices performance in its goal of minimising footprint).
Modern GCs can be pauseless, but either way you’re spending CPU on GC and not on servicing requests/customers.
As for C++, std::unique_ptr has no refcounting at all.
std::shared_ptr does, but that’s why you avoid it at all costs if you need to move things around. You only pay the cost when copying the shared_ptr itself, and you almost never need a shared_ptr; even when you do, you can always avoid copying in the hot path.
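A minimal sketch of the difference (illustrative names only, nothing from a real codebase):

    #include <memory>
    #include <utility>

    struct Request {};

    void sink(std::shared_ptr<Request> r) { (void)r; }

    int main() {
        // unique_ptr: single owner, no reference count anywhere.
        auto u = std::make_unique<Request>();
        std::unique_ptr<Request> v = std::move(u); // ownership moves, no atomics
        (void)v;

        // shared_ptr: the control block carries an atomic refcount,
        // but you only touch it when you copy.
        auto s = std::make_shared<Request>();
        sink(s);            // copy: atomic increment (and a decrement when the copy dies)
        sink(std::move(s)); // move: no refcount traffic at all
    }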
What's the base rate of humans rm -rf'ing their own work?
[0] https://blog.toolprint.ai/p/i-asked-claude-to-wipe-my-laptop
That's literally exactly the kind of non-determinism I'm talking about. If he'd just left the agent to its own devices, the exact same thing would have happened.
Now you may argue this highlights that people make catastrophic mistakes too, but I'm not sure I agree.
Or at least, they don't often make that kind of mistake. I'm not saying they don't make any catastrophic mistakes (they obviously do...).
We know people tend to click "accept" on these kinds of permission prompts with only a cursory read of what they're approving. And the more of these prompts you get, the more likely you are to just click "yes" or whatever to get through them.
If anything this kind of perfectly highlights some of the ironies referenced in the post itself.