I don't understand the productivity that people get out of these AI tools. I've tried it and I just can't get anything remotely worthwhile unless it's something very simple or something completely new being built from the ground up.
Like sure, I can ask claude to give me the barebones of a web service that does some simple task. Or a webpage with some information on it.
But any time I've tried to get AI services to help with bugfixing/feature development on a large, complex, potentially multi-language codebase, it's useless.
And those tasks are the ones that actually take up the majority of my time. On the occasion that I'm spinning a new thing up quickly, I don't really need an AI to do it for me -- I mean, that's the easy part!
Is there something I'm missing? Am I just not using it right? I keep seeing people talk about how addictive it is, how the productivity boost is insane, how all their code is now written by AI and then audited, and I just don't see how that's possible outside of really simple rote programming.
The first and most important question to ask here is: are you using a coding agent? A lot of times, people who aren't getting much out of LLM-assisted coding are just asking Claude or GPT for code snippets, and pasting and building them themselves (or, equivalently, they're using LLM-augmented autocomplete in their editor).
Almost everybody doing serious work with LLMs is using an agent, which means that the LLM is authoring files, linting them, compiling them, and iterating when it spots problems.
There's more to using LLMs well than this, but this is the high-order bit.
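A rough sketch of what "agent" means here, with a hypothetical ask_llm() standing in for whatever model or API you actually use (that part is an assumption, not a real API). The point is the loop: the model writes the file, the tools check it, and the errors go back into the prompt.

    # Minimal agent loop sketch: write file -> compile -> feed errors back -> retry.
    import subprocess

    def ask_llm(prompt: str) -> str:
        raise NotImplementedError  # hypothetical: call your model of choice here

    def agent_edit(path: str, task: str, max_rounds: int = 5) -> bool:
        feedback = ""
        for _ in range(max_rounds):
            code = ask_llm(f"Task: {task}\nFile: {path}\n{feedback}\nReturn the full file.")
            with open(path, "w") as f:
                f.write(code)
            # Cheap sanity check; a real agent would also run linters and tests.
            check = subprocess.run(["python", "-m", "py_compile", path],
                                   capture_output=True, text=True)
            if check.returncode == 0:
                return True
            feedback = f"The previous attempt failed to compile:\n{check.stderr}"
        return False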
Funny, I would give the absolute opposite advice. In my experience, the use of agents (mainly Cursor) is a sure-fire way to have a really painful experience with LLM-assisted coding. I much prefer to use AI as a pair programmer: one I talk to, and sometimes let write entire files, but I'm always the one doing the driving, and mostly the one writing the code.
If you aren't building up mental models of the problem as you go, you end up in a situation where the LLM gets stuck at the edges of its capability, and you have no idea even how to help it overcome the hurdle. Then you spend hours backtracking through what it's done, building up the mental model you need, before you can move on. The process is slower and more frustrating than not using AI in the first place.
I guess the reality is, your luck with AI-assisted coding really comes down to the problem you're working on, and how much of it is prior art the LLM has seen in training.
Yesterday I gave Cursor a try and made my first (intentionally very lazy) attempt at vibe coding (a simple three.js project). It accepted the task and did things, failed, did things, failed, did things ... failed for good.
I guess I could work on the magic incantations to tweak here and there a bit until it works and I guess that's the way it is done. But I wasn't hooked.
I do get value out of LLMs for isolated, broken-down subtasks, where asking an LLM is quicker than googling.
For me, AI will probably become really useful once it can scan and integrate my own complex codebase, so it gives me solutions that work there rather than hallucinating API endpoints or jumping between incompatible library versions (my main issue).
I did almost the same thing and had pretty much the same experience. A lot of times it felt so close to being great, but it ultimately wasted more time than if I had just worked on the project and occasionally asked ChatGPT to generate some sample code to figure out an API.
I've had the same issue every time I've tried it. The code I generally work on is embedded C/C++ with in-house libraries, where the tools are less than useful as they try to generate non-existent interfaces and generally generate worse code than I'd write by hand. There's a need for correctness and for being able to explain the code, so use of those tools is also detrimental to explainability unless I hand-hold them to the point where I'm writing all of the code myself.
Generating function documentation hasn't been that useful either, as the doc comments generated offer no insight, and often the amount I'd have to write to get it to produce anything of value is more effort than just writing the doc comments myself.
For my personal project in Zig they either get lost completely or give me terrible code (my code isn't _that_ bad!). There seems to be no middle ground here. I've even tried the tools as pair programmers, but they often get lost or stuck in loops, repeating the same thing that's already been mentioned (likely falling out of the context window).
When it comes to others using such tools, I've had to ask them to stop using them to think, as it becomes next to impossible to teach / mentor if they're passing what I say to the LLM or trying to have it perform the work. I'm confident in debugging people when it comes to math / programming, but with an LLM in between it's just not possible to guess where they went wrong or how to bring them back to the right path, as the thought process is lost (or there wasn't one to begin with).
This is not even "vibe coding"; I've just never found it generally useful enough to use day-to-day for any task, and my primary use of, say, Phind has been as an alternative to Qwant when I cannot game the search query well enough to get the search results I'm looking for (i.e. I ignore the LLM output and just look at the references).
> I've had the same issue every time I've tried it. The code I generally work on is embedded C/C++ with in-house libraries, where the tools are less than useful as they try to generate non-existent interfaces and generally generate worse code than I'd write by hand.
That's because whatever training the model had, it didn't cover anything remotely similar to the codebase you work on.
We get this issue even with obscure FLOSS libraries.
When we fail to provide context to LLMs, they generate examples by following superficial cues like coding conventions. In extreme cases, such as code that employs source code generators or templates, LLMs even fill in function bodies that the code generators are designed to generate for you. That's because, when LLMs are oblivious to the context, they hallucinate their way into something seemingly coherent. Unless you provide them with context or instruct them not to make things up, they will bullshit their way into an example.
What's truly impressive about this is that oftentimes the hallucinated code actually works.
> Generating function documentation hasn't been that useful either as the doc comments generated offer no insight and often the amount I'd have to write to get it to produce anything of value is more effort than just writing the doc comments myself.
Again, this suggests a failure on your side to provide any context.
If you give them enough context, LLMs synthesize and present it almost instantly. If you're prompting an LLM to generate documentation, which boils down to synthesizing what an implementation does and what its purpose is, and the LLM comes up empty, that means you failed to give it anything to work on.
The bulk of your comment screams failure to provide any context. If your code strays far from what the model expects, fails to follow any discernible structure, and doesn't convey purpose and meaning even in little things like naming conventions, you're not giving the LLM anything to work on.
> Is there something I'm missing? Am I just not using it right?
The talk about it makes more sense when you remember most developers are primarily writing CRUD webapps or adware, which is essentially a solved problem already.
They aren't increasing productivity, at least not in the short term.
They are very handy tools that can help you learn a foreign codebase faster. They can help you when you run into those annoying blockers that usually take hours, or days, or a second set of eyes to figure out. They give you a sounding board and help you ask questions and think about the code more.
Big IF here. IF you bother to read. The danger is that some people just keep clicking and re-prompting until something works, but they have zero clue what it is and how it works. This is going to be the biggest problem with AI code editors. People just let Jesus take the wheel, and in the process, inefficient usage of the tools leads to slower throughput and a higher bill. AI costs a good chunk of change per token, and that's only going up.
I do think it's addictive, for sure. I also think the "productivity boost" is a feeling people get, but no one measures it. I mean, it's hard to measure. Then again, if you spend an hour on a problem you'd otherwise be stuck on for 3 days, then sure, it helped productivity. In that particular scenario. Averaged out? Who knows.
They are useful tools, they are just also very misunderstood and many people are too lazy to take the time to understand them. They read headlines and unsubstantiated claims and get overwhelmed by hype and FOMO. So here we are. Another tech bubble. A super bubble really. It's not that the tools won't be with us for a long time or that they aren't useful. It's that they are way way overvalued right now.
I appreciate you voicing your feelings here. My previous employer requested we try AI tooling for productivity purposes, and I was finding myself in similar scenarios to what you mention. The parts that would have benefitted from a productivity gain weren’t seeing any improvement, while the areas that saw a speedup weren’t terribly mission-critical.
The one thing I really appreciated, though, was the AI’s ability to do a “fuzzy” search in occasional moments of need. For example, sometimes the colloquial term for a feature didn’t match naming conventions in source code. The AI could find associations in commit messages and review information to save me time rummaging through git-blame. Like I said though, that sort of problem wasn’t necessarily a bottleneck and could often be solved much more cheaply by asking a coworker on Slack.
Probably 80% of the time I spend coding, I'm inside a code file I haven't read in the last month. If I need to spend more than 30 seconds reading a section of code before I understand it, I'll ask AI to explain it to me. Usually, it does a good job of explaining code at a level of complexity that would take me 1-15 minutes to understand, but does a poor job of answering more complex questions or at understanding more complex code.
It's a moderately useful tool for me. I suspect the people that get the most use out of it are those that would take more than an hour to read code I would take 10 minutes to read. Which is to say, the least experienced people get the most value.
I find it's incredibly helpful for prototyping. These tools quickly reach a limit of complexity and put out subpar code, but for a green-field prototype that's OK.
I've successfully been able to test out new libraries and do explorations quickly with AI coding tools and I can then take those working examples and fix them up manually to bring them up to my coding standards. I can also extend the lifespan of coding tools by doing cleanup cycles where I manually clean up the code since they work better with cleaner encapsulation, and you can use them to work on one scoped component at a time.
I've found that they're great to test out ideas and learn more quickly, but my goal is to better understand the technologies I'm prototyping myself, I'm not trying to get it to output production quality code.
I do think there's a future where LLMs can operate in a well architected production codebase with proper type safe compilation, linting, testing, encapsulation, code review, etc, but with a very tight leash because without oversight and quality control and correction it'll quickly degrade your codebase.
Incredibly useful for 'glue code' or internal apps for automating really annoying processes - the kind where the time it would normally take to develop those tools would add up and take away from the core work.
For instance, dealing with files that don't quite work correctly between two 3D applications because of slightly different implementations. Ask for a Python script to patch the files so that they work correctly – done almost instantly just by describing the problem.
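To give a concrete flavour of the kind of throwaway script I mean, here's a purely illustrative sketch (the actual format and fix are assumptions): say one app exports OBJ files Y-up and the other expects Z-up, so every vertex line needs its Y and Z swapped.

    # Illustrative only: swap Y and Z on OBJ vertex ("v") lines.
    # (A real conversion may also need to negate an axis to preserve handedness.)
    import sys

    def patch_obj(in_path, out_path):
        with open(in_path) as src, open(out_path, "w") as dst:
            for line in src:
                parts = line.split()
                if len(parts) == 4 and parts[0] == "v":
                    x, y, z = parts[1:4]
                    dst.write(f"v {x} {z} {y}\n")  # swapped axes
                else:
                    dst.write(line)  # leave normals, faces, comments untouched

    if __name__ == "__main__":
        patch_obj(sys.argv[1], sys.argv[2])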
Also for prototyping. Before you spend a month crafting a beautiful codebase, just get something standing up so you can evaluate whether it's worth spending time on – like, does the idea have legs?
90% of programming problems get solved with a rubber ducky – and this is another valuable area. Even if the AI isn't correct, oftentimes just talking it through with an LLM will get you to see what the solution is.
I’ve had good experiences using it, but with the caveat that only Gemini Pro 2.5 has been at all useful, and only for “spot” tasks.
I typically use it to whip up a CLI tool or script to do something that would have been too fiddly otherwise.
While sitting in a Teams meeting I got it to use the Roslyn compiler SDK in a CLI tool that stripped a very repetitive pattern from a code base. Some OCD person had repeated the same nonsense many thousands of times. The tool cleaned up the mess in seconds.
For me it's an additional tool, not the only tool. LSPs are still better for half of the things I do daily (renaming things, extracting things, finding symbols, etc.). I can't imagine using AI for everything and even meeting my current velocity.
I swear the people posting this stuff are just building todo list tier apps from scratch. On even the most moderately large codebase it completely breaks down.
I find that context window (token) limits are the main barrier here.
Once the LLM starts forgetting other parts of the application, all bets are off and it will hallucinate the dumbest shit, or even just remove features wholesale.
For my workflow, a lot of the benefit is in the smaller tasks. For example, get a medium-sized diff between two change-sets, paste it in, and ask it to summarize the “what” and “why”. You have to learn how to give the AI the right amount of context to get the right result back.
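The prompt-building side of that is nothing fancy. A rough sketch of what I mean (a hypothetical helper; you paste the output into whatever chat UI you use):

    # Build a "summarize this diff" prompt from `git diff` output.
    import subprocess

    def diff_summary_prompt(rev_a="HEAD~1", rev_b="HEAD"):
        diff = subprocess.run(
            ["git", "diff", f"{rev_a}..{rev_b}"],
            capture_output=True, text=True, check=True,
        ).stdout
        return (
            "Summarize the following diff. Explain what changed and why it was "
            "likely changed, as a short bullet list.\n\n" + diff
        )

    if __name__ == "__main__":
        print(diff_summary_prompt())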
For code changes I prefer to paste a single function in, or a small file, or error output from a compile failure. It’s pretty good at helping you narrow things down.
So, for me, it’s a pile of small gains where the value is—because ultimately I know what I generally want to get done and it helps me get there.
Just yesterday I wanted to adjust the Renovate setup in a repo.
I could spend 5-10 minutes digging through the docs for the correct config option, or I can just tap a hotkey, open up GitHub Copilot in Rider, and tell it what I want to achieve.
And within seconds it had a correct-looking setting ready to insert into my renovate.json file. I added it, tested it, and it works.
I kinda think people who diss AIs are prompting something like "build me Facebook" and then being disappointed when it doesn't :D
I wish more had been written about the first assertion that using an LLM to code is like gambling and you're always hoping that just one more prompt will get you what you want.
It really captures how little control one has over the process, while simultaneously having the illusion of control.
I don't really believe that code is being made verbose to make more profits. There's probably some element of model providers not prioritizing concise code, but if conciseness while maintaining "quality" were possible, it would give one model enough of an edge over the others that I suspect providers would do it.
Agreed, I've been thinking about the first assertion a lot recently as I've been using Cursor to create a react app. I think it's more prevalent in frontend development because it tightens the feedback loop considerably, and the more positive feedback you get, the more conditioned you get to reach for it anytime you need to do anything in code.
I think there's another perverse incentive here - organisations want to produce features/products fast, which LLMs help with, but it comes at the cost of reduced cognitive capabilities/skills in the developers over the longer term as they've given that up through lack of use/practice.
I don't believe there are perverse incentives yet; right now it's the arms-race, burn-money, operate-at-a-loss days. There is no moat, only quality and price per token, and the leader moves around too quickly. Also, the author should really look into Cursor at $20 with unlimited slow requests; I imagine paying per token hurts when it spits out garbage even when you thought you provided enough context but it wasn't enough.
Someone needs to make a plugin to count lines of discarded code and prompts.
This has certainly been my own experience in life. My step-father was a very studious and responsible person. He worked for the state as an HVAC service tech for 30 years, from the age of 19 until he retired at 49 with a full state pension, and then went to work for a private company. His plan was to earn as much as he could until he turned 55, and then retire to live/work on the small farm he and my mother had just purchased. Everything was coming together: his new job placed him in a senior project management position and gave him a considerable salary compared with the state.
Shortly after he turned 50, he was diagnosed with pancreatic cancer, and he died several months later, following a very painful and difficult attempt to treat it.
In my mind, this kind of thing is the height of tragedy—he did everything right. He exhibited an incredible amount of self-control and deferred his happiness, ensuring that his family and finances were well-cared for and secured, and then having fulfilled his obligations, he was almost immediately robbed of a life that he’d worked so hard to earn.
I experienced a few more object lessons in the same vein myself, namely having been diagnosed with multiple sclerosis at the age of 18, and readjusting my life’s goals to accommodate the prospect of disability. I’m thankfully still churning along under my own capacities, now at 41yo, but MS can be unpredictable, and I find it is necessary to remind myself of this from time to time. I am grateful for every day that I have, and to the extent it’s possible, I try to find nearer-term sources for happiness and fulfillment.
Don’t waste any time planning for more than the next five years (with the obvious exceptions for things like financial planning), as you can’t possibly know what’s coming. Even if the unexpected event is a happy one, like an unexpected child or sudden financial windfall, your perspective will almost certainly be dramatically altered 1-2x each decade.
But just like gambling, there are ways to do it correctly.
Yes, there are the grandmas in a trance vibe-gambling by shoving a bucket of quarters in a slot machine.
But you also have people playing Blackjack and beating the averages by knowing how it's played, maybe having a "feel" for the deck (or counting cards...), and most importantly knowing when to fold and walk away.
Same with LLMs, you need to understand context sizes and prompts and you need to have a feel for when the model is just chasing its own tail or trying to force a "solution" just to please the user.
While I get your point, this also kinda sounds like a gambling addict trying to explain how they're not an addict and how they're losing money the correct way, heh.
These perverse incentives run at the heart of almost all Developer Software as a Service tooling. Using someone else's hosted model incentivizes increasing token usage, but it's nothing special about AI.
Consider Database-as-a-service companies: they're not incentivized to optimize CPU usage; they charge per CPU. They're not incentivized to improve disk compression; they charge for disk usage. There are several DB vendors who explicitly disable disk compression and happily charge for storage capacity.
When you run the software yourself, or the model yourself, the incentives are aligned: use less power, use less memory, use less disk, etc.
> When you run the software yourself, or the model yourself, the incentives are aligned: use less power, use less memory, use less disk, etc.
But my team's time is soooo valuable. It's sooo sooo sooo valuable. Oh and we can't afford to hire anyone else either. But our time is sooo valuable. We need these tools!
My favourite example of this is the recent trend towards “wide events” replacing logs and metrics… spearheaded and popularised by companies that charge by the gigabytes ingested.
1. Yes. I've spent several late nights nudging Cline and Claude (and other systems) to the right answers. And being able to use AWS Bedrock to do this has been great (note: I work at Amazon).
2. I've had good fortunes keeping the agents to constrained areas, working on functions, or objects, with clearly defined (by me) boundaries. If the measure of a junior engineer is that you correct them once a day, an engineer once a week, a senior once a month, a principal once a quarter... Treat these agents like hyper-energetic interns. Nudge frequently.
3. Standard org management coding practices apply. Force the agents to show work, plan, unit test, investigate.
And, basically, what I've described is that we're becoming Software Development Managers with teams of on-demand, low-quality interns. That's an incredibly powerful tool, but don't expect hyper-elegant and compact code from them. Keep that for the senior engineering staff (humans) for now.
(Note: The AlphaEvolve announcement makes me wonder if I'm going to have hyper-energetic applied science interns next...)
I feel like "vibe coding" as a "no look" sort of way to produce anything is bad and will probably remain bad for some time.
However... "vibe architecting" is likely going to be the way forward. I have had success with generating/tuning an architecture plan with AI, having it create stub files/functions then filling them out individually. I can get pretty much the whole way without typing code, but it does require a fair bit more architectural thinking than usual and a good bit of reading code (then telling the AI to "do better").
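For what it's worth, the "stub first, fill in later" step usually looks something like this. A made-up example of the kind of skeleton I'd have it generate (here for an imaginary little link shortener) before working through the bodies one at a time:

    # Hypothetical stubs an agent might emit during the architecture pass.
    # Each body then gets filled out (by me or by the AI) as its own small task.

    def normalize_url(raw: str) -> str:
        """Lower-case the scheme/host and strip tracking params."""
        raise NotImplementedError

    def shorten(url: str, store: dict) -> str:
        """Return an existing short code for url, or mint and store a new one."""
        raise NotImplementedError

    def resolve(code: str, store: dict):
        """Return the original URL for a short code, or None if unknown."""
        raise NotImplementedError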
I think of it like the analogy of blind men describing an elephant when they can only feel a single part. AI is decent at high level architecture and decent at low level production but you need a human to understand the big picture and how the pieces fit (and which ones are missing).
What you are talking about is the "proper" way of vibe coding. Most of the issues with vibe coding stem from users misunderstanding the capabilities of the technology they are using. They are overestimating the capabilities of current systems and are essentially asking for magic to happen. They don't give proper guidance, context, or anything of value for the coding IDE to work with. They are applying a mindset from the 2030s to systems from 2025. We ain't there yet, folks; give as much guidance and context as you can and you will have a better time.
Amusingly, about 90% of my rat's-nest problems with Sonnet 3.7 are solved by simply appending a few words to the end of the prompt:
"write minimum code required"
It's not even that sensitive to the wording - "be terse" or "make minimal changes" amount to the same thing - but the resulting code will often be at least 50% shorter than the unguided version.
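To make that concrete, the difference is roughly this (an invented example, but representative of the pattern):

    # Unguided, the model tends toward something like this...
    def read_port_verbose(cfg):
        if cfg is None:
            raise ValueError("config must not be None")
        if "port" not in cfg:
            raise KeyError("missing 'port'")
        port = cfg["port"]
        if not isinstance(port, int):
            raise TypeError("'port' must be an int")
        if port < 0 or port > 65535:
            raise ValueError("port out of range")
        return port

    # ...while "write minimum code required" gets you the version you wanted:
    def read_port(cfg):
        return int(cfg["port"])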
The study the article cited is specifically about asking LLMs about misinformation. I think on coding tasks and the like, shorter answers are usually more accurate.
There are two sets of perverse incentives at play. The main one the author focuses on is that LLM companies are incentivized to produce verbose answers, so that when you task an LLM with extending an already verbose project, the tokens used, and therefore the cost, increase.
The second one is more intra/interpersonal: under pressure to produce, it's very easy to rely on LLMs to get one 80% of the way there and polish the remaining 20%. I'm in a new domain that requires learning a new language. So something I've started doing is asking ChatGPT to come up with exercises / coding etudes / homework for me based on past interactions.
> While sitting in a Teams meeting I got it to use the Roslyn compiler SDK in a CLI tool that stripped a very repetitive pattern from a code base. Some OCD person had repeated the same nonsense many thousands of times.

What were they doing?
> I kinda think people who diss AIs are prompting something like "build me Facebook" and then being disappointed when it doesn't :D

Also, you have to learn to talk to it and how to ask it things.
"The programming language of the future will be English"
---
"Well are you using it right? You have to know how to use it"
> It really captures how little control one has over the process, while simultaneously having the illusion of control.

This is actually a big insight about life, one that in some eastern philosophies you are supposed to arrive at: we love the illusion of control, even though we don’t really have it. Life mostly just unfolds as we experience it.
> But my team's time is soooo valuable. It's sooo sooo sooo valuable. Oh and we can't afford to hire anyone else either.

- Premature optimization is the root of all evil, can't waste expensive dev hours on that...
"write minimum code required"
It's not even that sensitive to the wording - "be terse" or "make minimal changes" amount to the same thing - but the resulting code will often be at least 50% shorter than the un-guided version.
It'll check _EVERY_ edge case separately, even in situations where it will never ever happen and if it does, it's a NOP anyway.