Every time I say I don't see the productivity boost from AI, people always say I'm using the wrong tool, or the wrong model. I use Claude with Sonnet, Zed with either Claude Sonnet 4 or Opus 4.6, Gemini, and ChatGPT 5.2. I use these tools daily and I just don't see it.
The vampire in the room, for me, is feeling like I'm the only person in the room who doesn't believe the hype. Or should I say, being in rooms where nobody seems to care about quality over quantity anymore. Articles like this are part of the problem, not the solution.
Sure, they are great for generating some level of code, but the deeper it goes the more it hallucinates. My first or second git commit from these tools is usually closer to a full working solution than the fifth one. The time spent refining prompts, testing the code, repeating instructions, reworking naive architectural decisions, and double-checking hallucinations in research adds up to more than the time AI saves me. This isn't free.
A CTO this week told me he can't code or brainstorm anymore without AI. We've had these tools for 4 years, and like this guy says, either you adopt AI or the competition eats you. So, where is the output? Aside from more AI tools, what has been released in the past 4 years that makes it obvious, looking back, that this is when AI became available?
I am with you on this, and you can't win, because as soon as you voice this opinion you get overwhelmed with "you don't have the sauce/prompt" responses, which rest on an inherent fallacy: they assume you are solving the same problems they are.
I work in GPU programming, so there is no way in hell that JavaScript tooling and database-wrapper tasks can be put on equal terms with generating, for example, Blackwell tcgen05 warp-scheduled kernels.
The current leader is Opus 4.5: https://github.com/anthropics/original_performance_takehome
There's going to be a long tail of domain-specific tasks that aren't well served by current models for the foreseeable future, but there's also no question that the complexity horizon of the SotA models is increasing over time. I've had decent results recently with non-trivial CUDA/MPS code. Is it great, finely tuned code? Probably not, but it delivered on the spec and runs fast enough.
Many engineers get paid a lot of money to write low-complexity code gluing things together and tweaking features according to customer requirements.
When the difficulty of a task is neatly encompassed in a 200-word ticket and the implementation poses little engineering challenge, AI can pretty reliably write the code: mediocre code for mediocre challenges.
A huge fraction of the software economy runs on CRUD and some business logic. There just isn't much complexity inherent in any of the feature sets.
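To make the kind of work being described concrete, here is a minimal sketch of a CRUD layer with one piece of business logic, in plain Python/sqlite3 with made-up table names and a made-up review rule (nothing from the thread):

    import sqlite3

    # Hypothetical schema for the kind of 200-word-ticket feature under discussion.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL, status TEXT)"
    )

    def create_order(customer: str, total: float) -> int:
        # The "business logic": orders over 1000 need manual review.
        status = "needs_review" if total > 1000 else "open"
        cur = conn.execute(
            "INSERT INTO orders (customer, total, status) VALUES (?, ?, ?)",
            (customer, total, status),
        )
        conn.commit()
        return cur.lastrowid

    def get_order(order_id: int):
        return conn.execute("SELECT * FROM orders WHERE id = ?", (order_id,)).fetchone()

    def update_status(order_id: int, status: str) -> None:
        conn.execute("UPDATE orders SET status = ? WHERE id = ?", (status, order_id))
        conn.commit()

    def delete_order(order_id: int) -> None:
        conn.execute("DELETE FROM orders WHERE id = ?", (order_id,))
        conn.commit()

    print(get_order(create_order("acme", 1200.0)))  # (1, 'acme', 1200.0, 'needs_review')

The point isn't that this is good design; it's that a 200-word ticket against code of this shape leaves very little for a model to get wrong.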
Complexity is not where the value to the business comes from. In fact, it's usually the opposite. Nobody wants to maintain slop, and whenever you dismiss simplicity you ignore all the heroic hard work done by those at the lower level of indirection. This is what politics looks like when it finally places its dirty hands on the tech industry, and it's probably been a long time coming.
As annoying as that is, we should celebrate a little that the people who understand all this most deeply are gaining real power now.
Yes, AI can write code (poorly), but the AI hype is now becoming pure hate against the people who sit in meetings quietly gathering their thoughts and distilling them down to the simple, almost poetic solutions that nobody but those who do the heads-down work actually cares about.
> A huge fraction of the software economy runs on CRUD and some business logic.
You vastly underestimate the meaning of CRUD when it's applied in such a direct manner. You're right in some sense that "we have the technology", but we've had this technology for a very long time now. The business logic is pure gold. You dismiss this without realizing how many other thriving and well-established industries operate by doing simple things applied precisely.
Some Ella Fitzgerald for you: https://youtube.com/watch?v=tq572nNpZcw
I don't understand what including the "4 years" timeframe does for your argument here. I don't think anyone is arguing that the usefulness of these AIs for real projects started at GPT 3.5/4. Do you think the capabilities of current AIs are approximately the same as GPT 3.5/4 was 4 years ago (actually, I think the SOTA 4 years ago today might have been LaMDA... GPT 3.5 wasn't out yet)?
> I don't think anyone is arguing that the usefulness of these AIs for real projects started at GPT 3.5/4
Only not in retrospect. But the arguments about "if you're not using AI you're being left behind" did not depend on how people in 2026 felt about those tools retrospectively. Cursor is 3 years old, and OK, 4 years might be an exaggeration, but I've definitely been seeing these arguments for 2-3 years.
Yeah. I started integrating AI into my daily workflows December 2024. I would say AI didn't become genuinely useful until around September 2025, when Sonnet 4.5 came out. The Opus 4.5 release in November was the real event horizon.
> I use these tools daily and I just don't see it.
So why use them if you see no benefit?
You can refuse to use it, it's fine. You can also write your code in notepad.exe, without a linter, and without an Internet connection if you want. Your rodeo
I don't understand the defensiveness.
I didn't say I see no benefit, I said I don't see the productivity boost people talk about. I conceded they are good for some things, and presumably that's what I use them for.
> You can refuse to use it, it's fine
Where do you work? Because increasingly, this isn't true. A lot of places are judging engineers by LoC output like it's 2000, except this time the LoC has to come from AI
I have the same experience and still use it. It's just that I learned to use it for simplistic work. I sometimes try to give it more complex tasks but it keeps failing. I don't think it's bad to keep trying, especially as people are reporting insane productivity gains.
After all, it's through failure that we learn the limitations of a technology. Apparently some people encounter that limit more often than others.
What things (languages etc.) do you work with/on primarily?
I don't know what to say, except that I see a substantial boost. I generally code slowly, but since GPT-5.1 was released, what would've taken me months to do now takes me days.
Admittedly, I work in research, so I'm primarily building prototypes, not products.
Here's what I find Claude Code (Opus) useful for:
1. Copy-pasting existing working code with small variations. If the intended variation is bigger, it fails to bring productivity gains, because the result is almost universally wrong.
2. Exploring unknown code bases. Previously I had to curse my way through code reading sessions, now I can find information easily.
3. Google Search++, e.g. for deciding on tech choices. Needs a lot of hand holding though.
... that's it? Any time I tried doing anything more complex I ended up scrapping the "code" it wrote. It always looked nice though.
> 1. Copy-pasting existing working code with small variations. If the intended variation is bigger, it fails to bring productivity gains, because the result is almost universally wrong.
This does not match my experience. At all. I can throw extremely large and complex things at it and it nails them with very high accuracy and precision in most cases.
Here's an example: when Opus 4.5 came out I used it extensively to migrate our database and codebase from a one-Postgres-schema-per-tenant architecture to a single schema architecture. We are talking about eight years worth of database operations over about two dozen interconnected and complex domains. The task spanned migrating data out of 150 database tables for each tenant schema, then validating the integrity at the destination tables, plus refactoring the entire backend codebase (about 250k lines of code), plus all of the test suite. On top of that, there were also API changes that necessitated lots of tweaks to the frontend.
This is a project that would have taken me 4-6 months easily and the extreme tediousness of it would probably have burned me out. With Opus 4.5 I got it done in a couple of weeks, mostly nights and weekends. Over many phases and iterations, it caught, debugged and fixed its own bugs related to the migration and data validation logic that it wrote, all of which I reviewed carefully. We did extensive user testing afterwards and found only one issue, and that was actually a typo that I had made while tweaking something in the API client after Opus was done. No bugs after go-live.
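For a sense of the shape of that work (a minimal sketch only, with hypothetical schema and table names and psycopg2 assumed; the real migration obviously needed far more validation than row counts), each per-tenant copy plus integrity check looks roughly like:

    import psycopg2

    # Hypothetical names; the real migration covered ~150 tables per tenant schema.
    TENANT_SCHEMAS = ["tenant_a", "tenant_b"]
    TABLES = ["invoices", "invoice_lines", "customers"]

    def columns_of(cur, schema, table):
        # Column list pulled from the catalog so INSERT and SELECT stay in sync.
        cur.execute(
            """SELECT column_name FROM information_schema.columns
               WHERE table_schema = %s AND table_name = %s
               ORDER BY ordinal_position""",
            (schema, table),
        )
        return [row[0] for row in cur.fetchall()]

    def migrate_table(cur, schema, table):
        # Destination lives in the shared schema and carries an extra tenant_id column.
        cols = ", ".join(columns_of(cur, schema, table))
        cur.execute(
            f"INSERT INTO shared.{table} (tenant_id, {cols}) "
            f"SELECT %s, {cols} FROM {schema}.{table}",
            (schema,),
        )

    def validate_table(cur, schema, table):
        # Cheapest integrity check: per-tenant row counts must match at the destination.
        cur.execute(f"SELECT count(*) FROM {schema}.{table}")
        source = cur.fetchone()[0]
        cur.execute(f"SELECT count(*) FROM shared.{table} WHERE tenant_id = %s", (schema,))
        dest = cur.fetchone()[0]
        assert source == dest, f"{schema}.{table}: {source} source rows, {dest} migrated"

    with psycopg2.connect("dbname=app") as conn:
        with conn.cursor() as cur:
            for schema in TENANT_SCHEMAS:
                for table in TABLES:
                    migrate_table(cur, schema, table)
                    validate_table(cur, schema, table)

The backend refactor and test-suite changes are a separate beast; this only gestures at the data side.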
So yeah, when I hear people say things like "it can only handle copy paste with small variations, otherwise it's universally wrong" I'm always flabbergasted.
I'm an AI hipster, because I was mistaking engagement for productivity before it was cool. :P
TFA mentions the slot machine aspect, but I think there are additional facets: The AI Junior Dev creates a kind of parasocial relationship and a sense of punctuated progress. I may still not have finished with X, but I can remember more "stuff" happening in the day, so it must've been more productive, right?
Contrast this to the archetypal "an idea for fixing the algorithm came to me in the shower."
I also don’t believe the hype. The boosters always say I would believe if I were to just experience it. But that’s like saying all I have to do is eat a hamburger to experience how nutritious it is for me.
I love hamburgers, and nothing in my experience tells me I shouldn’t eat them every day. But people have studied them over time and I trust that mere personal satisfaction is insufficient basis for calling hamburgers healthy eating.
Applied to AI: How do you know you have “10x’d?” What is your test process? Just reviewing the test process will reverse your productivity! Therefore, to make this claim you probably are going on trust.
If you have 10x the trust, you will believe anything.
There is a well-studied cognitive bias: https://en.wikipedia.org/wiki/Illusory_superiority. People tend to think they're special.
> The vampire in the room, for me, is feeling like I'm the only person in the room who doesn't believe the hype. Or should I say, being in rooms where nobody seems to care about quality over quantity anymore.
If in real life you are noticing the majority of peers that you have rapport with tending towards something that you don't understand, it usually isn't a "them" problem.
It's something for you to decide. Are you special? Or are you fundamentally missing something?
To say I'm the only one is an exaggeration; it's probably more around 50/50, with the 50% in the pro camp being very vocal, to the point where it's almost insulting. Being given basic tasks (like, find the last modified file) with "use Claude to get the command" said straight after.
I fully accept that it might be a me problem, and this is why I keep exposing myself to these tools. I try to find how they can help me, and I do see it; I just feel like a lot of people ignore the ways these tools harm productivity (and here I mean directly, not some vague "you'll get worse at learning").
I accept your point, I do take it to heart, and I do keep wondering if I'm missing something.
I think Yegge hit the nail on the head: he has an addiction. Opus 4.5 is awesome but the type of stuff Yegge has been saying lately has been... questionable, to say the least. The kids call it getting "one-shotted by AI". Using an AI coding assistant should not be causing a person this much distress.
A lot of smart people think they're "too smart" to get addicted. Plenty of tales of booksmart people who tried heroin and ended up stealing their mother's jewelry for a fix a few months later.
I'm a recovering alcoholic. One thing I learned from therapists etc. along the way is that there are certain personality types with high intelligence, and also higher sensitivity to other things, like noise, emotional challenges, and addictive/compulsive behaviour.
It does not surprise me at all that software engineers are falling into an addiction trap with AI.
All this praise for AI... I honestly don't get it. I have used Opus 4.5 for work and private projects. My experience is that all of the AIs struggle when the project grows. They always find some kind of local minimum they cannot get out of, but tell you that this time their solution will work... but it doesn't. They waste my time enormously with this behaviour. In the end I always have to do it myself.
Maybe when AIs are able to say: "I don't know how this works" or "This doesn't work like that at all." they will be more helpful.
What I use AIs for is searching for stuff in large codebases. Sometimes I don't know the name or the file name and describe to them what I am looking for. Or I let them generate a Python/bash script for some random task. Or use them to find specific things in a file that a regex cannot find. Simple small tasks.
It might well be that I am doing it totally wrong... but I have yet to see a medium-to-large-sized project with maintainable code that was generated by AI.
At what point does the project outgrow the AI in your experience? I have a 70k LOC backend/frontend/database/docker app that Claude still mostly one-shots most features/tasks I throw at it. Perhaps it's not as good at remembering all the intertwined side effects between functionalities/UIs and I have to let it know "in the calendar view, we must hide it as well", but that takes little time/effort.
Does it break down at some point to the extent that it simply does not finish tasks? Honest question, as I've seen this sentiment stated previously and assumed that sooner or later I'd face it myself, but so far I haven't.
I find that with more complex projects (full-stack application with some 50 controllers, services, and about 90 distinct full-feature pages) it often starts writing code that simply breaks functionality.
For example, we had to update some more complex code to correctly calculate a financial penalty amount. The amount is defined by law and recently received an overhaul, so we had to change our implementation.
Every model we tried (and we have corporate access and legal allowance to use pretty much all of them) failed to update it correctly. Models would start changing parts of the calculation that didn't need to be updated. After being told that those specific parts shouldn't be touched and to retry, most of them would go right back to changing them again. The legal definition of the calculation logic is, surprisingly, pretty clear, and we do have rigorous tests in place to ensure the calculations are correct.
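The "rigorous tests" bit is what makes these failures cheap to catch. A minimal sketch of the idea, with a hypothetical calculate_penalty and made-up rates that are not the actual legal rules:

    import unittest
    from decimal import Decimal

    def calculate_penalty(principal: Decimal, days_late: int) -> Decimal:
        # Hypothetical stand-in for the legally defined calculation:
        # 0.5% of the principal per late day, capped at 20% of the principal.
        rate = Decimal("0.005") * days_late
        return (principal * min(rate, Decimal("0.20"))).quantize(Decimal("0.01"))

    class PenaltyRegressionTest(unittest.TestCase):
        # Expected values are worked out by hand from the written rule, so any
        # "helpful" edit to parts of the formula that shouldn't change trips the suite.
        def test_typical_case(self):
            self.assertEqual(calculate_penalty(Decimal("1000"), 10), Decimal("50.00"))

        def test_cap_applies(self):
            self.assertEqual(calculate_penalty(Decimal("1000"), 90), Decimal("200.00"))

        def test_zero_days_means_zero_penalty(self):
            self.assertEqual(calculate_penalty(Decimal("1000"), 0), Decimal("0.00"))

    if __name__ == "__main__":
        unittest.main()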
Beyond that, it was frustrating trying to get the models to stick to our coding standards. Our application has developers from other teams doing work as well. We enforce a minimum standard to ensure code quality doesn't suffer and other people can take over without much issue. This standard is documented in the code itself but also explicitly written out in the repository in simple language. Even when explicitly prompting the models to stick to the standard and copy-pasting it into the actual chat, they would ignore 50% of it.
The most apt comparison I can make is to a consultant who always agrees with you to your face but, when doing the actual work, ignores half of your instructions, so you end up running after them to try to minimize the mess and cleanup you have to do. It outputs more code, but it doesn't meet the standards we have. I'd genuinely be happy to offload tasks to AI so I can focus on the more interesting parts of my work, but from my experience and that of my colleagues, it's just not working out for us (yet).
> At what point does the project outgrow the AI in your experience? I have a 70k LOC backend/frontend/database/docker app that Claude still mostly one shots most features/tasks I throw at it.
How do you do this?
Admittedly, I'm using Copilot, not CC.
I can't get Copilot to finish a refactor properly, let alone a feature. It'll miss an import rename, leave in duplicated code, update half the use cases but not all, etc. And that's with all the relevant files in context, and letting it search the codebase so it can get more context.
It can talk about DRY, or good factoring, or SOLID, but it only applies them when it feels like it, despite what's in AGENTS.md. I have much better results when I break the task down into small chunks myself and NOT tell it the whole story.
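One cheap guard for the missed-rename failure mode (my own habit, not anything Copilot provides): scan the tree for the old identifier before trusting the diff. A minimal sketch, assuming a hypothetical OldWidget-to-NewWidget rename in a Python repo:

    import pathlib
    import re
    import sys

    OLD_NAME = "OldWidget"  # hypothetical identifier that was supposed to be renamed away

    def leftover_references(root: str = "src"):
        pattern = re.compile(rf"\b{re.escape(OLD_NAME)}\b")
        for path in pathlib.Path(root).rglob("*.py"):
            for lineno, line in enumerate(path.read_text(encoding="utf-8").splitlines(), 1):
                if pattern.search(line):
                    yield f"{path}:{lineno}: {line.strip()}"

    if __name__ == "__main__":
        hits = list(leftover_references())
        print("\n".join(hits) or f"no references to {OLD_NAME} left behind")
        sys.exit(1 if hits else 0)  # non-zero exit makes it usable as a CI gate

It won't catch duplicated logic, but it turns "did it actually finish the rename?" into a ten-second check.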
I'm having trouble at 150k, but I'm not sure the issue is size per se, as opposed to whether the set of relevant context is easy to find. The "relevant" part threatens to bring in disparate parts of the codebase; the "easy to find" part determines whether a human has to manually curate the context.
I think most of us - if not _all_ of us - don't know how to use these things well yet. And that's OK. It's an entirely new paradigm. We've honed our skills and intuition based on humans building software. Humans make mistakes, sure, but humans have learning styles and failure patterns we are very familiar with. Humans understand the systems they build to a high degree; this knowledge helps them predict outcomes, and even helps them achieve the goals of their organisation _outside_ writing software.
I kinda keep saying this, but in my experience:
1. You trade the time you'd take to understand the system for time spent testing it.
2. You trade the time you'd take to think about simplifying the system (so you have less code to type) for execution (so you build more in less time).
I really don't know if these are _good_ tradeoffs yet, but it's what I observe. I think it'll take a few years until we truly understand the net effects. The feedback cycles for decisions in software development and business can be really long, several years.
I think the net effects will be positive, not negative. I also think they won't be 10x. But that's just me believing stuff, and it is relatively pointless to argue about beliefs.
> Maybe when AIs are able to say: "I don't know how this works" or "This doesn't work like that at all." they will be more helpful.
Funny you say that: I encountered this in a seemingly simple task. Opus inserted something along the lines of "// TODO: someone with flatbuffers reflection expertise should write this". This was actually a better outcome than I anticipated, even though the task was specifically related to fbs reflection, because I didn't waste more time and could immediately start rewriting it from scratch.
We're certainly in the middle of a whirlwind of progress. Unfortunately, as AI capabilities increase, so do our expectations.
Suddenly, it's no longer enough to slap something together and call it a project. The better version with more features is just one prompt away. And if you're just a relay for prompts, why not add an agent or two?
I think there won't be a future where the world adapts to a 4-hour day. If your boss or customer also sees you as a relay for prompts, they'll slowly cut you out of the loop, or reduce the amount they pay you. If you instead want to maintain some moat, or build your own money-maker, your working hours will creep up again.
In this environment, I don't see this working out financially for most people. We need to decide which future we want:
1. the one where people can survive (and thrive) without stable employment;
2. the one where we stop automating in favor of stable employment; or
3. the one where only those who keep up stay afloat.
Should a strike happen if devs are told to use Claude, or should a strike happen if devs aren't given access to Claude?
- End H1B abuse and demand limits on offshoring
- Compensation transparency
- “Value capture,” as called out in the article. If new tools make engineers 10x more productive, that should be reflected in compensation
- End employment law workarounds like “unlimited PTO,” where your PTO is still limited in practice, but it’s not a defined or accruing benefit
- Protection against dilution of equity for employees
- A seat at the table for workers, not just managers, in the event of layoffs
- Professional ethics and whistleblower protections. Legally-protected strikes if workers decide to refuse to pursue an ethically or legally dubious product or feature.
I could go on. There are a lot of abuses we put up with because of relatively high salaries, and it is now abundantly clear that the billionaire capital-owning class is dead set on devaluing the work we do to “reduce labor costs.” We can decide not to go along with that.
So yes, please adopt our work ethic and legal framework. It's going to help us tremendously.
Some interesting parts in the text. Some not so interesting ones. The author seems to think he's a big deal, though - a month ago, I did not know who he was. My work environment has never heard of him (SDE at FAANG). Maybe I'm an outlier and he really does influence expectation management at companies with his writing, or maybe the success (?) of gastown got to him and he thinks he's bigger than he actually is. Time will tell. In any case, the glorification of oneself in an article like that throws me off for some reason.
He's early Amazon, early Google, so he's seen two companies super-scale. Few people last two paradigm shifts, so that's no guarantee of credentials. But at the time he was famous for a specific accidentally-public post that exposed people to how far Bezos's influence ramified through Amazon and how his choices contrasted with Google's approach to platforms.
https://news.ycombinator.com/item?id=3101876
Popular blogger from roughly a decade ago. His rants were frequently cited early in my career. I think he’s fallen off in popularity substantially since.
Am I getting Steve's point? It's a bit like what happened with the agricultural revolution.
A long time ago, food took effort to find, and calories were expensive.
Then we had a breakthrough in cost/per/calories.
We got fat, because we can not moderate our food intake. It is killing us.
A long time ago, coding took effort, and programmer productivity was expensive.
Then we had a breakthrough in cost/per/feature.
Now we are exhausted, because we can not moderate our energy and attention expenditure. It is killing us.
He talks about this new tech for extracting more value from engineers as if it were fracking. When they become impermeable you can now inject a mixed high pressure cocktail of AI to get their internal hydrocarbons flowing. It works but now he feels all pumped out. But the vampire metaphor is hopefully better in that blood replenishes if you don't take too much. A succubus may be an improved comparison, in that a creative seed is extracted and depleted, then refills over a refractory period.