Labor market impacts of AI: A new measure and early evidence

People who are saying they're not seeing productivity boost, can you please share where is it failing?

Because, I am terrified by the output I am getting while working on huge legacy codebases, it works. I described one of my workflow changes here: https://news.ycombinator.com/item?id=47271168 but in general compared to old way of working I am saving half of the steps consistently, whether its researching the codebase, or integrating new things, or even making fixes. I have stopped writing code, occasionally I jump into the changes proposed by LLM and make manual edits if it is feasible, otherwise I revert changes and ask it to generate again but based on my learnings from the past rejected output

I am terrified about what's coming

yoyohello13 · 8 days ago

The companies laying off people have no vision. My company is a successful not for profit and we are hiring like crazy. It’s not a software company, but we have always effectively unlimited work. Why would anyone downsize because work is getting done faster? Just do more work, get more done, get better than the competition, get better at delivering your vision. We put profits back in the community and actually make life better for people. What a crazy fucking concept right?

tkgally · 8 days ago

I suspect it depends partly on how locked each individual is into a particular type of work, both skill-wise and temperamentally.

To give an example from a field where LLMs started causing employment worries earlier than software development: translation. Some translators made their living doing the equivalent of routine, repetitive coding tasks: translating patents, manuals, text strings for localized software, etc. Some of that work was already threatened by pre-LLM machine translation, despite its poor quality; context-aware LLMs have pretty much taken over the rest. Translators who were specialized in that type of work and too old or inflexible to move into other areas were hurt badly.

The potential demand for translation between languages has always been immense, and until the past few years only a tiny portion of that demand was being met. Now that translation is practically free, much more of that demand is being met, though not always well. Few people using an app or browser extension to translate between languages have much sense of what makes a good translation or of how translation can go bad. Professional translators who are able to apply their higher-level knowledge and language skills to facilitate intercultural communication in various ways can still make good money. But it requires a mindset change that can be difficult.

afro88 · 8 days ago

This is exactly right IMO. I have never worked for a company where the bottleneck was "we've run out of things to do". That said, plenty of companies run out of actual software engineering work when their product isn't competitive. But it usually isn't competitive because they haven't been able to move fast enough

ehnto · 8 days ago

That was my insight also. As a manager, you already have the headcount approved, and your people just allegedly got some significant percentage more productive. The first thought shouldn't be, great let's cut costs, it should be great now we finally have the bandwidth to deliver faster.

On a macro level, if you were in a rising economic tide, you would still be hiring, and turning those productivity gains into more business.

I wonder what the parallels are to past automations. When part producing companies moved from manual mills to CNC mills, did they fire a bunch of people or did they make more parts?

throw3847r7 · 8 days ago

You need certain company culture, to be able to scale up, and to capture this value. Most companies can not just add new developers.

AI needs documentation, automation, integration tests... It works very well for remote first company, but not for in-face informal grinding approach.

Just year ago, client told me to delete integration tests, because "they ran too long"!

anthonypasq · 7 days ago

most businesses dont actually have an infinite amount of work that has extremely high ROI. every new project at google for example has to justify the engineering spend of developing a product that has comparable margin to the ad business. Why spend 10 million a year of engineering resources on a new product that might 1. completely fail or 2. be a decent product with 20% margins when they could do nothing and keep raking in 90% margins from the ads business.

pllbnk · 5 days ago

I have a hope that many of today's engineers working at various companies will start realizing that instead of employers receiving tools to get them laid off, it is they, the engineers, who received the tools to compete with said employers and outcompete them.

If, and it's a big if, AI models really boost productivity by an order of magnitude (I personally, while being skeptical a year or two ago, am leaning towards this idea) then engineers have a chance to realize their ideas, improve current system design patterns and build successful companies, which will inevitably (hopefully) require hiring personnel to keep competing, bringing entire software engineering market to a newly balanced state.

RA_Fisher · 7 days ago

Does that extra work bring in more revenue? I think that’s the key question.

crocowhile · 8 days ago

Because hiring less while getting more done increases margins. Your company is not for profit so doesnt care about margins. Others do.

apercu · 7 days ago

I think a lot of companies have ineffective ways to measure productivity, poor management (e.g., people who were IC's then promoted to management but have no management training or experience), incentives aren't necessarily aligned between orgs and staff, so people end up with a perverse "more headcount" means I'm better than Sandy over there. Leadership and vision have been rare in my professional life (though the corporate-owned media celebrates mediocrity in leadership all the time with puff pieces).

Once you get to a certain size company, this means a lot of bloat. Heck, I've seen small(ish) companies that had as many managers and administrators as ICs.

But You're not wrong, I'm just pointing out how an org that has 4k people can lay off a few hundred with modest impact of the financials (though extensive impact on morale).

MattGaiser · 8 days ago

You would need to expand your capacity to find and define the work. I imagine that would be a major challenge.

threatofrain · 8 days ago

These are words without weights. At some point the put money into software option will max out. Perhaps what we should all be doing is hiring more lawyers, there's always more legal work to be done. When you don't have weights then you can reason like this.

arwhatever · 8 days ago

I’ve been screaming this too https://news.ycombinator.com/item?id=47212237

It’s refreshing to see the same sentiment from so many other people independently here.

zipy124 · 7 days ago

The problem becomes if you are a service like Youtube, where you already have capture almost the entire customer base.

svara · 7 days ago

Yes, it's the lump of labor fallacy.

Doesn't exclude the possibility of short term distribution, though.

throwaw12 · 8 days ago

> Just do more work, get more done

That's one of the reasons why I am terrified, because it can lead to burn out, and I personally don't like to babysit bunch of agents, because the output doesn't feel "mine", when its not "mine" I don't feel ownership.

And I am deliberately hitting the brake from time to time not to increase expectations, because I feel like driving someone else's car while not understanding fully how they tuned their car (even though I did those tunings by prompting)

sdf2df · 7 days ago

Stop talking sense bro, you'll get downvoted.

If you look at my post history I'm essentially saying the same stuff lol.

rich_sasha · 7 days ago

I find LLMs are good at essentially boilerplate code. It's clear what to do and it needs to be typed in. Or areas where I really have no idea where to start, because I'm not familiar with the codebase.

I find anything else, I spend more time coaxing them into doing 85% of what I need that I'm better off doing it myself.

So they're not useless but there's only so many times in a week that I need a function to pretty-print a table in some fashion. And the code they write on anything more complex than a snippet is usually written poorly enough that it's a write-once-never-touch-again situation. If the code needs to be solid, maintainable, testable, correct (and these are kind of minimal requirements in my book) then LLMs make little impact on my productivity.

They're still an improvement on Google and Stack exchange, but again - only gets you so far.

YMMV

vividfrier · 7 days ago

> I find anything else, I spend more time coaxing them into doing 85% of what I need that I'm better off doing it myself.

You must be working in a very niche field with very niche functionality if that's the case? I work at a company just outside of FAANG and I work in compliance. Not a terribly complex domain but very complicated scale and data integrity requirements.

I haven't written a single line of code manually in 2 weeks. Opus 4.6 just... works. Even if I don't give it all the context it just seems to figure things out. Occasionally it'll make an architectural error because it doesn't quite understand how the microservices interact. But these are non-trivial errors (i.e. humans could have made them as well) and when we identify such an error, we update the team-shared CLAUDE.md to make sure future agents don't repeat the error.

joenot443 · 7 days ago

> I spend more time coaxing them into doing 85% of what I need that I'm better off doing it myself

What was the last thing you built in which you felt this was the case?

mirsadm · 7 days ago

I have an app which is fairly popular. This release cycle I used Claude Code and codex to implement all the changes / features. It definitely let me move much quicker than before.

However now that it's in the beta stage the amount of issues and bugs is insane. I reviewed a lot of the code that went in as well. I suspect the bug fixing stage is going to take longer than the initial implementation. There are so many issues and my mental model of the codebase has severely degraded.

It was an interesting experiment but I don't think I would do it again this way.

aurareturn · 7 days ago

The last 10% takes up 90% of the time. Usually, the time is spent finding issues you didn't even know about. This was true before LLMs.

johsole · 7 days ago

I make mistakes when writing code, but I know what types of mistakes I make. With AI it's like a coworker who makes mistakes, sometimes they're obvious to me and sometimes they're not, because I make different mistakes.

truetraveller · 7 days ago

Thanks for the insight!

sdf2df · 7 days ago

"There are so many issues and my mental model of the codebase has severely degraded."

Not only that, the less coding you do in general? Guess what, fixing issues that in the past wouldve been a doddle (muscle memory) become less harder due to atrophy.

Swear most people dont think straight and cant see the obvious.

sdf2df · 7 days ago

Congrats. Now post this more often so the bozo's who downvote posts that push-against pro-LLM stuff f-off.

I came to the same conclusion when producing a video with Grok. Did the job but utterly painful and it was definitely very costly - I used 50 free-trial accounts and maxed them out each day for a month.

Im pretty sure these conclusions hold across all models and therefore the technology by extension.

maplethorpe · 7 days ago

Rather than trying to fix the bugs yourself, have you tried asking Claude to fix them for you?

jpollock · 8 days ago

The last time I tried AI, I tested it with a stopwatch.

The group used feature flags...

    if (a) {
       // new code
    } else {
       // old code
    }

    void testOff() {
       disableFlag(a);
       // test it still works
    }
    
    void testOn() {
        enableFlag(a);
        // test it still works
    }

However, as with any cleanup, it doesn't happen. We have thousands of these things lying around taking up space. I thought "I can give this to the AI, it won't get bored or complain."

I can do one flag in ~3minutes. Code edit, pr prepped and sent.

The AI can do one in 10mins, but I couldn't look away. It kept trying to use find/grep to search through a huge repo to find symbols (instead of the MCP service).

Then it ignored instructions and didn't clean up one or the other test, left unused fields or parameters and generally made a mess.

Finally, I needed to review and fix the results, taking another 3-5 minutes, with no guarantee that it compiled.

At that point, a task that takes me 3 minutes has taken me 15.

Sure, it made code changes, and felt "cool", but it cost the company 5x the cost of not using the AI (before considering the token cost).

Even worse, the CI/CD system couldn't keep up the my individual velocity of cleaning these up, using an automated tool? Yeah, not going to be pleasant.

However, I need to try again, everyone's saying there was a step change in December.

laserlight · 7 days ago

I did my own experiment with Claude Code vs Cursor tab completion. The task was to convert an Excel file to a structured format. Nothing fancy at all.

Claude Code took 4 hours, with multiple prompts. At the end, it started to break the previous fixes in favor of new features. The code was spaghetti. There was no way I could fix it myself or steer Claude Code into fixing it the right way. Either it was a dead-end or a dice roll with every prompt.

Then I implemented my own version with Cursor tab completion. It took the same amount of time, 4 hours. The code had a clear object-oriented architecture, with a structure for evolution. Adding a new feature didn't require any prompts at all.

As a result, Claude Code was worse in terms of productivity: the same amount of time, worse quality output, no possibility of (or at best very high cost of) code evolution.

sensanaty · 7 days ago

Similar happened to me just now. Claude whatever-is-the-latest-and-greatest, in Claude Code. I also tried out Windsurf's Arena Mode, with the same failure. To intercept the inevitable "holding it wrong" comments, we have all the AGENTS.md and RULES.md files and all the other snake oil you're told to include in the project. It has full context of the code, and even the ticket. It has very clear instructions on what to do (the kind of instructions I would trust an unpaid intern with, yet alone a tool marketed as the next coming of Cyber Jesus that we're paying for), in a chat with minimal context used up already. I manually review every command it runs, because I don't trust it running shell scripts unsupervised.

I wanted it to finish up some tests that I had already prefilled, basically all the AI had to do was convert my comments into the final assertions. A few minutes later of looping, I see it finishes and all tests are green.

A third of the tests were still unfilled, I guess left as an exercise for the reader. Another third was modified beyond what I told it to do, including hardcoding some things which made the test quite literally useless and the last third was fine, but because of all the miscellaneous changes it made I had to double check those anyways. This is about the bare minimum where I would expect these things to do good work, a simple take comment -> spit out the `assert()` block.

I ended up wasting more time arguing with it than if I had just done the menial task of filling out the tests myself. It sure did generate a shit ton of code though, and ran in an impressive looking loop for 5-10 minutes! And sure, the majority of the test cases were either not implemented or hardcoded so that they wouldn't actually catch a breakage, but it was all green!!

That's ultimately where this hype is leading us. It's a genuinely useful tool in some circumstances, but we've collectively lost the plot because untold billions have poured into these systems and we now have clueless managers and executives seeing "tests green -> code good" and making decisions based on that.

embedding-shape · 7 days ago

What model, what harness and about how long was your prompt to fire off this piece of work? All three matters a lot, but importantly missing from your experience.

Dead Comment

wasmainiac · 8 days ago

Because its failure rate is too high. Beyond boilerplate code and CRUD apps, if I let AI run freely on the projects I maintain, I spend more time fixing its changes than if I just did it myself. It hallucinates functionally, it designs itself into corners, it does not follow my instructions, it writes too much code for simple features.

It’s fine at replacing what stack overflow did nearly a decade ago, but that isn’t really an improvement from my baseline.

leptons · 8 days ago

That's my experience too. It's okay at a few things that save me some typing, but it isn't really going to do the hard work for me. I also still need to spend significant amounts of time figuring out what it did wrong and correcting it. And that's frustrating. I don't make those mistakes, and I really dislike being led down bad paths. If "code smells" are bad, then "AI" is a rotting corpse.

qudat · 7 days ago

It’s not that it just makes mistakes but it also implements things in ways I don’t like or are not relevant to the business requirements or scope of the feature / project.

I end up replacing any saved time with QA and code review and I really don’t see how that’s going to change.

In my mind I see Claude as a better search engine that understands code well enough to find answers and gain understanding faster. That’s about it.

apsurd · 8 days ago

AI dramatically increases velocity. But is velocity productivity? Productivity relative to which scope: you, the team, the department, the company?

The question is really, velocity _of what_?

I got this from a HN comment. It really hit for me because the default mentality for engineers is to build. The more you build the better. That's not "wrong" but in a business setting it is very much necessary but not sufficient. And so whenever we think about productivity, impact, velocity, whatever measure of output, the real question is _of what_? More code? More product surface area? That was never really the problem. In fact it makes life worse majority of the time.

mattmanser · 8 days ago

The real question is, is it increasing their velocity?

They've already admitted they just 'throw the code away and start again'.

I think we've got another victim of perceived productivity gains vs actual productivity drop.

People sitting around watching Claude churn out poor code at a slower rate than if they just wrote it themselves.

Don't get me wrong, great for getting you started or writing a little prototype.

But the code is bad, riddled with subtle bugs and if you're not rewriting it and shoving large amounts of AI code into your codebase, good luck in 6-12 months time.

tripledry · 7 days ago

Something I've been thinking about lately is if there is value in understanding the systems we produce and if we expected to?

If I can just vibe and shrug when someone asks why production is down globally then I'm sure the amount of features I can push out increases, but if I am still expected to understand and fix the systems I generate, I'm not convinced it's actually faster to vibe and then try to understand what's going on rather than thinking and writing.

In my experience the more I delegate to AI, the less I understand the results. The "slowness and thinking" might just be a feature not a bug, at times I feel that AI was simply the final straw that finally gave the nudge to lower standards.

joe_mamba · 7 days ago

>if I can just vibe and shrug when someone asks why production is down globally

You're pretty high up in the development, decision and value-addition chain, if YOU are the responsible go-to person for these questions. AI has no impact on your position.

kodablah · 8 days ago

> People who are saying they're not seeing productivity boost, can you please share where is it failing?

At review time.

There are simply too many software industries that can't delegate both authorship _and_ review to non-humans because the maintenance/use of such software, especially in libraries and backwards-compat-concerning environments, cannot justify an "ends justifies the means" approach (yet).

msvana · 8 days ago

I work as an ML engineer/researcher. When I implement a change in an experiment it usually takes at least an hour to get the results. I can use this time to implement a different experiment. Doesn't matter if I do it by hand or if I let an agent do it for me, I have enough time. Code isn't the bottleneck.

I also heard an opinion that since writing code is cheap, people implement things that have no economic value without really thinking it through.

apsurd · 8 days ago

+1 on the economic value line. Not everything needs to be about money but if you get paid to ship code it's about money. And now we have coworkers shipping insane amounts of "features" because it's all free to ship and being an engineer, it ends there.

Only it doesn't, there's product positioning, UX, information architecture, onboarding and training, support, QA, change management, analytics, reporting… sigh

kranke155 · 8 days ago

I work in commercials.

We can now make 1$ million dollar commercials with 100,000$ or less. So a 90% reduction in costs - if we use AI.

The issue is they don’t look great. AI isn’t that great at some key details.

But the agencies are really trying to push for it.

They think this is the way back to the big flashy commercials of old. Budgets are lower than ever, and shrinking.

Big issue here is really the misunderstanding of cause - budgets are lower, because advertising has changed in general (TV is less and less important ) and a lot of studies showed that advertising is actually not all that effective.

So they are grabbing onto a lifeboat. But I’m worried there’s no land.

I’ve planned my exit.

uxcolumbo · 8 days ago

Advertisement is not that effective in general or just for certain channels, i.e. TV?

Also what are you existing to?

exfalso · 7 days ago

It's failing when there is no data in the training set, and there are no patterns to replicate in the existing code base.

I can give you many, many examples of where it failed for me:

1. Efficient implementation of Union-Find: complete garbage result 2. Spark pipelines: mostly garbage 3. Fuzzer for testing something: half success, non-replicateable ("creative") part was garbage. 4. Confidential Computing (niche): complete garbage if starting from scratch, good at extracting existing abstractions and replicating existing code.

Where it succeeds: 1. SQL queries 2. Following more precise descriptions of what to do 3. Replicating existing code patterns

The pattern is very clear. Novel things, things that require deeper domain knowledge, coming up with the to-be-replicated patterns themselves, problems with little data don't work. Everything else works.

I believe the reason why there is a big split in the reception is because senior engineers work on problems that don't have existing solutions - LLMs are terrible at those. What they are missing is that the software and the methodology must be modified in order to make the LLM work. There are methodical ways to do this, but this shift in the industry is still in baby shoes, and we don't yet have a shared understanding of what this methodology is.

Personally I have very strong opinions on how this should be done. But I'm urging everyone to start thinking about it, perhaps even going as far as quitting if this isn't something people can pursue at their current job. The carnage is coming:/

belZaah · 8 days ago

I don’t think the objections are not necessarily in terms of lack of productivity although my personal experience is not that of massive productivity increases. The fact that you are producing code much faster is likely just to push the bottleneck somewhere else. Software value cycles are long and complicated. What if you run into an issue in 5 years the LLM fails to diagnose or fix due to complex system interactions? How often would that happen? Would it be feasible to just generate the whole thing anew matching functionality precisely? Are you making the right architecture choices from the perspective of what the preferred modus operandi of an llm is in 5 years? We don’t know. The more experienced folks tend to be conservative as they have experienced how badly things can age. Maybe this time it’ll be different?

dumfries · 7 days ago

"it works" is a very low standard when it comes to software engineering. Why are we not holding AI generated code to the same standards as we hold our peers during code reviews?

I have never heard anyone say "it works" as a positive thing when reviewing code..

Yes, there is a productivity boost but you can't tell me there is no decrease in quality

oytis · 8 days ago

> I have stopped writing code, occasionally I jump into the changes proposed by LLM and make manual edits if it is feasible, otherwise I revert changes and ask it to generate again but based on my learnings from the past rejected output

Isn't it a very inefficient way to learn things? Like, normally, you would learn how things work and then write the code, refining your knowledge while you are writing. Now you don't learn anything in advance, and only do so reluctantly when things break? In the end there is a codebase that no one knows how it works.

throwaw12 · 7 days ago

> Isn't it a very inefficient way to learn things?

It is. But there are 2 things:

1. Do I want to learn that? (if I am coming back to this topic again in 5 months, knowledge accumulates, but there is a temptation to finish the thing quickly, because it is so boring to swim in huge legacy codebase)

2. How long it takes to grasp it and implement the solution? If I can complete it with AI in 2 days vs on my own in 2 weeks, I probably do not want to spend too much time on this thing

as I mentioned in other comments, this is exactly makes me worried about future of the work I will be doing, because there is no attachment to the product in my brain, no mental models being built, no muscles trained, it feels someone else's "work", because it explores the code, it writes the code. I just judge it when I get a task

hobofan · 8 days ago

> you would learn how things work and then write the code

In a legacy codebase this may require learning a lot of things about how things work just to make small changes, which may be much less efficient.

pinkmuffinere · 8 days ago

I asked opus 4.6 how to administer an A/B test when data is sparse. My options are to look at conversion rate, look at revenue per customer, or something else. I will get about 10-20k samples, less than that will add to cart, less than that will begin checkout, and even less than that will convert. Opus says I should look at revenue per customers. I don't know the right answer, but I know it is not to look at revenue per customers -- that will have high variance due to outlier customers who put in a large order. To be fair, I do use opus frequently, and it often gives good enough answers. But you do have to be suspicious of its responses for important decisions.

Edit: Ha, and the report claims it's relatively good at business and finance...

Edit 2: After discussion in this thread, I went back to opus and asked it to link to articles about how to handle non-normally distributed data, and it actually did link to some useful articles, and an online calculator that I believe works for my data. So I'll eat some humble pie and say my initial take was at least partially wrong here. At the same time, it was important to know the correct question to ask, and honestly if it wasn't for this thread I'm not sure I would have gotten there.

onion2k · 8 days ago

A/B tests are a statistical tool, and outliers will mess with any statistical measure. If your data is especially prone to that you should be using something that accounts for them, and your prompt to Opus should tell it to account for that.

A good way to use AI is to treat it like a brilliant junior. It knows a lot about how things work in general but very little about your specific domain. If your data has a particular shape (e.g lots of orders with a few large orders as outliers) you have to tell it that to improve the results you get back.

gurghet · 7 days ago

Basically it tricks you into making the code less maintainable, so while it seems to boost productivity, it's just delaying huge failures.

rootusrootus · 7 days ago

Exactly this, IMO. We are in a race to see whether the mountain of technical debt that AI is creating grows faster than AI's ability to whittle it down later.

iugtmkbdfil834 · 8 days ago

I don't want to generalize from my specific situation too much, but I want to offer an anecdote from my neck of the woods. On my personal sub, I agree it is kinda crazy the kind of projects I can get into now with little to no prior knowledge.

On the other hand, our corporate AI is.. not great atm. It was briefly kinda decent and then suddenly it kinda degraded. Worst case is, no one is communicating with us so we don't know what was changed. It is possible companies are already trying to 'optimize'.

I know it is not exactly what you are asking. You are saying capability is there, but I am personally starting to see a crack in corporate willingness to spend.

staticassertion · 8 days ago

When it comes to novel work, LLMs become "fast typers" for me and little more. They accelerate testing phases but that's it. The bar for novelty isn't very high either - "make this specific system scale in a way that others won't" isn't a thing an LLM can ever do on its own, though it can be an aid.

LLMs also are quite bad for security. They can find simple bugs, but they don't find the really interesting ones that leverage "gap between mental model and implementation" or "combination of features and bugs" etc, which is where most of the interesting security work is imo.

asadm · 8 days ago

I think your analysis is a bit outdated these days or you may be holding it wrong.

I am doing novel work with codex but it does need some prompting ie. exploring possibilities from current codebase, adding papers to prompt etc.

For security, I think I generally start a new thread before committing to review from security pov.

gilbetron · 7 days ago

What was your take on this?

https://aisle.com/blog/what-ai-security-research-looks-like-...

truetraveller · 7 days ago

This is basically my take as well!

girvo · 7 days ago

Sometimes I’m scared.

Sometimes I realise that this particular task has been slower than if I’d done it myself when I take in to account full wall clock time.

I can’t tell what type of task is going to work ahead of time yet.

vividfrier · 7 days ago

Same. Whenever an article like this one pops up the comments seem to be filled with confirmation bias. People who don't see a productivity boost agree with the article.

I work at tech company just outside of big tech and I feel fairly confident that we won't have a need for the amount of developers we currently have within 3-4 years.

The bottleneck right now is reviewing and I think it's just a matter of time before our leadership removes the requirement for human code reviews (I am already seeing signs of this ("Maybe for code behind feature flags we don't need code reviews?").

Whenever there's an incident, there is a pagerduty trigger to an agent looking at the metrics, logs, software component graphs, and gives you an hypothesis on what the incident is due to. When I push a branch with test failures, I get one-click buttons in my PR to append commits fixing those tests failures (i.e. an agent analyses the code, the jira ticket, the tests, etc. and suggests a fix for the tests failing). We have a Slack agent we can ping in trivial feature requests (or bugs) in our support channels.

The agents are being integrated at every step. And it's not like the agents will stop improving. The difference between GPT3.5 and Opus 4.6 is so massive. So what will the models look like in 5 years from now?

We're cooked and the easiest way to tell someone works at a company who hasn't come very far in their AI journey is that they're not worried.

dataflow · 8 days ago

I feel like this might be heavily dependent on both your task and the AI you're using? What language do you code in and what AI do you use? And are your tasks pretty typical/boilerplate-y with prior art to go off of, or novel/at-the-edge-of-tech?

sivanmz · 8 days ago

It’s been my experience as of recently. I point it at an issue tracker and ask it to investigate, write a test to reproduce the problem and plan a fix together. There’s lots of hand holding from me but it saves me a lot of work and I’ve been surprised by its comfort with legacy code bases. For now I feel empowered, and I’m actually working more intensively, but I was wondering to myself if I’m going run out of work this year. Interestingly, our metrics show that output is slowed by increased workload on reviewers.

matheusmoreira · 7 days ago

Terrifying doesn't quite say it. The situation we're in is either we achieve a post-scarcity society or we'll all die.

aurareturn · 8 days ago

  People who are saying they're not seeing productivity boost, can you please share where is it failing?

Believe it or not, I still know many devs who do not use any agents. They're still using free ChatGPT copy and paste.

I'm going to guess that many people on HN are also on the "free ChatGPT isn't that good at programming" train.

dataflow · 8 days ago

Which one would you recommend as the best right now? Claude Code?

throwaw12 · 8 days ago

> They're still using free ChatGPT copy and paste

Probably that's the reason why some people are sure their job is still safe.

Nature of job is changing rapidly

salawat · 8 days ago

Not everyone has the capability to rent out data center tier hardware to just do their job. These things require so much damn compute you need some serious heft to actually daisy chain enough stages either in parallel or deep to get enough tokens/sec for the experience to go ham. If you're making bags o' coke money, and deciding to fund Altman's, Zuckernut's or Amazon/Google's/Microsoft's datacenter build out, that's on you. Rest of us are just trying to get by on bits and bobs we've kept limping over the years. If opencode is anything to judge the vibecoded scene by, I'm fairly sure at some point the vibe crowd will learn the lesson of isolating the most expensive computation ever from the hot loop, then maybe find one day all they needed was maybe something to let the model build a context, and a text editor.

Til then wtf_are_these_abstractions.jpg

boxedemp · 8 days ago

I'm with you. The project I'm working on is moving at phenomenal velocity. I'm basically spending my time writing specs and performing code reviews. As long as my code review comments and design docs are clear I get a secure, scalable, and resilient system.

Tests were always important, but now they are the gatekeepers to velocity.

motbus3 · 7 days ago

I think you can more stuff done earlier but the quality is not good or it doesn't work as expected if you tinker with it enough. Fixing the issues from the generated code usually doesn't work at all

RandomLensman · 8 days ago

Outside of coding/non-physical areas, the impact can be quite muted. I haven't seen much impact on surgical procedures, for example (but maybe others have?).

KronisLV · 8 days ago

I’m currently working across like 5 projects (was 4 last week but you know how it is). I now do more in days than others might in a week.

Yesterday a colleague didn’t quite manage to implement a loading container with a Vue directive instead of DOM hacks, it was easier for me to just throw AI at the problem and produced a working and tested solution and developer docs than to have a similarly long meeting and have them iterate for hours.

Then I got back to training a CNN to recognize crops from space (ploughing and mowing will need to be estimated alongside inference, since no markers in training data but can look at BSI changes for example), deployed a new version of an Ollama/OpenAI/Anthropic proxy that can work with AWS Bedrock and updated the docs site instructions, deployed a new app that will have a standup bot and on-demand AI code review (LiteLLM and Django) and am working on codegen to migrate some Oracle forms that have been stagnating otherwise.

It’s not funny how overworked I am and sure I still have to babysit parallel Claude Code sessions and sometimes test things manually and write out changes, but this is a completely different work compared to two or three years ago.

Maybe the problem spaces I’m dealing with are nothing novel, but I assume most devs are like that - and I’d be surprised at people’s productivity not increasing.

When people nag in meetings about needing to change something in a codebase, or not knowing how to implement something and its value add, I’ll often have something working shortly after the meeting is over (due to starting during it).

Instead of sending adding Vitest to the backlog graveyard, I had it integrated and running in one or two evenings with about 1200 tests (and fixed some bugs). Instead of talking about hypothetical Oxlint and Oxfmt performance improvements, I had both benchmarked against ESLint and Prettier within the hour.

Same for making server config changes with Ansible that I previously didn’t due to additional friction - it is mostly just gone (as long as I allow some free time planned in case things vet fucked up and I need to fix them).

Edit: oh and in my free time I built a Whisper + VLM + LLM pipeline based on OpenVINO so that I can feed it hours long stream VODs and get an EDL cut to desired length that I can then import in DaVinci Resolve and work on video editing after the first basic editing prepass is done (also PyScene detect and some audio alignment to prevent bad cuts). And then I integrated it with subscription Claude Code, not just LiteLLM and cloud providers with per-token costs for the actual cuts making part (scene description and audio transcriptions stay local since those don't need a complex LLM, but can use cloud for cuts).

Oh and I'm moving from my Contabo VPSes to running stuff inside of a Hetzner Server Auction server that now has Proxmox and VMs in that, except this time around I'm moving over to Ansible for managing it instead of manual scripts as well, and also I'm migrating over from Docker Swarm to regular Docker Compose + Tailscale networks (maybe Headscale later) and also using more upstream containers where needed instead of trying to build all of mine myself, since storage isn't a problem and consistency isn't that important. At the same time I also migrated from Drone CI to Woodpecker CI and from Nexus to Gitea Packages, since I'm already using Gitea and since Nexus is a maintenance burden.

If this becomes the new “normal” in regards to everyone’s productivity though, there will be an insane amount of burnout and devaluation of work.

Karrot_Kream · 8 days ago

> When people nag in meetings about needing to change something in a codebase, or not knowing how to implement something and its value add, I’ll often have something working shortly after the meeting is over (due to starting during it).

We've started building harnesses to allow people who don't understand code to create PRs to implement their little nags. We rely on an engineer to review, merge, and steward the change but it means that non-eng folks do not rely on us as a gate. (We're a startup and can't really afford "teams" to do this hand-holding and triage for us.)

As you say we're all a bit overworked and burned out. I've been context switching so much that on days when I'm very productive I've started just getting headaches. I'm achieving a lot more than before but holding the various threads in my head and context switching is just a lot.

leptons · 8 days ago

>I now do more in days than others might in a week.

I've always done more in days than others might in a week. YMMV.

drekipus · 7 days ago

> my job is easier now, I do less. > I am terrified about what's coming.

God I hope I never ever have to work with you

fulafel · 8 days ago

A terminology tangent because it's an econ publication: Notice that the article doesn't talk about productivity.

Productivity is a term of art in economics and means you generate more units of output (for example per person, per input, per wages paid) but doesn't take quality or otherwise desireability into account. It's best suited for commodities and industrial outputs (and maybe slop?).

randusername · 7 days ago

I see an individual productivity boost, but not necessarily a collective one.

I don't think features per hour is really what is holding back most established businesses.

My experiences suggest that we still have some time before the people that understand the plumbing of the business _and_ AI bubble up to positions of authority through wielding it practically and successfully at increasingly greater scale.

lm28469 · 7 days ago

Meanwhile gemini tells me my go code doesn't compile (it does)

Gaslight me by telling me I must be a time traveler because I use go 1.26 but the latest version actually is 1.24

And tell me I can't use wg.Go() because this function does not exist (it does)

truetraveller · 8 days ago

You were probably deficient in RESEARCH skills before. No offense to you, since I was also like this once. LLMs research and put the results on the plate. Yes, for people who were deficient in research skills, I can see 2-3x improvements.

Note1: I have "expert" level research skills. But LLMs still help me in research, but the boost is probably 1.2x max. But

Note2: By research, I mean googling, github search, forum search, etc. And quickly testing using jsfiddle/codepen, etc.

throwaw12 · 8 days ago

no worries, I do not get offended quickly.

But I also think you are overestimating your RESEARCH skills, even if you are very good at research, I am sure you can't read 25 files in parallel, summarize them (even if its missing some details) in 1 minute and then come up with somewhat working solution in the next 2 minutes.

I am pretty sure, humans can't comprehend reading 25 code files with each having at least 400 lines of non-boilerplate code in 2 minutes. LLM can do it and its very very good at summarizing.

I can even steer its summarizing skills by prompting where to focus on when its reading files (because now I can iterate 2-3 times for each RESEARCH task and improve my next attempt based on shortcomings in the previous attempt)

siva7 · 8 days ago

Ok Mr. Expert Level Researcher, go back and read the comment of parent again to find out that it has nothing to do with deficiency in research skills.

throwaw12 · 8 days ago

please don't change your comment constantly (or at maybe add UPDATE 1/2/3), because you had different words before, like you were saying something in this fashion:

* you probably lack good RESEARCH skills

* I can see at most 1.25x improvements - now it is 2-3x

By updating your comment you are making my reply irrelevant to your past response

therealdrag0 · 8 days ago

I can only explain it by people not having used Agentic tools and or only having tried it 9 months ago for a day before giving up or having such strict coding style preferences they burn time adjusting generated code to their preferences and blaming the AI even though they’re non-functional changes and they didn’t bother to encode them into rules.

The productivity gains are blatantly obvious at this point. Even in large distributed code bases. From jr to senior engineer.

MattGaiser · 8 days ago

I can see someone who is very particular about their way being the right way having issues with it. I’m very much the kind of person who believes that if I can’t write a failing test, I don’t have a very serious case. A lot of devs aren’t like that.

zozbot234 · 7 days ago

> I am terrified about what's coming

Why? This is great. AI fixing up huge legacy codebases is just taking the jobs humans would never want to do.

I don't write code for a living but I administer and maintain it.

Every time I say this people get really angry, but: so far AI has had almost no impact on my job. Neither my dev team nor my vendors are getting me software faster than they were two years ago. Docker had a bigger impact on the pipeline to me than AI has.

Maybe this will change, but until it does I'm mostly watching bemusedly.

kdheiwns · 8 days ago

Yep. All AI has done for me is give me the power of how good search engines were 10+ years ago, where I could search for something and find actually relevant and helpful info quickly.

I've seen lots of people say AI can basically code a project for them. Maybe it can, but that seems to heavily depend on the field. Other than boilerplate code or very generic projects, it's a step above useless imo when it comes to gamedev. It's about as useful as a guy who read some documentation for an engine a couple years ago and kind of remembers it but not quite and makes lots of mistakes. The best it can do is point me in the general direction I need to go, but it'll hallucinate basic functions and mess up any sort of logic.

kranner · 8 days ago

My experience is the same. There are modest gains compensating for lack of good documentation and the like, but the human bottlenecks in the process aren't useless bureaucracy. Whether or not a feature or a particular UX implementation of it makes sense, these things can't be skipped, sped up or handed off to any AI.

bee_rider · 8 days ago

Thinking of it, I haven’t seen as many “copy paste from StackOverflow” memes lately. Maybe LLMs have given people the ability to

1) Do that inside their IDEs, which is less funny

2) Generate blog post about it instead of memes

mchaver · 7 days ago

> All AI has done for me is give me the power of how good search engines were 10+ years ago

So the good old days before search engines were drowning with ads and dark patterns. My assumption is big LLMs will go in the same direction after market capture is complete and they need to start turning a profit. If we are lucky the open source models can keep up.

throwaw12 · 8 days ago

> how good search engines were 10+ years ago

For me this is a huge boost in productivity. If I remember how I was working in the past (example of Google integration):

Before:

    * go through docs to understand how to start (quick start) and things to know
    * start boilerplate (e.g. install the scripts/libs)
    * figure out configs to enable in GCP console
    * integrate basic API and test
    * of course it fails, because its Google API, so difficult to work with
    * along the way figure out why Python lib is failing to install, oh version mismatch, ohh gcc not installed, ohh libffmpeg is required,...
    * somehow copy paste and integrate first basic API
    * prepare for production, ohhh production requires different type of Auth flow
    * deploy, redeploy, fix, deploy, redeploy
    * 3 days later -> finally hello world is working

Now:

    * Hey my LLM buddy, I want to integrate Google API, where do I start, come up with a plan
    * Enable things which requires manual intervention
    * In the meantime LLM integrates the code, install lib, asks me to approve installation of libpg, libffmpeg,....
    * test, if fails, feed the error back to LLM + prompt to fix it
    * deploy

redhed · 7 days ago

What language/engine did you try it with for gamedev? Just curious if it was weak in a popular engine.

demorro · 7 days ago

It makes me wonder if the majority of all-in on AI folks are quite young and never experienced pre-enshittification search.

rhubarbtree · 7 days ago

Are you using Claude Opus 4.5/6?

If not, then you’re not close to the cutting edge.

thewebguyd · 8 days ago

Same here, more or less, in the ops world. Yeah, I use AI but I can't honestly say it's massively improved my productivity or drastically changed my job in any way other than the emails I get from the other managers at my work are now clearly written by AI.

I can turn out some scripts a little bit quicker, or find an answer to something a little quicker than googling, but I'm still waiting on others most of the time, the overall company processes haven't improved or gotten more efficient. The same blockers as always still exist.

Like you said, there has been other tech that has changed my job over time more than AI has. The move to the cloud, Docker, Terraform, Ansible, etc. have all had far more of an impact on my job. I see literally zero change in the output of others, both internally and externally.

So either this is a massively overblown bubble, or I'm just missing something.

linsomniac · 8 days ago

You're missing something.

I've been in ops for 30 years, Claude Code has changed how I work. Ops-related scripting seems to be a real sweet spot for the LLMs, especially as they tend to be smaller tools working together. It can convert a few sentences into working code in 15-30 minutes while you do something else. I've given it access to my apache logs Elastic cluster, and it does a great job at analyzing them ("We suspect this user has been compromised, can you find evidence of that?"). It's quite startling, actually, what it's able to do.

keeda · 8 days ago

> ... but I'm still waiting on others most of the time, the overall company processes haven't improved or gotten more efficient. The same blockers as always still exist.

And that's the key problem, isn't it? I maintain current organizations have the "wrong shape" to fully leverage AI. Imagine instead of the scope of your current ownership, you own everything your team or your whole department owns. Consider what that would do to the meetings and dependencies and processes and tickets and blockers and other bureaucracy, something I call "Conway Overhead."

Now imagine that playing out across multiple roles, i.e. you also take on product and design. Imagine what that would do to your company org chart.

I added a much more detailed comment here: https://news.ycombinator.com/item?id=47270142

tayo42 · 8 days ago

Ops hasn't been in the crosshairs of Ai yet.

Imo it's only a matter of time as companies start to figure out how to use ai. Companies don't seem to have real plans yet and everyone is figuring out ai in general out.

Soon though I will think agents start popping up, things like first line response to pages, executing automation

Deleted Comment

sdf2df · 8 days ago

Youre not missing anything.

Humans are funny. But most cant seem to understand that the tool is a mirage and they are putting false expectations on it. E.g. management of firms cutting back on hiring under the expectation that LLMs will do magic - with many cheering 'this is the worst itll be bro!!".

I just hope more people realise before Anthropic and OAI can IPO. I would wager they are in the process of cleaning up their financials for it.

httpz · 8 days ago

This is a classic case of Productivity Paradox when personal computers were first introduced into workplaces in the 80s.

A famous economist once said, "You can see the computer age everywhere but in the productivity statistics."

There are many reasons for the lag in productivity gain but it certainly will come.

https://en.wikipedia.org/wiki/Productivity_paradox

bandrami · 8 days ago

That's only certain if investments in tech infrastructure always led to productivity increases. But sometimes they just don't. Lots of firms spent a lot of money on blockchain five years ago, for instance, and that money is just gone now.

danbolt · 8 days ago

My unfounded hunch for the computing bit is that home computers became more and more commonplace in the home as we approached the 21st century.

A Commodore 64 was a cool gadget, but “the family computer” became a device that commoditized the productivity. The opportunity cost of applying a computer to try something new went to near zero.

It might have been harder for someone to improve the productivity of an old factory in Shreveport, Louisiana with a computer than it was for the upstarts at id to make Doom.

kranner · 8 days ago

> There are many reasons for the lag in productivity gain but it certainly will come.

Predictions without a deadline are unfalsifiable.

fnordpiglet · 8 days ago

My employer is pretty advanced in its use of these tools for development and it’s absolutely accelerated everything we do to the point we are exhausting roadmaps for six months in a few weeks. However I think very few companies are operating like this yet. It takes time for tools and techniques to make it out and Claude code alone isn’t enough. They are basically planning to let go of most of the product managers and Eng managers, and I expect they’re measuring who is using the AI tools most effectively and everyone else will be let go, likely before years end. Unlike prior iterations I saw at Salesforce this time I am convinced they’re actually going to do it and pull it off. This is the biggest change I’ve seen in my 35 year career, and I have to say I’m pretty excited to be going through it even though the collateral damage will be immense to peoples lives. I plan to retire after this as well, I think this part is sort of interesting but I can see clearly what comes next is not.

p1esk · 8 days ago

I’m observing very similar trends at a startup I’m at. Unfortunately I’m not ready to retire yet.

blackcatsec · 8 days ago

Why are you excited for this? They’re not going to give YOU those peoples’ salaries. You will get none of it. In fact, it will drag your salary through the floor because of all the available talent.

bandrami · 8 days ago

The dev team is committing more than they used to. A lot, in fact, judging from the logs. But it's not showing up as a faster cadence of getting me software to administer. Again, maybe that will change.

righthand · 8 days ago

In my experience it is now twice the amount of merge requests as a follow-up appears to correct any bugs no one reviewed in the first merge request.

whateveracct · 8 days ago

I think they feel more productive but aren't actually.

steve_adams_86 · 7 days ago

In my org I get far more done than ever, but I also find it more exhausting.

Because I can get so much done, I've lost my sense for what's enough. And if I can squeeze out a bit more relatively easily, why wouldn't I? When do I hit the brakes?

There are some tasks where LLMs are not all that helpful, and I find myself kind of savoring those tasks.

I'm surprised you don't notice a difference. Where I work it has been transformative. Perhaps it's because we're relatively small and scrappy, so the change in pace is easier with less organizational inertia. We've dramatically changed processes and increased outputs without a loss in quality. For less experienced programmers who are more interested in simple scripts for processing data, their outputs are actually far better, and they're learning faster because the Claude Code UI exposes them to so many techniques in the shell. I now see people using bash pipes for basic operations who wouldn't have known a thing about bash a couple years ago. The other day a couple less-technical people came to me to learn about what tests are. They never would have been motivated to learn this before. It's really cool.

It doesn't reduce work at all, though. We're an under-funded NGO with high ambition. These changes allow us to do more with the same funding. Hopefully it allows us to get more funding, too. I can't see it leading to anyone being let go; we need every brain we can get.

eucyclos · 8 days ago

A tool with a mediocre level of skill in everything looks mediocre when the backdrop is our own area of expertise and game changing when the backdrop is an unfamiliar one. But I suspect the real game changer will be that everyone is suddenly a polymath.

sibeliuss · 8 days ago

This ^ Exactly it. This will be the change.

redhale · 7 days ago

I don't doubt your sincerity. But this represents an absolutely bonkers disparity compared to the reality I'm experiencing.

I'm not sure what to say. It's like someone claiming that automobiles don't improve personal mobility. There are a lot of logical reasons to be against the mass adoption of automobiles, but "lack of effectiveness as a form of personal mobility" is not one of them.

Hearing things like this does give me a little hope though, as I think it means the total collapse of the software engineering industry is probably still a few years away, if so many companies are still so far behind the curve.

tasuki · 7 days ago

> It's like someone claiming that automobiles don't improve personal mobility.

I prefer walking or cycling and often walk about 8km a day around town, for both mobility and exercise. (Other people's) automobiles make my experience worse, not better.

I'm sure there's an analogy somewhere.

(Sure, automobiles improve the speed of mobility, if that's the only thing you care about...)

bandrami · 7 days ago

I don't think I'm asking for something unreasonable: I'll believe this actually speeds up software creation when one of my vendors starts getting me software faster. That's not some crazy ludditism on my part, I don't think?

lovich · 8 days ago

> so far AI has had almost no impact on my job.

Are you hiring?

cute_boi · 8 days ago

My friend used to say that, and he got quietly fired and outsourced because now someone in India can use ChatGPT to produce similar code, lol.

IMO AI will make 70-80% job obsolete for sure.

LPisGood · 8 days ago

My company has been hiring a ton over the last year or so. Jobs are out there

Kye · 8 days ago

I've taken to calling LLMs processors. A "Hello World" in assembly is about 20 lines and on par with most unskilled prompting. It took a while to get from there to Rust, or Firefox, or 1T parameter transformers running on powerful vector processors. We're a notch past Hello World with this processor.

The specific way it applies to your specific situation, if it exists, either hasn't been found or hasn't made its way to you. It really is early days.

ygouzerh · 7 days ago

I feel that it differs a lot between companies. It seems like corporate are having less an impact for now, as external innovation needs tailoring to adapt to its needs (e.g a security solution that needs 3 month projects to be tailored to the company tech stack), whereas startups and smaller firms see the most of the impact so far.

willmadden · 8 days ago

Build a new feature. If you aren't bogged down in bureaucracy it will happen much faster.

YesBox · 8 days ago

I dont use LLMs much. When I do, the experience always feels like search 2.0. Information at your fingertips. But you need to know exactly what you're looking for to get exactly what you need. The more complicated the problem, the more fractal / divergent outcomes there are. (Im forming the opinion that this is going to be the real limitations of LLMs).

I recently used copilot.com to help solve a tricky problem for me (which uses GPT 5.1):

   I have an arbitrary width rectangle that needs to be broken into smaller 
   random width rectangles (maintaining depth) within a given min/max range.

The first solution merged the remainder (if less than min) into the last rectangle created (regardless if it exceeded the max).

So I poked the machine.

The next result used dynamic programming and generated every possible output combination. With a sufficiently large (yet small) rectangle, this is a factorial explosion and stalled the software.

So I poked the machine.

I realized this problem was essentially finding the distinct multisets of numbers that sum to some value. The next result used dynamic programming and only calculated the distinct sets (order is ignored). That way I could choose a random width from the set and then remove that value. (The LLM did not suggest this). However, even this was slow with a large enough rectangle.

So I poked my brain.

I realized I could start off with a greedy solution: Choose a random width within range, subtract from remaining width. Once remaining width is small enough, use dynamic programming. Then I had to handle the edges cases (no sets, when it's okay to break the rules.. etc)

So the LLMs are useful, but this took 2-3 hours IIRC (thinking, implementation, testing in an environment). Pretty sure I would have landed on a solution within the same time frame. Probably greedy with back tracking to force-fit the output.

bandrami · 8 days ago

Most of these are new features, but then they have to integrate with the existing software so it's not really greenfield. (Not to mention that our clients aren't getting any faster at approving new features, either.)

sdf2df · 8 days ago

Its this kind of thinking that tells me people cant be trusted with their comments on here re. "Omg I can produce code faster and it'll do this and that".

No simply 'producing a feature' aint it bud. That's one piece of the puzzle.

rhubarbtree · 7 days ago

This is not a good thing. If you’re not being exposed and skilling up already, you’re likely to be in the camp that is washed away.

If you can’t be exposed to it in your day job, start using Claude opus in the evening so you know what’s coming.

matkoniecz · 7 days ago

So far I have not seen much skill gain from using LLM extensively.

Maybe I will be replaced by matrix multiplication in my job, but if I need to use LLM at some point I expect little benefit from starting now.

Yes, I tried to use Claude Code two months ago. It was scary, but not useful.

bandrami · 7 days ago

Eh, people have been warning me "you'll be left behind!" about the flavor of the week for decades now and it hasn't happened yet. If it happens it happens.

Deleted Comment

sdf2df · 8 days ago

I will personally say right now... its not gonna change lol.

People who actually know how to think can see it a mile away.

stevenhuang · 8 days ago

It's telling you feel the need to create a throw away to voice this opinion.