Readit News
simonw · 5 months ago
That METR paper - discussed here two weeks ago: https://news.ycombinator.com/item?id=44522772 - has so much sticking power.

It's not a bad paper, but it's also turning into a fantastic illustration of how much thirst there is out there for anything that shows that AI productivity doesn't work.

I just learned there was a 4 minute TV news segment about it on CNBC! https://www.youtube.com/watch?v=WP4Ird7jZoA

timr · 5 months ago
> It's not a bad paper, but it's also turning into a fantastic illustration of how much thirst there is out there for anything that shows that AI productivity doesn't work.

Maybe. I think it's a fantastic illustration of anyone doing anything to provide something other than hype around the subject. An actual RCT? Too good to be true! The thirst is for fact vs. speculation, influencer blogposts and self-promotion.

That this RCT provided evidence opposing the hype is, of course, irresistible.

simonw · 5 months ago
That's what I liked about the paper: the methodology felt better to me than many other studies. I noted that when I blogged about it (second-to-last paragraph https://simonwillison.net/2025/Jul/12/ai-open-source-product... )
ofjcihen · 5 months ago
To be fair that thirst probably comes from people who aren’t seeing the gains the hype would lead you to believe and are reaching into the void to not feel like they’re taking crazy pills.

It’s also probably not coming from a place of “I’m scared of AI so I want it to fail” but more like “my complex use case doesn’t work with AI and I’m really wondering why that is”.

There’s this desire, it seems, to think of people who aren’t on the hype train as “against” AI, but people need to remember that these are most likely devs with a decade of experience who have been evaluating the usefulness of the tools they use for a long time.

jsbisviewtiful · 5 months ago
Personal take, but I think some people also understand that the hype machine around AI is coming from the rich and C-level people. Meanwhile companies are widely and openly axing jobs and/or not paying artists, citing AI as the source of their new fortune. Personally, my use of AI in my job has so far not been that fruitful. It has so far dramatically underdelivered on its utopian promises, and we are instead actively seeing it used to undercut the 99% - and that’s not even getting into the environmental impact or the hellscape it’s made of the Internet.
Yoric · 5 months ago
> To be fair that thirst probably comes from people who aren’t seeing the gains the hype would lead you to believe and are reaching into the void to not feel like they’re taking crazy pills.

Yes, exactly!

I've spent way too much time trying to get anything remotely close to an LLM writing useful code. Yeah, I'm sure it can speed up writing code that I can write in my sleep, but I want it to write code I can learn from, and so far, my success rate is ~0 (although the documentation alongside the bogus code is sometimes a good starting point).

Having my timelines filled by people who basically claim that I'm just an idiot for failing to achieve that? Yeah, it's craze-inducing.

Every time I see research that appears to confirm the hype, I see a huge hole in the protocol.

Now finally, some research confirming my observations? It feels so good!

Terr_ · 5 months ago
Trying out an analogy:

The time is the early 2000s, and the Segway™ is being suggested as the archetype of almost all future personal transportation in cities and suburbs. I don't hate the product, there's neat technology there, they're fun to mess with, but... My bullshit sensor is still going off.

I become tired of being told that I'm just not using enough imagination, or that I would understand if only I was plugged into the correct social-groups of visionaries who've given arguments I already don't find compelling.

Then when somebody does a proper analysis of start/stop distance, road throughput, cargo capacity, etc, that's awesome! Finally, some glimmer of knowledge to push back the fog of speculation.

Sure, there's a nonzero amount of confirmation bias going on, but goshdangit at least I'm getting mine from studies with math, rather than the folks getting it from artistic renderings of self-balancing vehicles filling a street in the year 2025.

goalieca · 5 months ago
For some, it is also coming from a place that their company leadership is mandating AI use.
emp17344 · 5 months ago
Well, there aren’t any studies showing AI agents boost productivity, so it’s all we’ve got. It seems like a well-conducted study, so I’m inclined to trust its conclusions.
simonw · 5 months ago
One of the articles linked from the OP includes links to such studies: https://theconversation.com/does-ai-actually-boost-productiv... - scroll down to the "AI and individual productivity" section, there are two papers there on the "increases productivity" side followed by two others that didn't.
bluefirebrand · 5 months ago
Anyone who is an employee drawing a salary should be extremely hopeful that AI productivity doesn't work.

Why should we be eager to find out that some new tech is going to undercut us and replace us, devaluing us even more than we already are?

simonw · 5 months ago
Do you benefit from open source?

Open source packages are the biggest productivity boost of my entire career, at no point did I think "wow, I wish these didn't exist, they're a threat to my livelihood".

kbelder · 5 months ago
Should we be wary of any productivity gains?

Should we be looking for ways to work slower? I can go back to just one monitor.

ajaisjdh · 5 months ago
There’s literally billions of dollars on the “pro AI side”.

What you’re seeing is a thirst for objective reporting. The average person only has the ability to provide anecdotes - many of which are in stark contrast to the narrative pushed by the billionaires pumping AI.

I don’t think anyone serious thinks AI isn’t useful in some capacity - but it’s more like a bloom filter than a new branch of mathematics. Magically powerful in specific use cases, but not a paradigm shift.

didibus · 5 months ago
Personally, I think this is a disingenuous take. The thirst is for tangible data, the issue is that we've never been able to measure any form of productivity/quality in software development.

My team does two person PR reviews for example. We'd go a lot faster if we didn't or even just allowed a single reviewer. Similarly, we have no idea what the quality impact would be if we stopped, or what we gain by doing them. Why are we not having a 3 reviewer rule, for example? Two is an arbitrary number.

Unit tests... We'd surely go a lot faster if we didn't bother with them. Teams used to have some dedicated QA members and you'd rely entirely on manual testing. You can push a lot more code out. Was software in the 90s, when unit tests and integ tests weren't used, buggier than today's software?

Now take AI, what is the impact of its use? It's not even obvious if it reduced the time it takes to launch a feature, my team isn't suddenly ahead of schedule on all our projects, even though we all use Agentic tools actively now. Ask any one of us and "I think it makes us faster" will be the answer. But ask us why we have a 2 person review rule and we'd similarly say: "I think it prevents bugs and improves the code quality".

The difference with AI now is that you pay for it, it's not free. Having unit tests or doing a 2 person review is just a process change. AI is something you pay for, so there's more desire to know for sure. People would also like to know whether they can lower their headcount without impacting their competitive edge and ability to deliver fast and with good enough quality. Nobody wants to lower the headcount and find out the hard way.

thewebguyd · 5 months ago
> the issue is that we've never been able to measure any form of productivity/quality in software development.

Yep. It's been a "problem" for decades at this point. Business types constantly trying, and failing, to find some way to measure dev productivity, like they can with other types of office drone work.

We've been through Lines of Code, Function Points, various agile metrics, etc. None of these have given business types their holy grail of a perfectly objective measure of productivity. But no one wants to accept an answer of "You just can't effectively measure productivity in software development" because we now live in a data-driven business culture where every little thing must be measured and quantified.

qsort · 5 months ago
I don't think AI lives up to the current hype but this article is garbage.

They're obviously talking about the METR paper, but the main takeaway according to the authors themselves was that self-reporting productivity increases is unreliable, not that you should cancel your subscription.

Nothing in that paper said that AI can't speed up software engineering.

Why are we responding to hype with nonsense?

didibus · 5 months ago
> Nothing in that paper said that AI can't speed up software engineering

I mean, the paper did provide tangible data that at least in their experiments, AI slowed down software engineering.

What they said is that it's not proof that there isn't a scenario or a mechanism where AI could result in speeding up software engineering. For that, more research would be needed to measure the productivity of AI in more varied contexts.

For me at least, their experiment seems to describe the average developer's use of AI. So it's probably telling you that currently on average AI might be slowing things down.

Now the question is, can we find good data of outliers, and is it a simple matter of figuring out how to use it effectively, so we can upskill people and get the average to now be faster. Or will the outlier be conditioned on like, only for newbies, only for prototypes, only for the first X weeks on a greenfield code base, etc.

Edit: That said, the most fascinating data point of that study is how software engineers are not able to determine if AI makes them faster or slower, because they all thought they were 20% faster but were 19% slower in reality. So now you have to become really skeptical of anyone who claims they found a methodology or a workflow where their use of AI makes them faster. We need better measurement than just "I feel faster".

nayshins · 5 months ago
I'm happy to let people think that AI does not yield productivity gains. There is no point engaging on this topic, so I will just outwork/outperform them.
quxbar · 5 months ago
I now have the pleasure of giving exercises to candidates where they are explicitly allowed to use any AI or autocomplete that they want, but it's one of those tricky real-world problems where you'll only get yourself into trouble if you only follow the model's suggestions. It really separates the builders from the bureaucrats far more effectively than seeing who can whiteboard or leetcode.
jamil7 · 5 months ago
It's kind of a trap. We allow people in interviews to do the same, and some of them waste more time accepting wrong LLM completions and then changing them than if they'd just written the code themselves.
pydry · 5 months ago
I've been doing this inadvertently for years by making tasks that were as realistic as possible - explicitly based upon the code the candidate will be working upon.

As it happens, this meant that when candidates started throwing AI at the task, instead of performing the magic it usually can when you make it build a todo app or solve some done-to-death irrelevant leetcode problem, it flailed and left the candidate feeling embarrassed.

I really hope AI signals the death knell of fucking stupid interview problems like leetcode. Alas many companies are instead knee jerking and "banning" AI from interview use instead (even claude, hilariously).

BoiledCabbage · 5 months ago
> but it's one of those tricky real-world problems where you'll only get yourself into trouble if you only follow the model's suggestions.

What's the goal of this? What are you looking for?

yomismoaqui · 5 months ago
That's really interesting... can you give more details about the problem you are using?

This sounds like there will be a race between this kind of booby trap test and AIs learning them.

elpakal · 5 months ago
Some code challenge platforms allow for seeing how often someone pasted things in. That's been interesting.
arealaccount · 5 months ago
Interesting, care to elaborate? Or is this a carefully guarded secret?
Lionga · 5 months ago
If you are so happy to let people think that AI does not yield productivity gains, why comment here?

How exactly did you outperform? Show, don't talk.

nayshins · 5 months ago
I rolled out a migration to 60+ backends by using Claude Code to manage it in the background. Simultaneously, I worked on other features while keeping my usual meeting load. I have more commits and releases per week than I have had in my whole career, which is objectively more productive.
haswell · 5 months ago
The issue I have with comments like this one is the one-dimensional notion of value described as "productivity gains" for a single person.

There are many things in this world that could be fairly described as "more productive" or "faster" than the norm, yet few people would argue that it makes those things a net benefit. You can lie and cheat your way to success, and that tends to be successful too. There are good reasons society frowns on this.

To me, focusing only on "I'm more productive" while ignoring the systemic and societal factors impacted by that "productivity" is completely missing the forest for the trees.

The fact that you further feel that there isn't even a point in engaging on the topic is disturbing considering those ignored factors.

troupo · 5 months ago
> I'm happy to let people think that AI does not yield productivity gains.

vs.

--- start quote ---

In a randomised controlled trial – the first of its kind – experienced computer programmers could use AI tools to help them write code.

--- end quote ---

Your quote is very representative of the magical wishful thinking most people have about AI: https://dmitriid.com/everything-around-llms-is-still-magical...

simonw · 5 months ago
"Your quote is very representative of the magical wishful thinking most people have about AI"

Your comment here is very representative of how quickly people who are AI skeptics will jump on anything that supports their skepticism.

refulgentis · 5 months ago
(not op)

Gosh, I was conflicted, then you pulled out that sentence and I was convinced. :)

Alternatively: When faced with a contradiction, first, check your premises.

I don't want to belabor the point too much, there's little common ground if we're at all-or-nothing thinking - "the study proved AI is net-negative because of this pull quote" isn't discussion.

pydry · 5 months ago
ive watched a lot of people code with cursor, etc. and i noticed that they seem to get a rush when it occasionally does something amazing that more than offsets their disappointment when it (more often) screws up.

the psychological effect reminds me a bit of slot machines, which provide you with enough intermittent wins to make you feel like you're winning while you're losing.

I think this might be linked to that study that found experienced oss devs who thought they were faster when they were in actual fact 19% slower.

hooverd · 5 months ago
Crazy how productivity gains just lead to more work for you.
stronglikedan · 5 months ago
Crazy how people would let their managers know they could get more done, instead of getting the same amount done quicker and having more free time.

nayshins · 5 months ago
more work is good for the soul... until it isn't
bluefirebrand · 5 months ago
This is actually worth talking about imo

There is nothing in it for me, if I am more productive but earn the same and don't get any more time off

Why should I bother at that point?

skeeter2020 · 5 months ago
>> so I will just outwork/outperform them.

actually based on your own admission this is not what you're doing...

throwawayqqq11 · 5 months ago
Green or naturally grown brown field projects?

People who boast about AI enhanced productivity always seem to forget to mention which.

thefz · 5 months ago
Until you don't have access to it and are outperformed by people used to thinking every day.
masfuerte · 5 months ago
Yet here you are, engaging.
nayshins · 5 months ago
the art of the bait
kubb · 5 months ago
Good luck getting paid more for your improved performance :D
some_random · 5 months ago
If they're right in their belief that AI usage leads to significantly more performance, their compensation is that they will keep their job.
stronglikedan · 5 months ago
No one gets paid more to get the same job done, unless you count free time as compensation.
nayshins · 5 months ago
I get paid a lot already.
archagon · 5 months ago
Uh, and who are you, exactly?
gishglish · 5 months ago
> so I will just outwork/outperform them.

At the game of producing garbage slop? Probably yeah.

spaceman_2020 · 5 months ago
All AI impact studies and research papers need to be taken with a pinch of salt. The field is moving so fast that by the time you get peer reviewed, you’re already outdated.

I’ve watched coding change from Cursor-esque IDEs to terminal based agentic tools within months.

djhn · 5 months ago
Which also means spending a lot of time learning AI tools gets wasted as the tooling improves.

I still suspect the vast silent majority of professional software devs haven’t integrated any, even Cursor-style, AI tools into their main gig.

And I reckon that’s completely rational, for those that have made this choice explicitly.

Early adopters of AI tools are making a speculative bet, but so far most of them seem happy with the return.

elicash · 5 months ago
I'm building things at a level of complexity I wouldn't have even attempted without AI.

This piece, however, only focuses on time spent on a task that could be done both ways. Even there, it falls short. Let's assume this study is correct and a specific coding task does take me 19% more time with AI. I can still be more productive because the AI doing some of the work allows me to do other tasks during that time.

I do worry about atrophy of my mind outsourcing too many tasks, admittedly. But that's a different issue.

hattmall · 5 months ago
There is really no disputing that AI creates some productivity gains, even if it's just an improved autocomplete, because autocomplete does create some productivity gains. Pushing further is murkier though. When it comes to the bulk of the type of work that AI is proving useful for, one of the main questions is why there is a need for speed. It's not about fear of automating jobs; it's just that generally we are reaching the bottlenecks more quickly and potentially causing more, but different, problems than we are solving.

It's similar to the story of the development of vehicles and how even though we move much faster we spend a greater amount of time in transit. My mom used to lament how annoying it was to have to drive to the grocery store because when she was younger and not everyone had cars the store came to you. Twice a day, in the morning and the evening the "rolling store" would drive through the neighborhood and if they didn't have what you needed right then, they would bring it on the next trip. We are finally coming back full circle with things like Instacart but it's taken a solid ~60 years of largely wasted inefficient travel times.

asdev · 5 months ago
I think AI greatly reduces the starting costs for mundane tasks/boilerplate, in exchange for a reduction in implementation velocity. So the feeling of being more productive is possibly an illusion. It could be that RPE (rate of perceived exertion) is lower when using AI for tasks, but raw throughput may be higher if programmers just do the jobs themselves and get into a productive/flow state.
thewebguyd · 5 months ago
> It could be that RPE (rate of perceived exertion) is lower when using AI for tasks, but raw throughput may be higher if programmers just do the jobs themselves and get into productive/flow state.

I think you're onto something with this take, based on my own experience. I definitely agree that my RPE seems lower when I'm using AI for things. Whether it actually is making me more productive or not over the long term remains to be seen, but things do certainly "feel" easier/less cognitively demanding. Which, tbh, is still a benefit even if it doesn't result in large gains in output. Putting in less cognitive load at work just conserves my energy for things that matter - everything else outside of $dayjob.

nromiun · 5 months ago
So much money is being pumped into this whole thing that we might get a global economic shock if/when it unravels.
kubb · 5 months ago
That won't happen - this money was stolen from the middle class via currency devaluation. Spending it on any economic activity, no matter how pointless, is actually better than just gobbling up assets.
nromiun · 5 months ago
An economic recession is not selective. It will affect all of us.