SirensOfTitan · 3 months ago
> Each of these 'phases' of LLM growth is unlocking a lot more developer productivity, for teams and developers that know how to harness it.

I still find myself incredibly skeptical LLM use is increasing productivity. Because AI reduces cognitive engagement with tasks, it feels to me like AI increases perceptive productivity but actually decreases it in many cases (and this probably compounds as AI-generated code piles up in a codebase, as there isn't an author who can attach context as to why decisions were made).

https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...

I realize the author qualified his or her statement with "know how to harness it," which feels like a cop-out I'm seeing an awful lot in recent explorations of AI's relationship with productivity. In my mind, like TikTok or online dating, AI is just another product motion toward comfort maximizing over all things, as cognitive engagement is difficult and not always pleasant. In a nutshell, it is another instant gratification product from tech.

That's not to say that I don't use AI, but I use it primarily as search to see what is out there. If I use it for coding at all, I tend to primarily use it for code review. Even when AI does a good job at implementation of a feature, unless I put in the cognitive engagement I typically put in during code review, its code feels alien to me and I feel uncomfortable merging it (and I employ similar levels of cognitive engagement during code reviews as I do while writing software).

enkrs · 3 months ago
I use LLMs (like claude-code and codex-cli) the same way accountants use calculators. Without one, you waste all your focus on adding numbers; with one, you just enter values and check if the result makes sense. Programming feels the same—without LLMs, I’m stuck on both big problems (architecture, performance) and small ones (variable names). With LLMs, I type what I want and get code back. I still think about whether it works long-term, but I don’t need to handle every little algorithm detail myself.

Of course there are going to be discussions about what is "real" programming (just as I'm sure there were discussions about what was "real" accounting with the advent of the calculator).

The moment we stop treating LLMs like people and see them as big calculators, it all clicks.

janalsncm · 3 months ago
The issue with your analogy is that calculators do not hallucinate. They do not make mistakes. An accountant is able to fully offload the mental overhead of arithmetic because the calculator is reliable.
MangoToupe · 3 months ago
I'm assuming, based on the granularity, that you're referring to autocomplete, and surely that already doesn't feel like dialup.
polotics · 3 months ago
My experience is exactly the opposite of "AI reduces cognitive engagement with tasks": I have to constantly be on my toes to follow what the LLMs are proposing and make sure they are not getting off track over-engineering things, or entering something that's likely to turn into a death loop several turns later. AI use definitely makes my brain run warmer, got to get a FLIR camera to prove it I guess...
walleeee · 3 months ago
So, reduces cognitive engagement with the actual task at hand, and forces a huge attention share to hand-holding.

I don't think you two are disagreeing.

I have noticed this personally. It's a lot like the fatigue one gets from too long scrolling online. Engagement is shallower but not any less mentally exhausting than reading a book. You end up feeling more exhausted due to the involuntary attention-scattering.

wildzzz · 3 months ago
I'm an electrical engineer. One of my jobs is maintaining a couple racks of equipment and the scripts we use to test hardware. I've never been expected to be a programmer beyond things like Matlab, but over the past several years I've been maintaining a python project we use to run these tests. With equipment upgrades and my amateur python skills, we now have a fully automated test: plug in the hardware, hit the green button, and wait for tests to complete and data to be validated. Codesurf absolutely chokes when trying to work on my code; it's just too much of a mess to handle.

But I have been using our in-house chatgpt to write some utilities that I've been procrastinating on for years. For example, I needed a debug tool to view live telemetry and send commands as required, and had been putting off writing it for ages. My existing scripts aren't flexible; they are literally just a script for the test runner to follow. I have an old debug tool, but it's not compatible with the existing workflow, so it's a pain to run. I told chatgpt what I needed, gave it some specs on the functions it would need from libraries I've written (but didn't want it to see), and it cranked out a perfectly functional python script. I ended up doing a bit of work on the script it gave me, since I didn't trust it completely or know if I could even get it to expand on the work properly. It would have taken me much longer to write on my own, so I'm very grateful I could save so much time.

Just last week, I had another idea for a different debug tool and did the same process (here's my idea, here's the specs, go), and after a few rounds of "can you add this next?", I had another quality tool ready to go with absolutely no touch-up work needed on my end. I want my tools to have simple Tkinter GUIs, but I hate writing GUIs, so I'm absolutely thrilled chatgpt can handle that for me.
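
For a sense of what that kind of tool looks like, here is a minimal sketch of a Tkinter telemetry viewer with a command box (hypothetical names and fake telemetry values; the real hardware interfaces aren't shown):

    import random
    import tkinter as tk

    POLL_MS = 500  # how often to refresh the telemetry readout

    def read_telemetry():
        # Stand-in for the real hardware library call; fabricates values
        # so the sketch runs on its own.
        return {"bus_voltage": round(random.uniform(11.5, 12.5), 2),
                "temp_c": round(random.uniform(20.0, 45.0), 1)}

    def send_command(cmd):
        # Stand-in for the real command interface.
        print("would send:", cmd)

    root = tk.Tk()
    root.title("Telemetry debug tool (sketch)")

    readout = tk.StringVar()
    tk.Label(root, textvariable=readout, font=("Courier", 12)).pack(padx=10, pady=10)

    cmd_entry = tk.Entry(root, width=40)
    cmd_entry.pack(padx=10)
    tk.Button(root, text="Send",
              command=lambda: send_command(cmd_entry.get())).pack(pady=5)

    def refresh():
        # Poll telemetry, update the label, and reschedule.
        values = read_telemetry()
        readout.set("  ".join(f"{k}={v}" for k, v in values.items()))
        root.after(POLL_MS, refresh)

    refresh()
    root.mainloop()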

I'm a bit of a luddite; I still just use notepad++ and a terminal window to develop my code. I don't want to get bogged down in using vscode, so trusting AI to handle things beyond "can you make this email sound better?" has been a big leap for me.

lkey · 3 months ago
Echoing walleeee,

This task: "I have to constantly be on my toes to follow what the LLMs are proposing"

and "understanding, then solving the specific problems you are being paid to solve"

are not the same thing. It's been linked endlessly here but Programming as Theory Building is as relevant today as it was in '85: https://pages.cs.wisc.edu/~remzi/Naur.pdf

add-sub-mul-div · 3 months ago
> I realize the author qualified his or her statement with "know how to harness it," which feels like a cop-out I'm seeing an awful lot in recent explorations of AI's relationship with productivity.

"You're doing AI wrong" is the new "you're doing agile wrong" which was the new "you're doing XP wrong".

pjmlp · 3 months ago
Unfortunately many of us are old enough to know how those "wrong" ways eventually became the new normal, the wrong way.
CuriouslyC · 3 months ago
At this point I don't even care to argue with people who disagree with me on this. History will decide the winner and if you think you're not on the losing side, best of luck.
SideburnsOfDoom · 3 months ago
You mean "It can’t be that stupid, you must be prompting it wrong"
bitwize · 3 months ago
More like the new "you're holding it wrong"
Art9681 · 3 months ago
It shifts your cognitive tasks to other things. Like every tool. The tool itself is an abstraction over tedium. We built it for a reason. You will spend less time thinking about some things, and more time thinking about others.

In that regard, nothing will change.

breakfastduck · 3 months ago
It depends what environment you're operating within.

I've used LLMs for code gen at work as well as for personal stuff.

At work, primarily for quick and dirty internal UIs / tools / CLIs, it's been fantastic, but we've not unleashed it on our core codebases. It's worth noting all the stuff we've got out of it is stuff we'd not normally have the time to work on - so a net positive there.

Outside of work I've built some bespoke software that's almost entirely generated, with human tweaks here and there - again, super useful software for me and some friends to use for planning and managing the music events we put on, which I'd never normally have the time to build.

So in those ways I see it as massively increasing productivity - to build lower stakes things that would normally just never get done due to lack of time.

phil21 · 3 months ago
I do wonder about the second order effects of the second bit.

A lot of open source tooling gets created to solve those random "silly" things that are personal annoyances or needs. Then you find out others have the same or similar problem and entire standard tooling or libraries come into existence.

I have pontificated on whether easy access to immediate "one-offs" will kill this idea exchange. Instead of one tool maintained by hundreds to fulfill a common need, we may end up with millions of one-off LLM-generated tools that are not shared with anyone else.

Might be a net win, or a net loss. I'm really not sure!

athrowaway3z · 3 months ago
Talking about a specific set-up they use isn't the goal of the post, so I don't think it's a cop out.

"How to harness it" is very clearly the difference between users right now, and I'd say we're currently bottom heavy with poor users stuck in a 'productivity illusion'.

But there is the question of "what is productivity?"

I'm finding myself (having AI) writing better structured docs and tests to make sure the AI can do what I ask it to.

I suspect that turns into compounding interest (or a lack of technical debt).

For an industry where devs have complained, for decades, about productivity metrics being extremely hard or outright bullshit, I see way too many of the same people now waving around studies regarding productivity.

vel0city · 3 months ago
Isn't most technology generally about "product motion toward comfort maximizing over all things"? Isn't a can opener "comfort maximizing"? Bicycles comfort-maximize over walking the same distance and leave more time for leisure. We use tractors to till the soil because tilling it by dragging plows by hand or by livestock was far less comfortable and left less time for other pursuits.

"AI" might be a good thing or a bad thing especially depending on the context and implementation, but just generally saying something is about maximizing comfort as some inherently bad goal seems off to me.

tptacek · 3 months ago
It depends a lot on how you use it, and how much effort you put into getting a knack for using it (which is rough because you're always worried that knack might be out of date within a month or two).

I use Claude, Codex, and the Gemini CLI (all "supervised command line agents"). I write Go. I am certain that agents improve my productivity in a few common scenarios:

1. Unsticking myself from stalling at the start of some big "lift" (like starting a new project, adding a major feature that will pull in some new dependency). The LLM can get things very wrong (this happens to me maybe 20% of the time), but that doesn't matter, because for me the motion of taking something wrong and making it "righter" is much less effort than getting past a blank page, assembling all the resources I need to know how to hello-world a new dependency, that kind of thing. Along with "wrestling for weeks with timing-dependent bugs", this kind of inertia is one of like two principal components of "effort" in programming for me, so clearing it is a very big deal. I'm left at a point where I'm actually jazzed about taking the wheel and delivering the functionality myself.

2. Large mechanical changes (like moving an interface from one component to another, or major non-architectural refactors, or instrumentation). Things where there's one meta-solution and it's going to be repeated many times across a codebase. Easy to review, a lot of typing, no more kidding myself for 20 minutes that I can make just the right Emacs macro to pull it off.

3. Bug hunting. To be clear: I am talking here about code I wrote, not the LLM. I run it, it does something stupid. My first instinct now is to drop into Claude or Gemini, paste the logs and an @-reference to a .go file as a starting point, and just say "wtf". My hit rate on it spotting things is very good. If the hit rate was even "meh borderline" that would be a huge win for the amount of effort it takes, but it isn't, it's much better.

I'm guessing a lot of people do not have these 3 experiences with LLM agents. I'm sorry! But I do think that if you stipulate that I'm not just making this up, it's hard to go from here to "I'm kidding myself about the value this is providing". Note that these are three cases that look nothing at all like vibe-coding.

defatigable · 3 months ago
This matches my experience exactly. #3 is the one I've found most surprising, and it can work outside the context of just analyzing your own code. For example I found a case where an automated system we use started failing due to syntax changes, despite no code changes on our part. I gave Claude the error message and the context that we had made no code changes, and it immediately and correctly identified the root cause as a version bump from an unpinned dependency (whoops) that introduced breaking syntax changes. The version bump had happened four hours prior.

Could I have found this bug as quickly as Claude? Sure, in retrospect the cause seems quite obvious. But I could just as easily have rabbit-holed myself looking somewhere else, or taken a while to figure out exactly which dependency caused the issue.

It's definitely the case that you cannot blindly accept the LLM's output, you have to treat it as a partner and often guide it towards better solutions. But it absolutely can improve your productivity.

lupusreal · 3 months ago
I agree with these points and would add another: writing code that I don't strictly need and would normally avoid writing at all, simply because it's easier to take the lazy route and do without it. This morning while on the shitter, I had Claude Code use dbus to interface with tumbler for generating thumbnails. My application doesn't need thumbnails, and normally my reaction would be "Dbus? Eww. I'll just do that later (never)." But five minutes of Claude Code churning got the job done.
zaptheimpaler · 3 months ago
My experience with getting it to write code has not been good so far. Today I had a pretty mechanical task in Go. Basically some but not all functions from a package moved into another package so I asked Gemini to just change instances of pkg.Foo to pkgnew.Foo and import pkgnew in those files. It just got stuck on one file for 2 minutes and at that point I was already halfway through find/replacing it on my own.

For me it's been somewhat useful to ask questions but always fails at modifying any code in a big codebase.

naasking · 3 months ago
> 3. Bug hunting.

Agreed, and also configuration debugging. Treat the LLM as interactive documentation for libraries, frameworks, etc. that you may be using. They're great for solving problems in getting systems up and running, and they explain things better than the documentation because it's specific to your scenario.

theshrike79 · 3 months ago
With LLM assistance I managed to pinpoint an esoteric Unity issue to within 5 lines of code.

I've had one 3-day basic course in Unity, but I know how to prompt and guide an AI.

srcreigh · 3 months ago
> We do not provide evidence that:

> AI systems do not currently speed up many or most software developers

> AI systems in the near future will not speed up developers in our exact setting

> There are not ways of using existing AI systems more effectively to achieve positive speedup in our exact setting

zahlman · 3 months ago
> increases perceptive productivity

increases the perception of productivity?

Fire-Dragon-DoL · 3 months ago
I agree with you but have no metrics yet. It does help as a rubber duck though
dist-epoch · 3 months ago
> AI is just another product motion toward comfort maximizing over all things, as cognitive engagement is difficult and not always pleasant. In a nutshell, it is another instant gratification product from tech.

For me it's the exact opposite. When not using AI, you notice various things that could be improved while coding, and you can think about the architecture and what features you want next.

But AI codes so fast that it's a real struggle keeping up with it. I feel like I need to focus 10 times harder to think about features/architecture fast enough that the AI isn't waiting on me most of the time.

protocolture · 3 months ago
>Because AI reduces cognitive engagement with tasks

Just a sideways thing. Cognitive offloading is something humans do with each other quite a lot. People offload onto colleagues, underlings and spouses all the time.

People engage with AI through the prism of reducing engagement with the thing they don't like, and increasing engagement with the thing they do like.

It isn't a straight-up productivity boost; it's more like going from being a screenwriter to a director.

naasking · 3 months ago
> and this probably compounds as AI-generated code piles up in a codebase, as there isn't an author who can attach context as to why decisions were made

I don't see why. If anything there's more opportunity for documentation because the code was generated from a natural language prompt that documents exactly what was generated, and often why. Recording prompts in source control and linked to the generated code is probably the way to go here.
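
A hedged sketch of what that convention could look like (purely illustrative, not an established practice): keep the prompt verbatim next to the code it produced, e.g. as a comment block, so the "why" survives in the repo:

    # Generated from the prompt below; kept verbatim so a future reader knows
    # what was asked for and why (illustrative convention, hypothetical code).
    #
    # Prompt:
    #   "Write a dependency-free helper that deduplicates a list of file paths
    #    while preserving their original order."
    def dedupe_paths(paths):
        seen = set()
        ordered = []
        for p in paths:
            if p not in seen:
                seen.add(p)
                ordered.append(p)
        return ordered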

bryanlarsen · 3 months ago
Juggling multiple contexts is hard, and often counterproductive. It used to be popular on HN to talk about keeping your "flow", and railing against everything that broke a programmer's flow. These slow AIs constantly break flow.
solumunus · 3 months ago
The new flow is cycling between 5 concurrent agent sessions while watching YouTube.
theshrike79 · 3 months ago
And my ADHD brain loves it :D

Except no YouTube, I just watch my shows off Plex.

handfuloflight · 3 months ago
as they say, "few"
hamdingers · 3 months ago
The ideal for me as a flow-seeker is the quick-edit feature of Void and presumably other editors. It saves me the context switching to google or docs to find the right syntax and method names for what I want to express, without requiring I waste my time waiting for the LLM to figure out what I already know (the what and the where).
bryanlarsen · 3 months ago
The autocomplete provided by your LSP server to your editor does a much better job of this than LLMs do, in my opinion.
wahnfrieden · 3 months ago
Flow sacredness made sense when we could only do our work serially and juggling tasks just meant switching around one thing at a time. Now we can parallelize more of our activities, so it's time to reevaluate the old wisdom.
zahlman · 3 months ago
I disagree. Nothing has changed about how human brains work. Try to read something while listening to something else at the same time, and see how much you absorb, for example. We can still, in large part, only do our work serially.
ryanmcgarvey · 3 months ago
The only reason I want these things to be any smarter is because I need them to do more work over longer periods without screwing up. The only reason I need them to do more work over long periods is because they are too slow to properly pair with.

If I could have it read more of my project in a single gulp and produce the 10-1000 lines of code I want in a few seconds, I wouldn't need it to go off and write the thousands of lines on its own in the background. But because even trivial changes can take minutes by the time it slurps up the right context and futzes with the linter and types, that ideal pair programmer loop is less attractive.

joz1-k · 3 months ago
From the article: Anthropic has been suffering from pretty terrible reliability problems.

In the past, factories used to shut down when there was a shortage of coal for steam engines or when the electricity supply failed. In the future, programmers will have factory holidays when their AI-coding language model is down.

corentin88 · 3 months ago
Same as how GitHub or Slack downtime severely impacts productivity.
thw_9a83c · 3 months ago
I would argue that dependency on GitHub and Slack is not the same as dependency on AI coding agents. GitHub/Slack are just straightforward tools. You can run them locally or have similar emergency backup tools ready to run locally. But depending on AI agents is like relying on external brains that have knowledge you suddenly don't have if they disappear. Moreover, how many companies could afford to run these models locally? Some of those models aren't even open.
bongodongobob · 3 months ago
This joke is as old as typewriters.
catigula · 3 months ago
>in the future

>programmers

Don't Look Up

ActionHank · 3 months ago
There are still people who dictate their emails to a secretary.

Technology changes, people often don't.

Programmers will be around for a longer time than anyone realises because most people don't understand how the magic box works let alone the arcane magics that run on it.

infecto · 3 months ago
Cursor imo is still one of the only real players in the space. I don’t like the claude code style of coding, I feel too disconnected. Cursor is the right balance for me and it is generally pretty darn quick and I only expect it to get quicker. I hope there are more players that pop up in this space.
solumunus · 3 months ago
Wild to me. When I switched from Cursor to Claude it only took me a day to realise that as things stand I would never use Cursor again.
infecto · 3 months ago
We will all have different experiences and workflows, but I am not sure why it's wild. For myself, I find tools like Claude Code or Codex have a place, but not as tools I use interactively. They are both too slow in the feedback loop and so verbose that it's hard, at least for me, to establish a good cadence for writing code.
vinnymac · 3 months ago
I am also surprised by this. Especially because we can just run Claude Code from anywhere. Cursor, VS Code, Zed, emacs, vim, JetBrains, etc.

Cursor CLI, Codex and Gemini work too, but lag slightly behind in a variety of different ways that matter.

And if you think you’re getting better visual feedback through Cursor it’s likely you’re just not using Claude with your IDE correctly.

sealeck · 3 months ago
Have you tried https://zed.dev ?
dmix · 3 months ago
How is the pricing? I see it says "500 prompts a month" and only Claude. Cursor is built around token usage and distributes requests across multiple models when you hit limits on one, which turns out to be pretty economical.
infecto · 3 months ago
Yea, and for some reason it was not my cup of tea. I think partly because their paid version feels like an afterthought.
CuriouslyC · 3 months ago
You're going to have to get used to feeling disconnected if you want to stay in the game, that's the direction this is heading (and fast). You need to move up the ladder.

Also, cursor is both overpriced and very mediocre as an agent. Codex is the way to go.

gngoo · 3 months ago
That is just a personal opinion, not a fact. Either option can be faster or more productive if it suits your personal coding style. I work with both, I also favor one. But money is not exactly an issue.
infecto · 3 months ago
I get you have a financial incentive to say that but at least back it up. I do believe using ai tooling is here and now and a worthwhile endeavor but in my view we have not settled best practices yet and it depends on the individual preferences right now.

Tools are for us to figure out what works and what does not. Saying "be prepared to feel disconnected" sounds like slop from someone getting forced into someone else's idea.

If someone has a great workflow using a tool like codex, that's great, but it does not mean it has to work for me. I love using codex for code reviews, testing, and other changes that are independent of each other, like bugs. I don't like using it for feature work; I have spent years building software and I am not going to twiddle my thumbs waiting for codex on something I am building in real time. Now I think there is an argument that if you have the perfect blueprint of what to build, you could leverage a tool like codex, but I am often not in that position.

mmmllm · 3 months ago
Speed is not a problem for me. I feel they are at the right speed now where I am able to see what it is doing in real time and check it's on the right track.

Honestly if it were any faster I would want a feature to slow it down, as I often intervene if it's going in the wrong direction.

howmayiannoyyou · 3 months ago
I expected to see OpenAI, Google, Anthropic, etc. provide desktop applications with integrated local utility models and sandboxed MCP functionality to reduce unnecessary token and task flow, and I still expect this to occur at some point.

The biggest long-term risk to the AI giants' profitability will be increasingly capable desktop GPU and CPU hardware combined with the improving performance of local models.

mordymoop · 3 months ago
From experience it seems like offloading context-scoping and routing decisions to smaller models just results in those models making bad judgements at a very high speed.

Whenever I experiment with agent frameworks that spawn subagents with scoped subtasks and restricted context, things go off the rails very quickly. A subagent with reduced context makes poorer choices, hallucinates assumptions about the greater codebase, and very often lacks a basic sense of the point of the work. This lack of situational awareness is where you are most likely to encounter js scripts suddenly appearing in your Python repo.

I don’t know if there is a “fix” for this or if I even want one. Perhaps the solution, in the limit, actually will be to just make the big-smart models faster and faster, so they can chew on the biggest and most comprehensive context possible, and use those exclusively.

eta: The big models have gotten better and better at longer-running tasks because they are less likely to make a stupid mistake that derails the work at any given moment. More nines of reliability, etc. By introducing dumber models into this workflow, and restricting the context that you feed to the big models, you are pushing things back in the wrong direction.
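
A toy sketch of that failure mode (hypothetical orchestrator code; call_model stands in for whatever backend is used): the subagent only ever sees the slice of context the orchestrator chose to pass, so anything trimmed out simply doesn't exist for it:

    def call_model(prompt):
        # Stand-in for a real LLM call; the point is what the prompt contains.
        return "<completion for: " + prompt[:60] + "...>"

    def run_subagent(task, full_context, keep_keys):
        # The orchestrator trims context before delegating. Anything not in
        # keep_keys ("language", "style_guide", ...) is invisible to the
        # subagent, which is how a js helper ends up in a Python repo.
        scoped = {k: v for k, v in full_context.items() if k in keep_keys}
        return call_model(f"Task: {task}\nContext: {scoped}")

    context = {
        "language": "python",
        "style_guide": "pep8",
        "repo_summary": "test-automation scripts for lab hardware",
    }
    print(run_subagent("add a log-rotation helper", context,
                       keep_keys=["repo_summary"]))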

casey2 · 3 months ago
Yup. I expected a google LLM to coordinate with many local expert LLMs with knowledge of local tools and other domain expert LLMs in the cloud.

I guess they don't see a viable path forward without specialty hardware.

mark_l_watson · 3 months ago
I rate the value of LLM-based AI on how it improves me personally. Positive experiences include being able to more easily read scientific papers, with an AI filling in details for embedded math, and vibe coding sessions after which I feel like I understand the problem better because I was engaged with the process and I understand the resulting code.

Negative experiences include times when I am lazy, turn off my brain, and just accept LLM output. I am also a little skeptical about automating email handling, etc.: it's cool technology, but how useful is it, really? I can imagine insiders talking among themselves saying "feel the bubble!" but when talking with reporters they talk like "feel the AGI" or "oh no, our AI tech is so strong it will take over the world" - excellent strategies for pumping stock prices and valuations.