steinnes commented on Claude Code is being dumbed down?   symmetrybreak.ing/blog/cl... · Posted by u/WXLCKNO
bcherny · a day ago
Hey, Boris from the Claude Code team here. I wanted to take a sec to explain the context for this change.

One of the hard things about building a product on an LLM is that the model frequently changes underneath you. Since we introduced Claude Code almost a year ago, Claude has gotten more intelligent, it runs for longer periods of time, and it is able to use more tools more agentically. This is one of the magical things about building on models, and also one of the things that makes it very hard. There's always a feeling that the model is outpacing what any given product is able to offer (i.e. product overhang). We try very hard to keep up, and to deliver a UX that lets people experience the model in a way that is raw and low level, and maximally useful at the same time.

In particular, as agent trajectories get longer, the average conversation has more and more tool calls. When we released Claude Code, Sonnet 3.5 was able to run unattended for less than 30 seconds at a time before going off the rails; now, Opus 4.6 1-shots much of my code, often running for minutes, hours, and days at a time.

The amount of output this generates can quickly become overwhelming in a terminal, and is something we hear often from users. Terminals give us relatively few pixels to play with; they have a single font size; colors are not uniformly supported; in some terminal emulators, rendering is extremely slow. We want to make sure every user has a good experience, no matter what terminal they are using. This is important to us, because we want Claude Code to work everywhere, on any terminal, any OS, any environment.

Users give the model a prompt, and don't want to drown in a sea of log output in order to pick out what matters: specific tool calls, file edits, and so on, depending on the use case. From a design POV, this is a balance: we want to show you the most relevant information, while giving you a way to see more details when useful (ie. progressive disclosure). Over time, as the model continues to get more capable -- so trajectories become more correct on average -- and as conversations become even longer, we need to manage the amount of information we present in the default view to keep it from feeling overwhelming.

When we started Claude Code, it was just a few of us using it. Now, a large number of engineers rely on Claude Code to get their work done every day. We can no longer design for ourselves, and we rely heavily on community feedback to co-design the right experience. We cannot build the right things without that feedback. Yoshi rightly called out that often this iteration happens in the open. In this case in particular, we approached it intentionally, and dogfooded it internally for over a month to get the UX just right before releasing it; this resulted in an experience that most users preferred.

But we missed the mark for a subset of our users. To improve it, I went back and forth in the issue to understand what problems people were hitting with the new design, and shipped multiple rounds of changes to arrive at a good UX. We've built in the open in this way before, e.g. when we iterated on the spinner UX, the todos tool UX, and many other areas. We always want to hear from users so that we can make the product better.

The specific remaining issue Yoshi called out is reasonable. PR incoming in the next release to improve subagent output (I should have responded to the issue earlier, that's my miss).

Yoshi and others -- please keep the feedback coming. We want to hear it, and we genuinely want to improve the product in a way that gives great defaults for the majority of users, while being extremely hackable and customizable for everyone else.

steinnes · a day ago
I can’t count how many times I benefitted from seeing the files Claude was reading, to understand how I could interrupt and give it a little more context… saving thousands of tokens and sparing the context window. I must be in the minority of users who preferred seeing the actual files. I love Claude Code, but some of the recent updates seem like they’re making it harder for me to see what’s happening. I agree with the author that verbose mode isn’t the answer. Seems to me this should be configurable.
steinnes commented on Calculus Made Easy (1910)   calculusmadeeasy.org/... · Posted by u/fortran77
cmpb · 5 years ago
I was a math tutor in college, and many of the tutees would come into the tutor lab completely downtrodden, dreading their homework. I found the most helpful thing I could do with them was to focus more on getting them to understand the motivation behind what they had gone over in class that day, and in particular saying things like "now why would we do this" as soon as I could see in their face that it was clicking. Getting them to understand the why (and getting them to feel like they understood the why) was incredibly effective for helping them feel less insecure when going to class.
steinnes · 5 years ago
When I was taking calculus classes, and struggling, a tutor helped me immensely in the exact same way; explaining both practical applications and often the historical context around the method he was teaching me.
steinnes commented on IBM acquires Red Hat   redhat.com/en/blog/red-ha... · Posted by u/nopriorarrests
skybrian · 7 years ago
This is rather normal, not pathological. Most companies like to file patents, definitely including startups. Where do you think their patents come from, if not from work their employees do? It's work that's worth paying for.

Having some proprietary code doesn't prevent companies from also making substantial contributions to open source in other areas.

steinnes · 7 years ago

  Having some proprietary code doesn't prevent companies from also making substantial contributions to open source in other areas.

... yes, but patents are not "proprietary code", in which case granting copyright is enough. If a private company contributes code I think it could even grant the copyright to an open organization, but still decide later to sue if the method/feature/function/etc is patented by them.

Not that I'd foresee IBM or any of the real players in the IT space attempting that anymore, I think most realize that alienating the F/OSS community isn't a viable strategy for a software services company in the long term.

steinnes commented on Containers vs. Zones vs. Jails vs. VMs   blog.jessfraz.com/post/co... · Posted by u/adamnemecek
dkersten · 9 years ago
Is rebuilding and redeploying a container really any different from rebuilding and redeploying statically linked binaries?
steinnes · 9 years ago
For a lot of applications: no, it's very similar, and if you have a language that can be easily statically compiled to a binary which is free of external dependencies and independently testable, and you've set up a build-test-deployment pipeline relying on that, then perhaps in your case containers are a solution in search of a problem :-)

But there are more benefits like Jessie touches upon in her blog post, wrt flexibility and patterns you can use with multiple containers sharing some namespaces, etc. And from the perspective of languages that do not compile to a native binary the containers offer a uniform way to package and deploy an application.

When I was at QuizUp and we decided to switch our deployment units to Docker containers, we had been deploying using custom-baked VMs (AMIs). When we first started doing that it was due to our immutable infrastructure philosophy, but soon it became a relied-upon and necessary abstraction to homogeneously deploy services whether they were written in Python, Java, Scala, Go, or C++.

Using Docker containers allowed us to keep that level of abstraction while reducing overheads significantly, and because containers are easy to start and run anywhere, we became more infrastructure agnostic at the same time.

steinnes commented on Why FOSS mobile communication matters   medium.com/@zecke/why-fos... · Posted by u/zecke
steinnes · 9 years ago
Compelling arguments for why next generation wireless networks should be standardised by a more open and inclusive body?
steinnes commented on Snap Inc. S-1   sec.gov/Archives/edgar/da... · Posted by u/harryh
uppercasenut · 9 years ago
Why take the chance when you're about to have a lot of IPO cash? Go with Google for a few years until you (hopefully) become FB. Right now they'll focus on features, signing advertisers and users.
steinnes · 9 years ago
Absolutely.

I imagine they had a very strong bargaining position, being willing to commit this much for this long. Even if Snapchat wasn't a valuable brand for Google to brag about, this amounts to ~10% of the yearly revenue [1] of Google's cloud business (SaaS and IaaS, of which I suspect SaaS like Apps for Work is the lion's share).

I'm sure they're getting a good deal, and can focus on features and getting their platform profitable, as you said.

1. "..at that pace, Google’s cloud could generate $4.1 billion in revenue in 2016" http://www.networkworld.com/article/3029164/cloud-computing/...
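
The ~10% figure can be sanity-checked with a rough back-of-the-envelope calculation. The sketch below assumes the commitment is $2 billion over five years (the figure widely reported from the S-1, not stated in this thread) and uses the $4.1 billion revenue estimate quoted in [1]. The variable names are purely illustrative.

```python
# Rough check of the "~10%" estimate.
# ASSUMPTION: a $2B commitment spread evenly over 5 years (not stated in the comment).
commitment_total_usd = 2.0e9
commitment_years = 5

# From the Network World estimate quoted in footnote [1].
google_cloud_revenue_2016_usd = 4.1e9

annual_spend = commitment_total_usd / commitment_years
share = annual_spend / google_cloud_revenue_2016_usd

print(f"Annual spend: ${annual_spend / 1e6:.0f}M")
print(f"Share of estimated 2016 cloud revenue: {share:.1%}")
```

Under those assumptions the annual spend is about $400M, just under a tenth of the estimated revenue, which is consistent with the ~10% above.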

steinnes commented on Show HN: Jet – Codeship’s CI Platform for Docker   codeship.com?utm_source=j... · Posted by u/moritzplassnig
efficacy · 10 years ago
I love the idea of this, and I have been spending a lot of time recently with docker compose and jet. The problem I keep coming up against is the lack of documentation for jet. It uses a similar-but-different approach to docker compose. Docker compose has tons of stuff available, but is focussed strongly on describing a complete live system rather than setting up for CI. Unfortunately, so far I have been completely unable to find any detailed documentation for jet other than a few examples which don't do quite what I need, so fussing with jet's yml files takes much longer and is very frustrating.

I know the product is quite new, and I'm willing to cut it a bit of slack. I'm also willing to contribute to a beta program or write documentation, blog posts etc. to help others if there's any way I could get access to more detailed information (source code with an NDA if that's all you have)

Can anyone help?

steinnes · 10 years ago
I'm having similar feelings. I wish Jet was better documented. I hope the Codeship team writes some concise documentation for it, but I wonder if it's against their best interest, because it would enable more tech-savvy customers to run locally the same CI infrastructure that Codeship charges for in its service?
steinnes commented on Glassdoor: Airbnb dethrones Google as the best tech company to work for in U.S   venturebeat.com/2015/12/0... · Posted by u/aritraghosh007
steinnes · 10 years ago
It's funny how Amazon isn't mentioned in the article as one of the tech companies that should be vying for top position.

Although I suppose in raw numbers most of their staff are not engineers, so maybe it wouldn't be a fair comparison.

steinnes commented on Facebook Relay: An Evil And/Or Incompetent Attack on REST   pandastrike.com/posts/201... · Posted by u/mwcampbell
devtique · 10 years ago
REST is one of the worst tech religions ever created yielding the most blinkered zealots. It's treated like a Bible where every word is taken as the Gospel truth that can't be tested, validated, compared or improved upon. Any alternative technology that reduces latency, improves performance and end user experience is considered an evil intrusion invalidating the purity of REST and must be vanquished.

In the name of REST, practitioners give themselves a free ticket to develop large, over-architected, dumb chatty high-latency solutions at the expense of the end user, as long as tech choices are made within their interpretation of REST. Normally technology serves the client, unless you're a REST zealot, in which case the needs of the client are secondary; it's more important to obtain Internet kudos points by forcing your way up the maturity ladder.

No, we must develop and shoe-horn all App and User experiences within the constraints of an ambiguous thesis that was built to link and update documents and create server-driven turn-by-turn apps. The fact that they can't agree amongst themselves on what different parts of REST mean has generated programmer-decades worth of wasted discussions in the most useless bikeshed ever.

steinnes · 10 years ago
Well said. My blood boils thinking back to all the times I've heard "Yeah, but that's not REST." as a response to a perfectly reasonable and sane idea/suggestion.

u/steinnes

Karma: 230
Cake day: March 25, 2011
About
Software developer from Iceland, I enjoy bits and bytes.