Readit News
kache_ commented on Claude Code is being dumbed down?   symmetrybreak.ing/blog/cl... · Posted by u/WXLCKNO
bcherny · a month ago
Hey, Boris from the Claude Code team here. I wanted to take a sec to explain the context for this change.

One of the hard things about building a product on an LLM is that the model frequently changes underneath you. Since we introduced Claude Code almost a year ago, Claude has gotten more intelligent, it runs for longer periods of time, and it can use more tools, more agentically. This is one of the magical things about building on models, and also one of the things that makes it very hard. There's always a feeling that the model is outpacing what any given product is able to offer (i.e., product overhang). We try very hard to keep up, and to deliver a UX that lets people experience the model in a way that is raw and low level, and maximally useful at the same time.

In particular, as agent trajectories get longer, the average conversation has more and more tool calls. When we released Claude Code, Sonnet 3.5 was able to run unattended for less than 30 seconds at a time before going off the rails; now, Opus 4.6 1-shots much of my code, often running for minutes, hours, and days at a time.

The amount of output this generates can quickly become overwhelming in a terminal, and is something we hear often from users. Terminals give us relatively few pixels to play with; they have a single font size; colors are not uniformly supported; in some terminal emulators, rendering is extremely slow. We want to make sure every user has a good experience, no matter what terminal they are using. This is important to us, because we want Claude Code to work everywhere, on any terminal, any OS, any environment.
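One concrete example of the uneven terminal support mentioned above is color: there is no single reliable way to know what a terminal can render. A common heuristic (this is a generic convention, not Claude Code's actual detection logic) is to check the `COLORTERM` environment variable, which many emulators set when they support 24-bit color:

```python
import os

def supports_truecolor(env=os.environ) -> bool:
    # Many terminal emulators advertise 24-bit color support by setting
    # COLORTERM to "truecolor" or "24bit". Absence doesn't prove lack of
    # support, which is why robust tools also fall back to terminfo/TERM.
    return env.get("COLORTERM", "") in ("truecolor", "24bit")
```

This is only a first-pass check; real terminal apps typically combine it with `TERM`/terminfo lookups and a conservative fallback palette.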

Users give the model a prompt, and don't want to drown in a sea of log output in order to pick out what matters: specific tool calls, file edits, and so on, depending on the use case. From a design POV, this is a balance: we want to show you the most relevant information, while giving you a way to see more details when useful (i.e., progressive disclosure). Over time, as the model continues to get more capable -- so trajectories become more correct on average -- and as conversations become even longer, we need to manage the amount of information we present in the default view to keep it from feeling overwhelming.
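The progressive-disclosure idea described above can be sketched in a few lines. This is a minimal illustration, not Claude Code's actual implementation: show the first few lines of a tool's output and summarize the rest behind an expand hint.

```python
def collapse_output(output: str, max_lines: int = 5) -> str:
    """Collapse long tool output: keep the first `max_lines` lines,
    and replace the rest with a count plus an expand hint."""
    lines = output.splitlines()
    if len(lines) <= max_lines:
        return output  # short output is shown in full
    hidden = len(lines) - max_lines
    return "\n".join(lines[:max_lines]) + f"\n… +{hidden} more lines (expand to view)"
```

The default view stays compact regardless of how many tool calls a long trajectory produces, while the full output remains one keypress away.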

When we started Claude Code, it was just a few of us using it. Now, a large number of engineers rely on Claude Code to get their work done every day. We can no longer design for ourselves, and we rely heavily on community feedback to co-design the right experience. We cannot build the right things without that feedback. Yoshi rightly called out that often this iteration happens in the open. In this case in particular, we approached it intentionally, and dogfooded it internally for over a month to get the UX just right before releasing it; this resulted in an experience that most users preferred.

But we missed the mark for a subset of our users. To improve it, I went back and forth in the issue thread to understand what problems people were hitting with the new design, and shipped multiple rounds of changes to arrive at a good UX. We've built in the open in this way before, e.g., when we iterated on the spinner UX, the todos tool UX, and many other areas. We always want to hear from users so that we can make the product better.

The specific remaining issue Yoshi called out is reasonable. PR incoming in the next release to improve subagent output (I should have responded to the issue earlier, that's my miss).

Yoshi and others -- please keep the feedback coming. We want to hear it, and we genuinely want to improve the product in a way that gives great defaults for the majority of users, while being extremely hackable and customizable for everyone else.

kache_ · a month ago
Hey, It's Damage Control person from Corporate Revenue Maximizing Team here, <5 paragraphs>
kache_ commented on Moondream 3 Preview: Frontier-level reasoning at a blazing speed   moondream.ai/blog/moondre... · Posted by u/kristianp
kache_ · 5 months ago
it's honestly really good. Big fan of that team, they are really practical and have been producing really useful software and sharing all their learnings online.
kache_ commented on Avante.nvim: Use Your Neovim Like Using Cursor AI IDE   github.com/yetone/avante.... · Posted by u/simonpure
kache_ · 2 years ago
the best part about this is that you can just change the extension. like you are actually allowed to. whereas the extension experience on vscode would require a reload, and on cursor is not possible
kache_ commented on Large language models are having their Stable Diffusion moment   simonwillison.net/2023/Ma... · Posted by u/simonw
pedrovhb · 3 years ago
One thing I think will be different and that had totally escaped my radar until recently is just the enormous and diverse community that has been developing around Stable Diffusion, which I think will be less likely to form with language models.

I just recently tried out one of the most popular [0] Stable Diffusion WebUIs locally, and I'm positively surprised at how different it is from the rest of the space around ML research/computing. I consider myself to be a competent software engineer, but I still often find it pretty tricky to get e.g. HuggingFace models running and doing what I envision them to do. SpeechT5 for instance is reported to do voice transformations, but it took me a good bit of time and hair-pulling to figure out how to extract voice embeddings from .wav files. I'm sure the way to do this is obvious to most researchers, maybe to the point of feeling like it needs no mention in the documentation, but it certainly wasn't clear to me.

The community around Stable Diffusion is much more inclusive, though. Tools make the extra effort to be easy to use, and documentation for community-created models/scripts/tools is so accessible as to be perfectly usable by a non-technical user who is willing to adventure a little bit into the world of hardcore computing by following instructions. Sure, nothing is too polished and you often get the feeling that it's "an ugly thing, but an ugly thing that works", but the point is that it's incredibly accessible. People get to actually use these models to build their stories, fantasy worlds, to work, and things get progressively more impressive as the community builds upon itself (I loved the style of [1] and even effortlessly merged its style with another one in the WebUI, and ControlNet [2] is amazing and gives me ideas for integrating my photography with AI).
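The style-merging mentioned in that comment (the WebUI's checkpoint merger) boils down to a simple technique: a weighted interpolation of two models' parameters, key by key. A minimal sketch with toy numpy arrays standing in for real checkpoint tensors:

```python
import numpy as np

def merge_checkpoints(ckpt_a: dict, ckpt_b: dict, alpha: float = 0.5) -> dict:
    """Weighted-sum merge of two state dicts with matching keys and shapes:
    result[k] = alpha * a[k] + (1 - alpha) * b[k]."""
    assert ckpt_a.keys() == ckpt_b.keys(), "checkpoints must share parameters"
    return {k: alpha * ckpt_a[k] + (1.0 - alpha) * ckpt_b[k] for k in ckpt_a}
```

Real mergers offer more modes (e.g. "add difference"), but this weighted sum is the basic operation, and it only works sensibly between fine-tunes of the same base architecture.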

I think the general interest in creating images is larger than for LLMs with their current limitations (especially in current consumer-available hardware). I do wonder how much this community interest will boost the spaces in the longer run, but right now I can't help but be impressed by the difference in usability and collaborative development between image generative and other types of models.

[0] https://github.com/AUTOMATIC1111/stable-diffusion-webui

[1] https://civitai.com/models/4998/vivid-watercolors

[2] https://github.com/Mikubill/sd-webui-controlnet

kache_ · 3 years ago
Did you know that AUTOMATIC1111 got bootstrapped off of 4chan?

Go to 4chan right now, and poke around their technology and video game boards. There's so much chatter about LLaMa. The last time I saw that much chatter about a technology was when eth was 3 dollars a coin. The communities exist, the general public just isn't aware of them.

kache_ commented on Ask HN: How do ADHD people cope on here?    · Posted by u/WhackyIdeas
kache_ · 3 years ago
i don't, but it all seems to work out anyways
kache_ commented on Ask HN: Math books that made you significantly better at math?    · Posted by u/optbuild
kache_ · 3 years ago
linear algebra done right - sheldon axler

kache_ commented on Apple Silicon Mac’s have 2-3 times longer battery than PC laptops   youtube.com/watch?v=P0h8q... · Posted by u/retskrad
diimdeep · 3 years ago
Apple deserve more market share.

"People who are really serious about software should make their own hardware" https://www.youtube.com/watch?v=XAfTXYa36f4

kache_ · 3 years ago
maybe once they allow me to actually own my hardware (let me install whatever software I want (linux))
kache_ commented on GitHub is sued, and we may learn something about Creative Commons licensing   scholarlykitchen.sspnet.o... · Posted by u/doener
mattstir · 3 years ago
Interestingly, the same companies who made the paid service by analyzing all of that open-source code would never, ever consider open-sourcing the code for that service.

> if you don't want people to learn off your code

"Learn" is a strange verb to use here. No one at Microsoft or OpenAI was scraping all of GitHub so that they could learn. They took people's licensed works, fed it into a very sophisticated copy-paste machine, and started making money off of it.

> just don't share it!

It's almost like licenses and copyright exist to protect the rights of their holders or something.

The entire point of licenses is to be able to share your work in a way that respects your wishes. "Just don't share it" is completely non-productive.

kache_ · 3 years ago
the cost of training is exponentially decreasing. It's only a matter of time before a codex-like model is released to the open.
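The claim above is easy to make concrete with back-of-envelope arithmetic. Assuming costs halve every H months (the halving time here is a hypothetical rate chosen for illustration, not a measured one):

```python
def cost_after(initial_cost: float, months: float, halving_months: float) -> float:
    """Cost of an equivalent training run after `months`,
    assuming cost halves every `halving_months`."""
    return initial_cost * 0.5 ** (months / halving_months)

# Hypothetical example: a $10M training run with a 12-month halving time
# costs 10e6 * 0.5**5 = $312,500 after five years.
```

Under that (assumed) curve, a run that only a large lab can afford today lands in startup or hobbyist territory within a handful of halvings.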

u/kache_

Karma: 2063 · Cake day: August 5, 2018
About
Love programming? Yeah nervous laughter haha I love programming

Anyways, check out this new boat I got!
