Claude Sonnet 5 Is Imminent – and It Could Be a Generation Ahead of Google

pllbnk · 9 days ago

Depends on what counts as a ‘generation’ for LLMs. It would be weird to build a model which is a generation behind. My guess is that, like all models, it will be considered the best until the novelty factor wears off, and then it will be more or less the same as all modern LLMs: better in some domains, worse in others.

Edit: and it will probably also lead in most major benchmarks, which says next to nothing about the quality.

fastThinking · 9 days ago

Being ahead of Google is less about raw model quality and more about shipping usable products fast. Anthropic’s advantage seems organizational as much as technical. If Sonnet 5 really halves inference cost while improving reasoning, that’s more disruptive than any benchmark win.

RivieraKid · 9 days ago

Aren't people worried about their jobs? I'm surprised that this aspect is almost entirely missing in threads like this.

zhshsha · 9 days ago

Have you actually used these tools?

My CTO is pushing 30k-line PRs and when asked “how do you know it works” all he can say is “I’m not sure but it probably does. Our customers can QA”. Meanwhile I’m cleaning up half-vibed messes from my coworkers that demo’d well.

They’re very powerful, but I think their marketing departments are even more powerful. I do wonder how many of these comments are real people.

JamesSwift · 7 days ago

Alternatively:

Have you actually used these tools?

They work like magic but are not magical. There's a skill in using them well and getting good output. It's not just an automatic free lunch button. Good engineers become great engineers and the gap widens, as juniors/outsourcing get pushed out of the market.

Gud · 8 days ago

I use Claude and ChatGPT to build a highly complex piece of software. It is working, and it is working well.

It probably helps that I am a skilled programmer while being an expert in another domain.

My software development has gone from write/read/execute to read/execute and I’m fine with that.

RivieraKid · 9 days ago

Not much, only the non-paid non-agent stuff. It's pretty impressive but my estimate is at best a 2x productivity jump for general use.

My worry is that the agentic stuff is reportedly a significant improvement and getting better quickly.

panarky · 9 days ago

Jobs are poison.

Why is everyone so worried about poison going away?

OutOfHere · 8 days ago

So what's your alternative to jobs? Don't say free money (UBI) because that will never happen in the US.

palmotea · 9 days ago

> Jobs are poison.

> Why is everyone so worried about poison going away?

Hello, aloof galaxy-brain. If you weren't aware, we live in a capitalist economy. In capitalism, if you don't have money you are in a state we call "poor," which means your life is difficult and your living standards bad. Most people rely on having a job to make money, and usually need a well-paying job to be comfortable and secure.

AI isn't going to change any of that. It's not going to make energy, land, or housing more abundant; in fact, it will probably make all of those things more scarce.

If the future doesn't have "jobs," there will probably be a holocaust of workers before we get there. IMHO, that's more likely than some kind of utopian techno-socialism.

keyle · 9 days ago

Say, you can vibe design your next house.

Would you want that?

Isn't a house personal enough that you'd want a professional architect with experience to design it and sign off on it? Even if they used advanced tools like CAD and copy-pasted 8/10 of it?

Sure, you can probably one-shot notepad.exe, but it has no meaning. Meaningful work isn't going anywhere, because meaningful work is done by people, for people.

No one wants a vibe-designed car, unless you are one of those psychos who has no taste and doesn't care about anything.

thedevilslawyer · 9 days ago

Have you worked with a professional architect? Costs add up fast, and you get 1-2 iterations.

I'd love to vibecode the house to my full liking, assuming that the agent harness will take care of all the nonfunctional things (stable design, zoning, etc.). Same for a car: if I could customize it, I would.

(I definitely don't like the ramifications of it on the economy/jobs, but the above are pure consumer wins, no doubt)

johnstenner · 8 days ago

I absolutely would want a vibe designed house or car. Vibe to completion, then have a now higher paid expert architect/engineer review the plan to confirm it is safe to proceed.

Until we can reliably trust these models to skip that last step, there is nothing inherently wrong with using AI to create, so long as it is manually reviewed for accuracy at the end.

someguyiguess · 8 days ago

You guys still have jobs???

spants · 9 days ago

Control AI or be controlled by it.

Learn everything that you can about AI and you will be a great resource. Otherwise, learn a trade. Electricians will be required...

RivieraKid · 9 days ago

The percentage of people employed in agriculture dropped from 80% to 2%. The market will be full of people who are willing to learn everything about AI in order to have a comfortable and highly paid job.

Becoming an electrician would be a downgrade or even impossible for some people.

For the record, I think AI replacing highly paid "sitting behind a computer" jobs would be good for society, but probably not for most people who have these jobs.

thomasfromcdnjs · 9 days ago

I keep trying to use Codex CLI, but I love using claude --dangerously-skip-permissions, and that seems impossible to do in codex: it just asks me to approve every command per session. Am I taking crazy pills, or is there a way to make codex just run in yolo mode?

lostmsu · 9 days ago

--yolo

You could have found it in --help.

solumunus · 9 days ago

How long is a generation with LLMs, 6 months?

column · 9 days ago

In 2022 Midjourney's CEO said anything they released then would be obsolete in 6 months' time. That seemed wild, but he was right.

touwer · 9 days ago

The article itself seems to be written using an LLM from 1950.

Havoc · 9 days ago

It’s definitely going to be a busy month in model land. Loads of new stuff is scheduled to drop.

I think it’s premature to say what’s going to beat what, though.

tajd · 9 days ago

What are the key references for this article? There was a tweet but also a screenshot of an error code in Vertex AI, right?