The GPT-5 Launch Was Concerning

hodgehog11 · 7 months ago

Yes, GPT-5 is more of an iteration than anything else, and to me this says more about OpenAI than the rest of the industry. However, I think the majority of the improvements over the past year have been difficult to quantify using benchmarks. Users often talk about how certain models "feel" smarter on their particular tasks, and we won't know if the same is true for GPT-5 until people use it for a while.

The "GPT-5 will show AGI" hype was always a ridiculously high bar for OpenAI, and I would argue that the quest for that elusive AGI threshold has been an unnecessary curse on machine learning and AI development in general. Who cares? Do we really want to replace humans? We should want better and more reliable tools (like Claude Code) to assist people, and maybe cover some of the stuff nobody wants to do. This desire for "AGI" is delivering less value and causing us to put focus on creative tasks that humans actually want to do, putting added stress on the job market.

The one really bad sign in the launch, at least to me, was that the developers were openly admitting that they now trust GPT-5 to develop their software MORE than themselves ("more often than not, we defer to what GPT-5 says"). Why would you be proud of this?

Someone · 7 months ago

> Users often talk about how certain models "feel" smarter on their particular tasks, and we won't know if the same is true for GPT-5 until people use it for a while.

The idea that models “feel” smarter may be 100% human psychology. If you invest in a new product, admitting that it isn’t better than what you had is hard for humans. So, if users say a model “feels” smarter, we won’t know that it really is smarter.

Also, if users manage to improve quality of responses after using it for a while, who says they couldn’t have reached similar results if they stayed using the old tool, tweaking their prompts to make that model perform better?

9rx · 7 months ago

> Do we really want to replace humans?

AGI doesn't really replace humans, it merely provides a unified model that can be hooked up to carry out any number of tasks. Fundamentally no different than how we already write bespoke firmware for every appliance, except instead of needing specialized code for each case, you can simply use the same program for everything. To that extent, software developers have always been trying to replace humans — so the answer from the HN crowd is a resounding yes!

> We should want better and more reliable tools

Which is what AGI enables. AGI isn't a sentience that rises up to destroy us. There may be some future where technology does that, but that's not what we call AGI. As before, it is no different than us writing bespoke software for every situation, except instead of needing a different program for every situation, you have one program that can be installed into a number of situations. Need a controller for your washing machine? Install the AGI software. Need a controller for your car's engine? Install the same AGI software!

It will replace the need to write a lot of new software, but I suppose that is ultimately okay. Technology replaced the loom operator, and while it may have been devastating to those who lost their loom operator jobs, is anyone today upset about not having to operate a loom? We found even more interesting work to do.

hodgehog11 · 7 months ago

> Which is what AGI enables.

I appreciate the well-crafted response, but respectfully disagree with this sentiment, and I think it's a subtle point. Remember the no free lunch theorems: no general program will be the best at all tasks. Competent LLMs provide an excellent prior from which a compelling program for a particular task can be obtained by finetuning. But this is not what OpenAI, Google, and Anthropic (to a lesser extent) are interested in, as they don't really facilitate it. It's never been a priority.

They want to create a digital entity for the purpose of supremacy. Aside from DeepMind, these groups really don't care about how this tech can assist in problems that need solving, like drug discovery or climate prediction or discovery of new materials (e.g. batteries) or automation of hell jobs. They only care about code assistance to accelerate their own progress. I talk to their researchers at conferences and it frustrates me to no end. They want to show off how "human-like" their model is, how it resembles humans in creative writing and painting, how it beats humans on fun math and coding competitions that were designed for humans with a limited capacity to memorize, how it provides "better" medical opinions than a trained physician. That last use case is pushing governments to outlaw LLMs for medicine entirely.

A lab that claims to push toward AGI is not interested in assisting mankind toward a brighter future. They want to be the first for bragging rights, hype, VC funding, and control.

usefulcat · 7 months ago

> Why would you be proud of this?

Isn't it obvious? They have a huge vested interest in getting people to believe that it's very useful, capable, etc.

bluefirebrand · 7 months ago

> Do we really want to replace humans?

Unfortunately for a substantial number of people the answer to this question seems to be a resounding "yes"

gtirloni · 7 months ago

With those people being business owners, investors, etc, 100% of the time.

The other 99% would like automation to make their lives easier. Who wouldn't want the promised tech utopia? Unfortunately, that's not happening so it's understandable that people are more concerned than joyous about AI.

scrozart · 7 months ago

The devs have been co-opted into marketing roles now, too - they have to say it's that good to keep the money coming in. IMO this reinforces the original post - this all feels like a scramble.

Whether it's indicative of patterns beyond OpenAI remains to be seen, but I don't expect much originality from tech execs.

mutkach · 7 months ago

The focus now is not the model, but the Product - "here we improve the usuability by removing the choice between models", "here is a better voice for tts", "here is a nice interface for previewing html"

Only about 5 minutes of the whole presentation are dedicated to enterprise usage (COO in an interview sort of indirectly confirms that haven't figured it out yet). And they are cutting the costs already (opaque routing between models for non-API users is a clear sign of that). The term "AGI" is dropped, no more exponential scaling bullshit - just incremental changes over the time and only over select few domains. Actually it is a more welcoming sign and not concerning at all that this technology matures and crystallizes around this point. We will charitably forget and forgive all the insane claims made by Sam Altman in the previous years. He can also forget about cutting ties with Microsoft for that same reason.

baggachipz · 7 months ago

> ...when they need to find any way possible to squeeze paid subscribers out of their (money losing) free user base.

Also note that they're losing money on their paid subscribers.

xnx · 7 months ago

Given the difference between GPT-3 and GPT-4, a fair numbering for "GPT-5" is probably "GPT-4.2".

ndr_ · 7 months ago

Some of the problems with GPT-5 in ChatGPT could actually be due to new model that is in place to route requests to the actual GPT-5 models. There are four models in the GPT-5 family, and I could reproduce the faulty "blueberry" test result only with the "gpt-5-chat" (aka "gpt-5-main") model through the API. This model is there to answer (near) instantly and it falls in the non-thinking category of LLMs. The "blueberry" test represents what they are particularly bad at (and what OpenAI set out to solve with o1). The other thinking models in the family, including gpt-5-nano, solve this correctly.

profstasiak · 7 months ago

so can we please stop talking of AGI until counting letter in a word are not hard?

Havoc · 7 months ago

The messaging is all over the place anyway. Not so long ago OAI was talking about faster iterations and warning people to not expect huge leaps. (A position that makes sense imo). Yet people talk about AGI in a serious manner?

gtirloni · 7 months ago

> We are now confident we know how to build AGI as we have traditionally understood it. We believe that, in 2025, we may see the first AI agents "join the workforce" and materially change the output of companies. We continue to believe that iteratively putting great tools in the hands of people leads to great, broadly-distributed outcomes.

"Reflections" by Sam Altman, January 2025 - https://blog.samaltman.com/reflections

throwfaraway4 · 7 months ago

"AGI achieved internally" -- Sam Altman

infecto · 7 months ago

Yes a quote from a meme post on Reddit. No doubt he has been overselling the future for a while but why use a quote with the wrong context?

nocoiner · 7 months ago

“Hidden, poorly internally labeled fiat@ account”

coldpie · 7 months ago

I don't think anyone serious is talking about AGI from LLMs, no.

AlexandrB · 7 months ago

Only if you consider Sam Altman not serious: https://www.tomsguide.com/ai/chatgpt/sam-altman-claims-agi-i...

I find this pattern in tech hype really frustrating. Someone in a leadership role in a major tech company/VC promises something outrageous. Time passes and the promise never materializes. People then retcon the idea that "everybody knew that wasn't going to happen". Well either "everybody" doesn't include Elon Musk[1], Sam Altman, or Marc Andreessen[2] or these people are liars. No one seems to be held to their track record of being right or wrong, instead people just latch on to the next outrageous promise as if the previous one was fulfilled.

[1] https://electrek.co/2025/03/18/elon-musk-biggest-lie-tesla-v...

[2] https://dailyhodl.com/2022/06/01/billionaire-and-tech-pionee...

phist_mcgee · 7 months ago

And yet altman talks about AGI being imminent, but his company has only ever produced LLMs.

rvz · 7 months ago

Am I right to say that "AGI" was just...cancelled again?

Did we just get scammed right in front of our eyes with an overhyped release and what is now an underwhelming model if the point was that GPT-5 was supposed to be trustworthy enough for serious use-cases and it can't even count or reason about letters?

So much for the "AGI has been achieved internally" nonsense with the VCs and paid shills on X/Twitter bullshitting about the model before the release.

mutkach · 7 months ago

Not only "AGI" is cancelled but they also sort of admitted that so-called "scaling" "laws" don't work anymore. Scaling inference kinda still works, but obviously is bounded by context size and haystack-and-needle diminishing accuracy. So the promise of even steadily moving towards AGI is dubious at best.

baobun · 7 months ago

How were you still eating that?

rvz · 7 months ago

Never was. Knew it was a redefinition confidence trick years ago.

davydm · 7 months ago

Issues like this are why I don't use ai agents for code. I don't want to sift through the bullshit confidently spewed out by the model.

It doesn't understand anything. It can't possibly "understand my codebase". It can only predict tokens, and it can only be useful if the pattern has been seen before. Even then, it will product buggy replicas, which I've pointed out during demos. I disabled the ai helpers in my IDEs because the slop the produce is not high quality code, often wrong, often misses what I wanted to achieve, often subtly buggy. I don't have the patience to deal with that, and I don't want to waste the time on it.

Time is another aspect of this conversation, with people claiming time wins, but the data not backing it up, possibly due to a number of factors intrinsic to our squishy evolved brains. If you're interested, go find gurwinder's article on social media and time - I think the same forces are at work in the ai-faithful.

mrits · 7 months ago

There is a threshold that every developer needs for them to make it be worth their time. For me that has already been met. Your comment makes me think that you don't believe it will start producing higher quality code than you anytime soon.

I think most of us are in the camp that even though we don't need AI right now we believe we will not be valuable in the near future without being highly proficient with the tooling.

bluefirebrand · 7 months ago

> even though we don't need AI right now we believe we will not be valuable in the near future

This reads to me like you don't think you're valuable right now either