I've seen a rumor going around that OpenAI hasn't had a successful pre-training run since mid 2024. This seemed insane to me but if you give ChatGPT 5.1 a query about current events and instruct it not to use the internet it will tell you its knowledge cutoff is June 2024. Not sure if maybe that's just the smaller model or what. But I don't think it's a good sign to get that from any frontier model today, that's 18 months ago.
The SemiAnalysis article that you linked to stated:
"OpenAI’s leading researchers have not completed a successful full-scale pre-training run that was broadly deployed for a new frontier model since GPT-4o in May 2024, highlighting the significant technical hurdle that Google’s TPU fleet has managed to overcome."
Given the overall quality of the article, that is an uncharacteristically convoluted sentence. At the risk of stating the obvious, "that was broadly deployed" (or not) is contingent on many factors, most of which are not of the GPU vs. TPU technical variety.
This is a really great breakdown. With TPUs seemingly more efficient and costing less overall, how does this play for Nvidia? What's to stop them from entering the TPU race with their $5 trillion valuation?
It's not a rumor, it's confirmed by OpenAI. All "models" since 4o are actually just optimizations in prompting and a new routing engine. The actual -model- you are using with 5.1 is 4. Nothing has been pre-trained from scratch since 4o.
Their own press releases confirm this. They call 5 their best new "ai system", not a new model
I can believe this, Deepseek V3.2 shows that you can get close to "gpt-5" performance with a gpt-4 level base model just with sufficient post-training.
I don't think that counts as confirmation. We know 4.5 was a new base model. I find it very, very unlikely that the base model of 4 (or 4o) is in GPT-5. Also, 4o is a different base model from 4, right? It's multimodal, etc. Pretty sure people have leaked sizes etc. and I don't think it matches up.
New AI system doesn't preclude new models. I thought when GPT 5 launched and users hated it the speculation was GPT 5 was a cost cutting model and the routing engine was routing to smaller, specialized dumber models that cost less on inference?
It certainly was much dumber than 4o on Perplexity when I tried it.
Maybe this is just armchair bs on my part, but it seems to me that the proliferation of AI-spam and just general carpet bombing of low effort SEO fodder would make a lot of info online from the last few years totally worthless.
Hardly a hot take. People have theorized about the ouroboros effect for years now. But I do wonder if that’s part of the problem
Every so often I try out a GPT model for coding again, and manage to get tricked by the very sparse conversation style into thinking it's great for a couple of days (when it says nothing and then finishes producing code with an 'I did x, y and z' and no stupid 'you're absolutely right' sucking up, and it works, it feels very good).
But I always realize it's just smoke and mirrors - the actual quality of the code and the failure modes and stuff are just so much worse than claude and gemini.
I am a novice programmer -- I have programmed for 35+ years now but I build and lose the skills moving between coder to manager to sales -- multiple times. Fresh IC since last week again :) I have coded starting with Fortran, RPG and COBOL and I have also coded Java and Scala. I know modern architecture but haven't done enough grunt work to make it work or to debug (and fix) a complex problem. Needless to say sometimes my eyes glaze over the code.
I write some code for my personal enjoyment, and I gave it to Claude 6-8 months back for improvement. It gave me a massive change log that was quite risky, so I abandoned it.
I tried this again with Gemini last week, I was more prepared and asked it to improve class by class, and for whatever reasons I got better answers -- changed code, with explanations, and when I asked it to split the refactor in smaller steps, it did so. Was a joy working on this over the thanksgiving holidays. It could break the changes in small pieces, talk through them as I evolved concepts learned previously, took my feedback and prioritization, and also gave me nuanced explanation of the business objectives I was trying to achieve.
This is not to downplay Claude; that is just the order in which things happened. So while it may or may not work well for experienced programmers, it is such a helpful tool for people who know the domain or the concepts (or both) and struggle with details, since the tool can iron out a lot of details for you.
My goal now is to have another project for winter holidays and then think through 4-6 hour AI assisted refactors over the weekends. Do note that this is a project of personal interest so not spending weekends for the big man.
I'm starting with Claude at work but did have an okay experience with OpenAI so far. For clearly delimited tasks it does produce working code more often than not. I've seen some improvement on their side compared to say, last year. For something more complex and not clearly defined in advance, yes, it does produce plausible garbage and it goes off the rails a lot. I was migrating a project and asked ChatGPT to analyze the original code base and produce a migration plan. The result seemed good and encouraging because I didn't know much about that project at that time. But I ended up taking a different route and when I finished the migration (with bits of help from ChatGPT) I looked at the original migration plan out of curiosity since I had become more familiar with the project by now. And the migration plan was an absolutely useless and senseless hallucination.
On the contrary, I cannot use the top Gemini and Claude models because their outputs are so out of place and hard to integrate with my code bases. The GPT 5 models integrate with my code base's existing patterns seamlessly.
At this point you are now forced to use the "AI"s as code search tools--and it annoys me to no end.
The problem is that the "AI"s can cough up code examples based upon proprietary codebases that you, as an individual, have no access to. That creates a significant quality differential between coders who only use publicly available search (Google, Github, etc.) vs those who use "AI" systems.
Same experience here. The more commonly known the stuff it regurgitates is, the fewer errors. But if you venture into RF electronics or embedded land, beware of it turning into a master of bs.
Which makes sense for something that isn’t AI but an LLM.
I recall reading that Google had similar 'delay' issues when crawling the web in 2000 and early 2001, but they managed to survive. That said, OpenAI seems much less differentiated (now) than Google was back then, so this may be a much riskier situation.
The 25x revenue multiple wouldn't be so bad if they weren't burning so much cash on R&D and if they actually had a moat.
Google caught up quick, the Chinese are spinning up open source models left and right, and the world really just isn't ready to adopt AI everywhere yet. We're in the premature/awkward phase.
They're just too early, and the AGI is just too far away.
Doesn't look like their "advertising" idea to increase revenue is working, either.
I noticed this recently when I asked it whether I should play Indiana Jones on my PS5 or PC with a 9070 XT. It assumed I had made a typo until I clarified, then it went off to the internet and came back telling me what a sick rig I have.
OpenAI is the only SOTA model provider that doesn't have a cutoff date in the current year. That's why it performs badly at writing code for any new libraries or libraries that have had significant updates like Svelte.
State Of The Art is maybe a bit exaggerated. It's more like an early model that never really adapted, and only got watered down (smaller network, outdated information, and you cannot see thought/reasoning).
Also their models get dumber and dumber over time.
I asked ChatGPT 5.1 to help me solve a silly installation issue with the codex command line tool (I’m not an npm user and the recommended installation method is some kludge using npm), and ChatGPT told me, with a straight face, that codex was discontinued and that I must have meant the “openai” command.
The fundamental problem with bubbles like this is that you get people like this who are able to take advantage of the Gell-Mann amnesia effect, except the details that they’re wrong about are so niche that there’s a vanishingly small group of people who are qualified to call them out on it, and there’s simultaneously so much more attention on what they say because investors and speculators are so desperate and anxious for new information.
I followed him on Twitter. He said some very interesting things, I thought. Then he started talking about the niche of ML/AI I work near, and he was completely wrong about it. I became enlightened.
Funny, had it tell me the same thing twice yesterday and that was _with_ thinking + search enabled on the request (it apparently refused to carry out the search, which it does once in every blue moon).
I didn't make this connection that the training data is that old, but that would indeed augur poorly.
Just a minor correction, but I think it's important because some comments here seem to be giving bad information: OpenAI's model site says that the knowledge cutoff for gpt-5 is Sept 30, 2024 (https://platform.openai.com/docs/models/compare), which is later than the June 01, 2024 date of GPT-4.1.
Now I don't know if this means that OpenAI was able to add that 3 months of data to earlier models by tuning or if it was a "from scratch" pre-training run, but it has to be a substantial difference in the models.
Pre-training is just training, it got the name because most models have a post-training stage so to differentiate people call it pre-training.
Pre-training: You train on a vast amount of data, as varied and high quality as possible, this will determine the distribution the model can operate with, so LLMs are usually trained on a curated dataset of the whole internet, the output of the pre-training is usually called the base model.
Post-training: You narrow down the task by training on the specific behavior you want from the model. You can do this in several ways:
- Supervised Finetuning (SFT): Training on a strict high quality dataset of the task you want. For example if you wanted a summarization model, you'd finetune the model on high quality text->summary pairs and the model would be able to summarize much better than the base model.
- Reinforcement Learning (RL): You train a separate model that ranks outputs, then use it to rate the output of the model, then use that data to train the model.
- Direct Preference Optimization (DPO): You have pairs of good/bad generations and use them to align the model towards/away from the kinds of responses you want.
Post-training is what makes the models easy to use. The most common form is instruction tuning, which teaches the model to talk in turns, but post-training can be used for anything. E.g. if you want a translation model that always translates a certain way, or a model that knows how to use tools, etc. you'd achieve all that through post-training. Post-training is where most of the secret sauce in current models is nowadays.
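To make the DPO bullet above a bit more concrete, here is a minimal sketch of the per-pair loss as it is usually written: the model is rewarded for raising the log-probability of the good response, relative to a frozen reference model, by more than it raises the bad one. The function name `dpo_loss`, the argument names, and the default `beta=0.1` are illustrative assumptions, not anything OpenAI has published about its own pipeline.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for a single (good, bad) response pair.

    Each argument is the summed log-probability of a whole response, under
    either the model being trained (policy) or a frozen reference model.
    All names here are hypothetical and chosen for illustration.
    """
    # How much more the policy likes each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)), computed in a numerically stable way.
    if logits > 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))
```

When the policy still equals the reference, the loss is log(2); it drops below that as the policy shifts probability towards the good response and away from the bad one.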
The first step in building a large language model. That's when the model is initialized and trained on a huge dataset to learn patterns and whatnot. The "P" in "GPT" stands for "pre-trained."
> Q: Are the releases aligned with pre-training efforts?
> A: There used to be a time not that long ago, maybe half a year, distant past, where the models would align with RL runs or pretraining runs ... now the naming is by capability. GPT5 is a capable model; 5.1 is a more capable model
I doubt it's that important that their dataset of current events is up to date. At this stage, I believe private and synthetic data comprises a large fraction of pretraining. Web search substitutes for current event pretraining.
I tried OpenAI models for coding in Go, but they constantly say "your syntax is not correct, let me rewrite your whole file without `any`." `any` was introduced in 2022. It takes some time to adopt it in codebases, but they should not be doing stuff like that at the end of 2025.
I’m a GPT‑4‑based model that OpenAI released on March 14 2023.
The underlying training data goes up to early 2023, and the model was trained in the months leading up to that release. If you’re asking about the ChatGPT product that ships the model to users, it went live in November 2022 and has since received updates (GPT‑3.5, GPT‑4, etc.) that keep it current.
I’m a language model created by OpenAI. The current generation (GPT‑4) that powers this chat was first released in March 2023 and has been updated and fine‑tuned up through the end of 2024. My training data runs up to the beginning of June 2025, so I’m built on knowledge available up to that point.
OpenAI is basically just Netscape at this point. An innovative product with no means of significant revenue generation.
On one side it's up against large competitors with an already established user base and product line that can simply bundle their AI offerings into those products. Google will do just what Microsoft did with Internet Explorer and bundle Gemini in for 'Free' with their already other profitable products and established ad-funded revenue streams.
At the same time, Deepseek/Qwen, etc. are open sourcing stuff to undercut them on the other side. It's a classic squeeze on their already fairly dubious business model.
In 2024, OpenAI claimed that the bulk of its revenue, 70-80%, came from consumer ChatGPT subscriptions. That's wildly impressive.
But now they've had an order of magnitude revenue growth. That can't still be consumer subscriptions, right? They've had to have saturated that?
I haven't seen reports of the revenue breakdown, but I imagine it must be enterprise sales.
If it's enterprise sales, I'd imagine that was sold to F500 companies in bulk during peak AI hype. Most of those integrations are probably of the "the CEO has tasked us with `implementing an AI strategy`" kind. If so, I can't imagine they will survive in the face of a recession or economic downturn. To be frank, most of those projects probably won't pan out even under the rosiest of economic pictures.
We just don't know how to apply AI to most enterprise automation tasks yet. We have a long way to go.
I'd be very curious to see what their revenue spread looks like today, because that will be indicative of future growth and the health of the company.
This is pretty much all that OpenAI is at the moment.
Mozilla is a non-profit that is only sustained by the generous wealthy benefactor (Google) to give the illusion that there is competition in the browser market.
OpenAI is a non-profit funded by a generous wealthy benefactor (Microsoft).
Ideas of IPO and profitability are all just pipe dreams in Altman's imagination.
anecdotal, but my wife wasn't interested in switching to claude from chatgpt. as far as she's concerned chatgpt knows her, and she's got her assistant perfectly tuned to her liking.
ChatGPT is to AI as Facebook is to social media. OpenAI captured a significant number of users due to first-mover advantage, but that advantage is long gone now.
this is my horror as well. I don't mind my youtube account to be blocked but what about all the recommendations that I have curated to my liking. It will be huge chunk of lost time to rebuild and insert my preferences into the algorithm. increasingly "our preferences shaped by time and influences and encounters both digital and offline" are as much about us as we are physically.
> Google will do just what Microsoft did with Internet Explorer and bundle Gemini in for 'Free' with their already other profitable products and established ad-funded revenue streams.
“will do”? Is there any Google product they haven't done that with already?
Oh God I love the analogy of OpenAI being Netscape. As someone who was an adult in the 1990s, this is so apt. Companies at that time were trying to build a moat around the World Wide Web. They obviously failed. I've thought that OpenAI too would fail but I've never thought about it like Netscape and WWW.
OpenAI should be looking at how Google built a moat around search. Anyone can write a Web crawler. Lots of people have. But no one else has turned search into the money printing machine that Google has. And they've used that to fund their search advantage.
I've long thought the moat-buster here will be China because they simply won't want the US to own this future. It's a national security issue. I see things like DeepSeek as moat-busting activity and I expect that to intensify.
Currently China can't buy the latest NVidia chips or ASML lithography equipment. Why? Because the US said so. I don't expect China to tolerate this long term and of any country, China has demonstrated the long-term commitment to this kind of project.
Literally got an email this morning from Google, to say my Google One plan now 'includes AI benefits' - including
"More access to Gemini 3 Pro, our most capable model
More access to Deep Research in the Gemini app
Video generation with limited access to Veo 3.1 Fast in the Gemini app
More access to image generation with Nano Banana Pro
Additional AI credits for video generation in Flow and Whisk
Access Gemini directly in Google apps like Gmail and Docs" [Thanks but no thanks]
I know it's been said before but it's slightly insane they're trying to compete on a hot new tech with a company with 1) a top notch reputation for AI and 2) the largest money printer that has ever existed on the planet.
Feel like the end result would always be that while Google is slow to adjust, once they're in the race they're in it.
The problem for Google is that there is no sensible way to monetize this tech and it undercuts their main money source which is search.
On top of that the Chinese seem to be hell-bent on destroying any possible moat the US companies might create by flooding the market with SOTA open-source models.
Although this tech might be good for software companies in general - it does reduce the main cost they have which is personnel. But in the long run Google will need to reinvent itself or die.
Google in 1999 was already far superior to Yahoo and other competitors. I don't think OpenAI is in a similar position there. It seems debatable as to whether they're even the best, let alone a massive leap ahead of everyone else the way Google was.
OpenAI has tons of funnels for their products. Azure’s AI smoke and mirrors offerings use OpenAI behind the scenes, big with enterprise users (who have a lot of money).
It can be bundled for "free" if they raise the price of google workspace. LLMs are right now most valuable as an enterprise productivity software assistant. Very useful to have a full suite of enterprise productivity software in order to sell them.
> Google will do just what Microsoft did with Internet Explorer and bundle Gemini in for 'Free' with their already other profitable products and established ad-funded revenue streams.
Just some numbers to show what OpenAI is against:
GMail users: nearing 2 billion
Youtube MAU: 2.5 billion
active Android devices: 4 billion (!)
Market cap: 3.8 trillion (at a P/E of 31)
So on one side you've got this behemoth with, compared to OpenAI's size, unlimited funding. The $25 bn per year OpenAI is after is basically a parking ticket for Google (only slightly exaggerating). Behemoth who came with Gemini 3 Pro "thinking" and Nano Banana (that name though) who are SOTA.
And on the other side you've got the open-source weights you mentioned.
When OpenAI had its big moment HN was full of comments about how it was game over for Google because search was done for. Three years later and the best (arguably) model gives the best answer when you search... Using Google search.
Funny how these things turn out.
Google is atm the 3rd biggest cap in the world: only Apple and NVidia are slightly ahead. If Google is serious about its AI chips (and it looks like they are) and see the fuck-ups over fuck-ups by Apple, I wouldn't be surprised at all if Alphabet was to regain the number one spot.
That's the company OpenAI is fighting: a company that's already been the biggest cap in the entire world and that's probably going to regain that spot rather sooner than later and that happens to have crushed every single AI benchmark when Gemini 3 Pro came out.
I had a ChatGPT subscription. Now I'm using Gemini 3 Pro.
It is insignificant when they're spending more than $115bn to offer their service. And yes, I say "more than," not because I have any inside knowledge but because I'm pretty sure $115bn is a "kind" estimate and the expenditure is probably higher. But either way, they're running at a loss. And for a company like them, that loss is huge. Google could take the loss as could Microsoft or Amazon because they have lots of other revenue sources. OAI does not.
Google is embedding Gemini into Chrome Developer Tools. You can ask for an analysis of individual network calls in your browser by clicking a checkbox. That's just an example of the power of platform. They seem to be better at integration than Microsoft.
OpenAI has this amazing technology and a great app, but the company feels like some sort of financial engineering nightmare.
We live in crazy times, but given what they’ve spent and committed to that’s a drop in the bucket relative to what they need to be pulling in. They’re history if they can’t pump up the revenue much much faster.
Given that we’re likely at peak AI hype at the moment they’re not well positioned at all to survive the coming “trough of disillusionment” that happens like clockwork on every hype cycle. Google, by comparison, is very well positioned to weather a coming storm.
Every F500 CEO told their team "have an AI strategy ASAP".
In a year, when the economy might be in worse shape, they'll ask their team if the AI thing is working out.
What do you think happens to all the enterprise OpenAI contracts at that point? (Especially if the same tech layperson CEOs keep reading Forbes and hearing Scott Galloway dump on OpenAI and call the AI thing a "bubble"?)
The way I've experienced "Code Red" is mostly as a euphemism for "on-going company-wide lack of focus" and a band-aid for mid-level management having absolutely no clue how to meaningfully make progress, upper management panicking, and ultimately putting engineers and ICs on the spot to bear the brunt of that organizational mess.
Interestingly enough, apart from Google, I've never seen an organization take the actual proper steps (fire mid-management and PMs) to prevent the same thing from happening again. Will be interesting to see how OAI handles this.
> fire mid-management and PMs to prevent the same thing from happening again
Firing PMs and mid-management would not prevent any of the code reds you may have read about from Google or OAI lately. This is a very naive perspective of how decision making is done at the scale of those two companies. I'm sorry you had bad experiences working with people in those positions and I hope you have the opportunity to collab with great ones in the future.
Yeah the reflexive anti-PM anti-management stance posted above is typical here and of devs in general.
In theory, some engineers think they are perfectly capable of doing all the PM's work and all their own.
If they’ve never worked with a truly good PM, that’s a shame, they’d likely get more work done in a more timely fashion. I’ve worked with around 10 different PMs, the best kept stuff on track and aided with collaboration, reqs management, soft skills, handling tough customers, etc. They free up devs to do more dev work and less other work.
>I've never seen an organization take the actual proper steps (fire mid-management and PMs) to prevent the same thing from happening again.
Only one time in my entire career have I seen this done, and it is as successful as you imagine it to be. Lots of weird problems coming out from having done it, but those are being treated as "Wow we are so glad we know about this problem" rather than "I hope those idiots come back to keep pulling the wool over my eyes".
There should already be a single priority for a company...
Why is the bar so low for the billionaire magnate fuck ups? Might as well implement workplace democracy and be done with it, it can't be any worse for the company and at least the workers understand what needs to be done.
People sure are quick to say, "I hope you get to work with better management". Man, me too, but I find that dismissive of a legitimate concern: There is A LOT of incompetent management, especially in enterprise. The sad truth is that it is often the blind leading the sighted. When I was growing up I thought that the manager was someone with experience doing the job of those they managed, but across jobs I've had, this is the case about 20% of the time.
This code red also has the convenient benefit of giving an excuse to stop work on more monetization features... Which, when implemented, would have the downside of tethering OpenAI's valuation to reality.
The one successful example I can think of is Bill Gates writing a memo to re-orient Microsoft to put the Internet at the center of everything they were doing.
Your proper steps are also missing out on firing the higher level executives. But then new ones would be hired, a re-org will occur, and another Code Red will occur in a few months
The real code red here is less that Google just one-upped OpenAI but that they demonstrated there’s no moat to be had here.
Absent a major breakthrough all the major providers are just going to keep leapfrogging each other in the most expensive race to the bottom of all time.
Good for tech, but a horrible business and financial picture for these companies.
They’re absolutely going to get bailed out and socialize the losses somehow. They might just get a huge government contract instead of an explicit bailout, but they’ll weasel out of this one way or another and these huge circular deals are to ensure that.
>They’re absolutely going to get bailed out and socialize the losses somehow.
I've had that uneasy feeling for a while now. Just look at Jensen and Nvidia -- they're trying to get their hooks into every major critical sector as they're able to (Nokia last month, Synopsys just recently). When chickens come home to roost, my guess is that they'll pull out the "we're too big to fail, so bailout pls" card.
Crazy times. If only we had regulators with more spine.
Many retirement accounts/managers may already be channeling investment such that 401k accounts are broadly set up to absorb any losses… Could also just be this large piece of tin foil on my head.
Absolutely. And they will figure out how to bankrupt any utilities and local governments they can in the process by offloading as much of their cost overhead for power generation as possible and shopping for tax rebates.
It will be the biggest bailout in history and financed entirely by money printing at a time when the stability of the dollar is already being questioned, right? Not good.
Absolutely. I don't understand why investors are excited about getting into a negative-margin commodity. It makes zero sense.
I was an OpenAI fan from GPT 3 to 4, but then Claude pulled ahead. Now Gemini is great as well, especially at analyzing long documents or entire codebases. I use a combination of all three (OpenAI, Anthropic & Google) with absolutely zero loyalty.
I think the AGI true believers see it as a winner-takes-all market as soon as someone hits the magical AGI threshold, but I'm not convinced. It sounds like the nuclear lobby's claims that they would make electricity "too cheap to meter."
It's the same reason for investing in every net-loss high-valuation tech startup of the past decade. They're hoping they'll magically turn into Google, Apple, Netflix, or some other wealthy tech company. But they forget that Google owns the ad market, Apple owns the high-end/lifestyle computer market, and Netflix owns tv/movie habit analytics.
Investors in AI just don't realize AI is a commodity. The AI companies' lies aren't helping (we will not reach AGI in our lifetimes). The bubble will burst if investors figure this out before they successfully pivot (and they're trying damn hard to pivot).
> I don't understand why investors are excited about getting into a negative-margin commodity. It makes zero sense.
Long term, yes. But Wall Street does not think long term. Short or medium term, you just need to cash out to the next sucker in line before the bubble pops, and there are fortunes to be made!
Maybe there's no tangible moat still, but did Gemini 3's exceptional performance actually funnel users away from ChatGPT? The typical Hacker News reader might be aware of its good performance on benchmarks, but did this convert a significant number of ChatGPT users to Gemini? It's not obvious to me either way.
Definitely. The fact that they inject it into Google Search means that people who have never used ChatGPT, or who just used it as a "smarter" Google search, will simply use the search function directly. It is terrible for actually detailed information, i.e. debugging errors, but summarizing basic searches that would have taken 2-3 clicks on the results is handled directly after the search. I feel bad for the website hosts who actually want visitors instead of visibility.
Anecdotally yes. Since launch I’ve observed probably 50% of the folks that were “ChatGPT this, ChatGPT that” all the time suddenly talking about Gemini non-stop. The more that gets rolled into Google’s platform the less point there is in using separate tooling from OpenAI. There’s a reason Sam is calling this “code red.”
Especially if we're approaching a plateau, in a couple years there could be a dozen equally capable systems. It'll be interesting to see what the differentiators turn out to be.
As history in tech shows, you don’t need everyone copying you to be in big trouble, just one or two well positioned players. Typically that has been a big established player just adding your “special sauce product” as a feature to an existing well established product. That’s exactly what’s playing out now and why OpenAI is starting to panic, as they know how that movie typically ends.
(My apologies if this was already asked - this thread is huge and Find-In-Page-ing for variations of "pre-train", "pretrain", and "train" turned up nothing about this. If this was already asked I'd super-appreciate a pointer to the discussion :) )
Genuine question: How is it possible for OpenAI to NOT successfully pre-train a model?
I understand it's very difficult, but they've already successfully done this and they have a ton of incredibly skilled, well-paid and highly knowledgeable employees.
I get that there's some randomness involved but it seems like they should be able to (at a minimum) just re-run the pre-training from 2024, yes?
Maybe the process is more ad-hoc (and less reproducible?) than I'm assuming? Is the newer data causing problems for the process that worked in 2024?
Any thoughts or ideas are appreciated, and apologies again if this was asked already!
> Genuine question: How is it possible for OpenAI to NOT successfully pre-train a model?
The same way everyone else fails at it.
Change some hyper parameters to match the new hardware (more params), maybe implement the latest improvements in papers after it was validated in a smaller model run. Start training the big boy, loss looks good, 2 months and millions of dollars later loss plateaus, do the whole SFT/RL shebang, run benchmarks.
It's not much better than the previous model, very tiny improvements, oops.
add to it multiple iterations of having to restart pretraining from some earlier checkpoint when loss plateaus too early or starts increasing due to some bugs…
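A rough sketch of the "loss plateaus or starts increasing, go back to an earlier checkpoint" step described above. Everything here is a hypothetical toy: the function name, the sliding-window rule and the thresholds are assumptions for illustration, and real training stacks use far more elaborate monitoring.

```python
def find_rollback_step(losses, checkpoint_every, window=2, min_improvement=0.01):
    """Return the checkpoint step to restart from, or None if training looks healthy.

    losses[i] is the training loss logged at step i. A stall is flagged at the
    first point where the mean of the latest `window` losses is not at least
    `min_improvement` below the mean of the `window` losses before them; this
    also catches a loss that has started to increase. The result is the most
    recent checkpoint at or before the start of the stalled window.
    """
    for step in range(2 * window, len(losses) + 1):
        previous = sum(losses[step - 2 * window: step - window]) / window
        recent = sum(losses[step - window: step]) / window
        if previous - recent < min_improvement:
            stall_start = step - window
            return (stall_start // checkpoint_every) * checkpoint_every
    return None
```

For example, with checkpoints every 4 steps and a loss curve that flattens out at 2.0 from step 4 onward, `find_rollback_step([4.0, 3.0, 2.5, 2.2, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0], 4)` points back to the checkpoint at step 4.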
I’m not sure what ‘successfully’ means in this context. If it means training a model that is noticeably better than previous models, it’s not hard to see how that is challenging.
You don't train the next model by starting with the previous one.
A company's ML researchers are constantly improving model architecture. When it's time to train the next model, the "best" architecture is totally different from the last one. So you have to train from scratch (mostly... you can keep some small stuff like the embeddings).
The implication here is that they screwed up bigly on the model architecture, and the end result was significantly worse than the mid-2024 model, so they didn't deploy it.
I can not say how big ML companies do it, but from personal experience of training vision models, you can absolutely reuse the weights of barely related architectures (add more layers, switch between different normalization layers, switch between separable/full convolution, change activation functions, etc.). Even if the shapes of the weights do not match, just do what you have to do to make them fit (repeat or crop). Of course the models will not work right away, but training will go much faster. I usually get over 10 times faster convergence that way.
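For anyone who wants to see the "repeat or crop" trick concretely, here is a minimal sketch in plain Python. It assumes 2-D weights stored as nested lists; the helper names `fit_axis` and `fit_matrix` and the example numbers are hypothetical, and a real framework would do the same thing on tensors.

```python
def fit_axis(values, target_len):
    """Crop or cyclically repeat a list so it has exactly target_len items."""
    if not values:
        raise ValueError("cannot resize an empty list")
    return [values[i % len(values)] for i in range(target_len)]


def fit_matrix(old, rows, cols):
    """Resize a 2-D weight matrix (list of rows) to rows x cols by crop/repeat."""
    resized_rows = [fit_axis(row, cols) for row in old]
    # Copy each row so repeated rows do not share the same list object.
    return [list(row) for row in fit_axis(resized_rows, rows)]


# Hypothetical old layer: 2 output units, 3 inputs.
old_weights = [[0.1, 0.2, 0.3],
               [0.4, 0.5, 0.6]]

# Hypothetical new layer: 3 output units (repeat a row), 2 inputs (crop a column).
new_weights = fit_matrix(old_weights, rows=3, cols=2)
print(new_weights)
```

The resized weights are only a starting point: as the comment says, the model will not work right away, and the claimed benefit is faster convergence once training resumes.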
Huh - I did not know that, and that makes a lot of sense.
I guess "Start software Vnext off the current version (or something pretty close)" is such a baseline assumption of mine that it didn't occur to me that they'd be basically starting over each time.
> the company will be delaying initiatives like ads, shopping and health agents, and a personal assistant, Pulse, to focus on improving ChatGPT
There's maybe like a few hundred people in the industry who can truly do original work on fundamentally improving a bleeding-edge LLM like ChatGPT, and a whole bunch of people who can do work on ads and shopping. One doesn't seem to get in the way of the other.
I think it's a matter of public perception and user sentiment. You don't want to shove ads into a product that people are already complaining about. And you don't want the media asking questions like why you rolled out a "health assistant" at the same time you were scrambling to address major safety, reliability, and legal challenges.
Far be it from me to backseat drive for Sam Altman, but is the problem really that the core product needs improvement, or that it needs a better ecosystem? I can't imagine people are choosing their chatbots based on providing the perfect answers; it's what you can do with it. I would assume Google has the advantage because it's built into a tool people already use every day, not because it's nominally "better" at generating text. Didn't people prefer ChatGPT 4 to 5 anyway?
ChatGPT's thing always seems to have been to be the best LLM, hence the most users without much advertising and the most investment money to support their dominance. If they drop to second or third best it may cause them problems because they rely on investor money to pay the rather large bills.
Currently they are not #1 in any of the categories on LLM arena, and even on user numbers where they have dominated, Google is catching up, 650m monthly for Gemini, 800m for ChatGPT.
There are two layers here: 1) low level LLM architecture 2) applying low level LLM architecture in novel ways. It is true that there are maybe a couple hundred people who can make significant advances on layer 1, but layer 2 constantly drives progress on whatever level of capability layer 1 is at, and it depends mostly on broad and diverse subject matter expertise, and doesn't require any low level ability to implement or improve on LLM architectures, only understanding how to apply them more effectively in new fields. The real key thing is finding ways to create automated validation systems, similar to what is possible for coding, that can be used to create synthetic datasets for reinforcement learning. Layer 2 capabilities do feed back into improved core models, even if you have the same core architecture, because you are generating more and improved data for retraining.
ha what an incredible consumer-friendly outcome! Hopefully competition keeps the focus on improving models and prevents irritating kinds of monetization
If they don't start on ads and shopping, they're going to go out of business.
I'd rather a product that exists with ads, over one that's disappeared.
The fact is, personal subscriptions don't cover the bills if you're going to keep a free tier. Ads do. I don't like it any more than you do, but I'm a realist about it.
>There's maybe like a few hundred people in the industry
My guess is that it's smaller than that. Only a few people in the world are capable of pushing into the unknown and breaking new ground and discoveries
OpenAI has already lined up enormous long-term commitments — over $500 billion through initiatives like Stargate for U.S. data centers, $250 billion in spending on Microsoft Azure cloud services, and tens of billions on AMD’s plan to deliver 6 GW of Instinct GPUs.
Meanwhile, Oracle has financed its role in Stargate with at least $18 billion in corporate bonds plus another $9.6 billion in bank loans, and analysts expect its total capital need for these AI data centers could climb toward $100 billion.
The risk is straightforward: if OpenAI falls behind or can’t generate enough revenue to support these commitments, it would struggle to honor its long-term agreements. That failure would cascade. Oracle, for example, could be left with massive liabilities and no matching revenue stream, putting pressure on its ability to service the debt it already issued.
Given the scale and systemic importance of these projects — touching energy grids, semiconductor supply chains, and national competitiveness — it’s not hard to imagine a future where government intervention becomes necessary. Even though Altman insists he won’t seek a bailout, the incentives may shift if the alternative is a multi-company failure with national-security implications.
"Even though Altman insists he won’t seek a bailout"
No matter what Sam Altman's future plans are, the success of those future plans is entirely dependent on him communicating now that there is a 0% chance those future plans will include a bailout.
OpenAI doesn't have $500 billion in commitments lined up, it's promising to spend that much over 5 years... That's a helluva big difference than having $500B in revenue incoming.
Data centers take time to build. The capital investment to build these DCs is needed now in expectation that future revenue streams will pay for that capital.
> the incentives may shift if the alternative is a multi-company failure with national-security implications.
Sounds like a golden opportunity for GOOG to step over the corpse of OpenAI and take over for cents on the dollar all of the promises the now defunct ex-leader of AI made.
What about OpenAI would rate a bailout? There's too much competition. If they ever do end up in a deep hole and plead for a rescue, I would imagine that gov will just force a sale of assets. Surely Google, MS and Amazon can make use of their infrastructure in exchange for taking on some portion of their debts.
"it would struggle to honor its long-term agreements. That failure would cascade. Oracle, for example, could be left with massive liabilities and no matching revenue stream,"
No, there's a lot of noise about this but these are just 'statements of intent'.
Oracle very intimately understands OpenAI's ability to pay.
They're not banking $50B in chips and then waking up naively one morning to find out OpenAI has no funding.
What will 'cascade' is maybe some sentiment, or analysts expectations etc.
Some of it, yes, will be a problem - but at this point, the data centre buildout is not an OpenAI-driven bet - it's a horizontal bet across tech.
There's not that much risk in OpenAI not raising enough to expand as much as it wants.
Frankly - a CAPEX slowdown will hit US GDP growth and freak people out more than anything.
This is all based on the LLM architecture that likely can't reach AGI.
If they aren't developing in parallel an alternative architecture that can reach AGI, then when one or more companies develop such a new model, OpenAI is toast and all those juicy contracts are kaput.
Heard all the news how Gemini 3 is passing everyone on benchmarks, so quickly tested and still find it a far cry from ChatGPT in real world use when testing questions on both platforms. But importantly the ChatGPT app experience at least for iPhone/Mac users is drastically superior vs Google which feels very Google still. So Gemini would have to be drastically better answer wise than ChatGPT to lure users from a better UI/UX experience to Gemini. But glad to see competition since certainly don't want only one winner in this race.
That's really fascinating. Every real world use case I've tried on Gemini (especially math-related) absolutely slaughtered the performance of ChatGPT in speed and quality, not even close. As an Android user, the Gemini app is also far superior, since the ChatGPT app still doesn't properly display math equations, among plenty of other bugs.
I have to agree with you but I'll remain a skeptic until the preview tag is dropped. I found Gemini 2.5 Pro to be AMAZING during preview and then its performance and quality unceremoniously dropped month after month once it went live. Optimizations in favor of speed/costs no doubt, but it soured me on jumping ship during preview.
Anthropic pulled something similar with 3.6 initially, with a preview that had massive token output and then a real release with barely half -- which significantly curtails certain use cases.
That said, to-date, Gemini has outperformed GPT-5 and GPT5.1 on any task I've thrown at them together. Too bad Gemini CLI is still barely useful and prone to the same infinite loop issues that have plagued it for over a year.
I think Google has genuinely released a preview of a model that leapfrogs all other models. I want to see if that is what actually makes it to production before I change anything major in my workflows.
It's generally anecdotal and vibes when people make claims that some AI is better than another for things they do. There are too many variables and not enough eval for any of it to hold water imo. Personal preferences, experience, brand loyalty, and bias at play too
When I asked both ChatGPT 5.1 Extended Thinking and Gemini 3 Pro Preview High for best daily casual socks both responses were okay and had a lot of the same options, but while the ChatGPT response included pictures, specs scraped from the product pages and working links, the Gemini response had no links. After asking for links, Gemini gave me ONLY dead links.
That is a recurring experience, Gemini seems to be supremely lazy to its own detriment quite often.
A minute ago I asked for best CR2032 deal for Aqara sensors in Norway, and Gemini recommended the long discontinued IKEA option, because it didn't bother to check for updated information. ChatGPT on the other hand actually checked prices and stock status for all the options it gave me.
One might think that benchmarks do not say much about individual usage and that an objective assessment of the performance of AIs is difficult.
At least, thanks to the hype, RAM and SSDs are becoming more expensive, which eats up all the savings from using AI and the profits from increased productivity /s?
> But importantly the ChatGPT app experience at least for iPhone/Mac users is drastically superior vs Google which feels very Google still. So Gemini would have to be drastically better answer wise than ChatGPT to lure users from a better UI/UX experience to Gemini.
Yes, the ChatGPT experience is much better. No, Gemini doesn't need to make a better product to take market share.
I've never had the ChatGPT app. But my Android phone has the Gemini app. For free, I can do a lot with it. Granted, on my PC I do a lot more with all the models via paid API access - but on the phone the Gemini app is fine enough. I have nothing to gain by installing the ChatGPT app, even if it is objectively superior. Who wants to create another account?
And that'll be the case for most Android users. As a general hint: If someone uses ChatGPT but has no idea about gpt-4o vs gpt-5 vs gpt-5.1 etc, they'll do just fine with the Gemini app.
Now the Gemini app actually sucks in so many ways (it doesn't seem to save my chats). Google will fix all these issues, but can overtake ChatGPT even if they remain an inferior product.
It's Slack vs Teams all over again. Teams won by a large margin. And Teams still sucks!
Well I have been using Gemini and ChatGPT side by side for over 6 months now.
My experience is that Gemini has significantly improved its UX and performs better on queries that require niche knowledge; think of ancient gadgets that have been out of production for 4-5 decades. Gemini can produce reliable manuals, but ChatGPT hallucinates.
UX wise ChatGPT is still superior and for common queries it is still my go to. But for hard queries, I am team Gemini and it hasn’t failed me once
I've been a paying high volume user of ChatGPT for a while. I found the transition to Gemini to be seamless. I've been pleasantly surprised. I bounce between the two. I'm at about 60% Gemini, 40% ChatGPT.
I had a similar experience, signing up for the first time to give Gemini a test drive on my side project after a long time using ChatGPT. The latter has a native macOS client which "just works" integrating with Xcode buffers. I couldn't figure out how to integrate Gemini with Xcode quickly enough so I'm resorting to pasting back & forth from the browser. A few of the exchanges I've had "felt smarter" — but, on the whole, it feels like maybe it wasn't as well trained on Swift/SwiftUI as the OpenAI model. I haven't decided one way or another yet, but those are my initial impressions.
> But importantly the ChatGPT app experience at least for iPhone/Mac users is drastically superior vs Google which feels very Google still. So Gemini would have to be drastically better answer wise than ChatGPT to lure users from a better UI/UX experience to Gemini.
Opposite is true for a larger market. Gemini is great and available with one button click on most consumer phones. OpenAI will never crack most Android users by this logic of yours
It's really hard to measure these things. Personally I switched to Gemini a few months ago since it was half the cost of ChatGPT (Verizon has a $10/month Google AI package). I feel like I've subconsciously learned to prompt it slightly differently and now using OpenAI products feels disappointing. Gemini tends to give me the answer I expect, Claude follows close behind, I get "meh" results from OpenAI.
What are your primary usecases? Are you mostly using it as a chatbot?
I find gemini excels in multimodal areas over chatgpt and anthropic. For example, "identify and classify this image with meta data" or "ocr this document and output a similar structure in markdown"
Curiously, I had the opposite experience, except for Deep Research mode where after the latest update the OpenAI offering has become genuinely amazing. This is doubly ironic because Gemini has direct API access to Google search!
It is good, but Pro subscribers get only five per month. After that, it’s a limited version, and it’s not good (normal 5.1 gives more comprehensive answers than DR Limited).
they're deep into a redesign of the gemini app, idk when it will be released or if its going to be good, but at least they agree with you and are putting significant resources into fixing it.
I couldn't even get ChatGPT to let me download code it claimed to program for me. It kept saying the files were ready but refused to let me access or download anything. It was the most basic use case and it totally bombed. I gave up on ChatGPT right then and there.
It's amazing how different people have wildly varying experiences with the same product.
It's because comparing their "ChatGPT" experience with your "ChatGPT" experience doesn't tell anyone anything. Unless people start saying what models they're using and prompts, the discussions back and forth about what platform is the best provides zero information to anyone.
Did you wait a while before downloading? The links it provides for temporary projects have a surprisingly brief window where you can download them. I've had similar experience when even waiting 1 minute to download the file.
Since LLMs are non deterministic it's not that amazing. You could ask it the same question as me and we could both get very different conversations and experiences
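One common source of that variation is temperature sampling over the next-token scores. A toy sketch, assuming made-up logits for three candidate tokens; `sample_token` is a hypothetical helper, not any vendor's API, and real deployments have other sources of variation too.

```python
import math
import random


def sample_token(logits, temperature, rng):
    """Sample an index from softmax(logits / temperature)."""
    if temperature <= 0:
        # Treat temperature 0 as greedy decoding: always pick the top logit.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.5, 0.5]

# Greedy decoding ignores the random seed; sampling does not.
greedy = [sample_token(logits, 0.0, random.Random(seed)) for seed in range(5)]
sampled = [sample_token(logits, 1.0, random.Random(seed)) for seed in range(200)]

print("greedy picks: ", greedy)
print("sampled picks:", sorted(set(sampled)))
```

With greedy decoding every run picks the same token, while at temperature 1.0 different seeds pick different tokens, and one different early token can send the rest of the conversation somewhere else.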
This is exactly my experience. And it's funny -- this crowd is so skeptical of OpenAI... so they prefer _Google_ to not be evil? It's funny how heroes and villains are being re-cast.
Yeah, hate to say but for me a big thing is i still couldn't separate my Gemini chats into folders. I had ChatGPT export some profiles and history and moved it into Gemini, and 1) when Gemini gave me answers i was more pleased but 2) Gemini was a bit more rigorous on guard rails, which seems a bit overly cautious. I was asking some pretty basic non-controversial stuff.
https://newsletter.semianalysis.com/p/tpuv7-google-takes-a-s...
"OpenAI’s leading researchers have not completed a successful full-scale pre-training run that was broadly deployed for a new frontier model since GPT-4o in May 2024, highlighting the significant technical hurdle that Google’s TPU fleet has managed to overcome."
Their own press releases confirm this. They call 5 their best new "ai system", not a new model
https://openai.com/index/introducing-gpt-5/
It certainly was much dumber than 4o on Perplexity when I tried it.
Hardly a hot take. People have theorized about the ouroboros effect for years now. But I do wonder if that’s part of the problem
But I always realize it's just smoke and mirrors - the actual quality of the code and the failure modes and stuff are just so much worse than claude and gemini.
And I write some code for my personal enjoyment, and I gave it to Claude 6-8 months back for improvement, it gave me a massive change log and it was quite risky so abandoned it.
I tried this again with Gemini last week, I was more prepared and asked it to improve class by class, and for whatever reasons I got better answers -- changed code, with explanations, and when I asked it to split the refactor in smaller steps, it did so. Was a joy working on this over the thanksgiving holidays. It could break the changes in small pieces, talk through them as I evolved concepts learned previously, took my feedback and prioritization, and also gave me nuanced explanation of the business objectives I was trying to achieve.
This is not to downplay claude, that is just the sequence of events narration. So while it may or may not work well for experienced programmers, it is such a helpful tool for people who know the domain or the concepts (or both) and struggle with details, since the tool can iron out a lot of details for you.
My goal now is to have another project for winter holidays and then think through 4-6 hour AI assisted refactors over the weekends. Do note that this is a project of personal interest so not spending weekends for the big man.
So (again) we are just sharing anecdata
Somehow it doesn't get on my nerves (unlike Gemini with "Of course").
Interested, because I’ve been getting pretty good results with different tasks using the Codex.
The problem is that the "AI"s can cough up code examples based upon proprietary codebases that you, as an individual, have no access to. That creates a significant quality differential between coders who only use publicly available search (Google, Github, etc.) vs those who use "AI" systems.
Which makes sense for something that isn’t AI but LLM.
The 25x revenue multiple wouldn't be so bad if they weren't burning so much cash on R&D and if they actually had a moat.
Google caught up quick, the Chinese are spinning up open source models left and right, and the world really just isn't ready to adopt AI everywhere yet. We're in the premature/awkward phase.
They're just too early, and the AGI is just too far away.
Doesn't look like their "advertising" idea to increase revenue is working, either.
As a shady for-profit, there is none. That's the problem with this particular fraud.
Also their models get dumber and dumber over time.
https://platform.openai.com/docs/models/compare?model=gpt-5....
I followed him on Twitter. He said some very interesting things, I thought. Then he started talking about the niche of ML/AI I work near, and he was completely wrong about it. I became enlightened.
I didn't make this connection that the training data is that old, but that would indeed augur poorly.
Now I don't know if this means that OpenAI was able to add that 3 months of data to earlier models by tuning or if it was a "from scratch" pre-training run, but it has to be a substantial difference in the models.
Pre-training: You train on a vast amount of data, as varied and high quality as possible, this will determine the distribution the model can operate with, so LLMs are usually trained on a curated dataset of the whole internet, the output of the pre-training is usually called the base model.
Post-training: You narrow down the task by training on the specific model needs you want. You can do this through several ways:
- Supervised Finetuning (SFT): Training on a strict high quality dataset of the task you want. For example if you wanted a summarization model, you'd finetune the model on high quality text->summary pairs and the model would be able to summarize much better than the base model.
- Reinforcement Learning (RL): You train a separate model that ranks outputs, then use it to rate the output of the model, then use that data to train the model.
- Direct Preference Optimization (DPO): You have pairs of good/bad generations and use them to align the model towards/away from the kinds of responses you want.
Post-training is what makes the models easy to use. The most common form is instruction tuning, which teaches the model to talk in turns, but post-training can be used for anything. E.g. if you want a translation model that always translates a certain way, or a model that knows how to use tools, etc., you'd achieve all that through post-training. Post-training is where most of the secret sauce in current models is nowadays.
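To make the DPO bullet a bit more concrete, here is a minimal sketch of the standard DPO loss for a single good/bad pair. It assumes you already have the summed log-probabilities of each response under the model being trained (the policy) and under a frozen reference copy; the function name, argument names, and numbers are illustrative only.

```python
import math


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair, given summed log-probabilities."""
    # How much more (or less) likely each response is under the policy
    # than under the frozen reference model.
    chosen_gain = policy_chosen - ref_chosen
    rejected_gain = policy_rejected - ref_rejected
    margin = beta * (chosen_gain - rejected_gain)
    # -log(sigmoid(margin)) written as softplus(-margin), computed stably.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Policy identical to the reference: no preference learned yet, loss = log(2).
untrained = dpo_loss(-10.0, -12.0, -10.0, -12.0)

# Policy has shifted probability toward the good response: lower loss.
improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)

print(untrained, improved)
```

The loss goes down as the policy raises the good response and lowers the bad one relative to the reference, which is the "towards/away" alignment described above, done without a separate reward model.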
https://youtu.be/3K-R4yVjJfU?si=JdVyYOlxUbEcvEEo&t=2624
> Q: Are the releases aligned with pre-training efforts?
> A: There used to be a time not that long ago, maybe half a year, distant past, where the models would align with RL runs or pretraining runs ... now the naming is by capability. GPT5 is a capable model; 5.1 is a more capable model
Helps you get useful answers like "I don't know that's too recent" when you ask questions like "who won the basketball game last night".
====
when have you been built?
I’m a GPT‑4‑based model that OpenAI released on March 14 2023. The underlying training data goes up to early 2023, and the model was trained in the months leading up to that release. If you’re asking about the ChatGPT product that ships the model to users, it went live in November 2022 and has since received updates (GPT‑3.5, GPT‑4, etc.) that keep it current.
====
But when supposedly running it from https://huggingface.co/chat/models/openai/gpt-oss-20b:
====
when have you been built?
I’m a language model created by OpenAI. The current generation (GPT‑4) that powers this chat was first released in March 2023 and has been updated and fine‑tuned up through the end of 2024. My training data runs up to the beginning of June 2025, so I’m built on knowledge available up to that point.
====
And that makes me think that although https://huggingface.co/chat claims to be using the models available to the public at https://huggingface.co , that doesn't seem to be true, and I raised this question here: https://huggingface.co/ggml-org/gpt-oss-20b-GGUF/discussions... , https://github.com/huggingface/inference-playground/issues/1... and https://github.com/ggml-org/llama.cpp/discussions/15396#disc... .
On one side it's up against large competitors with an already established user base and product line that can simply bundle their AI offerings into those products. Google will do just what Microsoft did with Internet Explorer and bundle Gemini in for 'Free' with their other already profitable products and established ad-funded revenue streams.
At the same time, Deepseek/Qwen, etc. are open sourcing stuff to undercut them on the other side. It's a classic squeeze on their already fairly dubious business model.
OpenAI will top $20 billion in ARR this year, which certainly seems like significant revenue generation. [1]
[1] https://www.cnbc.com/2025/11/06/sam-altman-says-openai-will-...
fixed this for you
But now they've had an order of magnitude revenue growth. That can't still be consumer subscriptions, right? They've had to have saturated that?
I haven't seen reports of the revenue breakdown, but I imagine it must be enterprise sales.
If it's enterprise sales, I'd imagine that was sold to F500 companies in bulk during peak AI hype. Most of those integrations are probably of the "the CEO has tasked us with `implementing an AI strategy`" kind. If so, I can't imagine they will survive in the face of a recession or economic downturn. To be frank, most of those projects probably won't pan out even under the rosiest of economic pictures.
We just don't know how to apply AI to most enterprise automation tasks yet. We have a long way to go.
I'd be very curious to see what their revenue spread looks like today, because that will be indicative of future growth and the health of the company.
OpenAI is hemorrhaging cash at an astronomical rate.
Mozilla is a non-profit that is only sustained by the generous wealthy benefactor (Google) to give the illusion that there is competition in the browser market.
OpenAI is a non-profit funded by a generous wealthy benefactor (Microsoft).
Ideas of IPO and profitability are all just pipe dreams in Altman's imagination.
“will do”? Is there any Google product they haven't done that with already?
OpenAI should be looking at how Google built a moat around search. Anyone can write a Web crawler. Lots of people have. But no one else has turned search into the money printing machine that Google has. And they've used that to fund their search advantage.
I've long thought the moat-buster here will be China because they simply won't want the US to own this future. It's a national security issue. I see things like DeepSeek as moat-busting activity and I expect that to intensify.
Currently China can't buy the latest NVidia chips or ASML lithography equipment. Why? Because the US said so. I don't expect China to tolerate this long term, and of any country, China has demonstrated the long-term commitment to this kind of project.
"More access to Gemini 3 Pro, our most capable model More access to Deep Research in the Gemini app Video generation with limited access to Veo 3.1 Fast in the Gemini app More access to image generation with Nano Banana Pro Additional AI credits for video generation in Flow and Whisk Access Gemini directly in Google apps like Gmail and Docs" [Thanks but no thanks]
Feel like the end result would always be that while Google is slow to adjust, once they're in the race they're in it.
On top of that the Chinese seem to be hellbent to destroy any possible moat the US companies might create by flooding the market with SOTA open-source models.
Although this tech might be good for software companies in general - it does reduce the main cost they have which is personnel. But in the long run Google will need to reinvent itself or die.
Same thing happened with Internet Explorer and Chrome, or going from Yahoo mail/Hotmail to Gmail.
Just some numbers to show what OpenAI is against:
So on one side you've got this behemoth with, compared to OpenAI's size, unlimited funding. The $25 bn per year OpenAI is after is basically a parking ticket for Google (only slightly exaggerating). A behemoth that came out with Gemini 3 Pro "thinking" and Nano Banana (that name though), both of which are SOTA. And on the other side you've got the open-source weights you mentioned.
When OpenAI had its big moment, HN was full of comments about how it was game over for Google and search was done for. Three years later, the best (arguably the best) model gives the best answer when you search... using Google search.
Funny how these things turn out.
Google is atm the 3rd biggest cap in the world: only Apple and NVidia are slightly ahead. If Google is serious about its AI chips (and it looks like they are) and see the fuck-ups over fuck-ups by Apple, I wouldn't be surprised at all if Alphabet was to regain the number one spot.
That's the company OpenAI is fighting: a company that's already been the biggest cap in the entire world and that's probably going to regain that spot rather sooner than later and that happens to have crushed every single AI benchmark when Gemini 3 Pro came out.
I had a ChatGPT subscription. Now I'm using Gemini 3 Pro.
And great points on the Google history.. let's not forget they wrote the original Transformers paper after all
OpenAI has annualized revenue of $20bn. That's not Google, but it's not insignificant.
OpenAI has this amazing technology and a great app, but the company feels like some sort of financial engineering nightmare.
Given that we’re likely at peak AI hype at the moment they’re not well positioned at all to survive the coming “trough of disillusionment” that happens like clockwork on every hype cycle. Google, by comparison, is very well positioned to weather a coming storm.
In a year, when the economy might be in worse shape, they'll ask their team if the AI thing is working out.
What do you think happens to all the enterprise OpenAI contracts at that point? (Especially if the same tech layperson CEOs keep reading Forbes and hearing Scott Galloway dump on OpenAI and call the AI thing a "bubble"?)
Interestingly enough, apart from Google, I've never seen an organization take the actual proper steps (fire mid-management and PMs) to prevent the same thing from happening again. Will be interesting to see how OAI handles this.
Firing PMs and mid-management would not prevent any of the code reds you may have read about from Google or OAI lately. This is a very naive perspective on how decision making is done at the scale of those two companies. I'm sorry you had bad experiences working with people in those positions and I hope you get the opportunity to collab with great ones in the future.
In theory, some engineers think they are perfectly capable of doing all the PMs work and all their own.
If they’ve never worked with a truly good PM, that’s a shame, they’d likely get more work done in a more timely fashion. I’ve worked with around 10 different PMs, the best kept stuff on track and aided with collaboration, reqs management, soft skills, handling tough customers, etc. they free up devs to do more dev work and less other work.
Only one time in my entire career have I seen this done, and it was as successful as you imagine it to be. Lots of weird problems came out of having done it, but those are being treated as "Wow we are so glad we know about this problem" rather than "I hope those idiots come back to keep pulling the wool over my eyes".
Why is the bar so low for the billionaire magnate fuck ups? Might as well implement workplace democracy and be done with it, it can't be any worse for the company and at least the workers understand what needs to be done.
But somehow, even in startups with short remaining runway, "code red" rarely means anything.
You still have to attend all the overhead meetings, run through approval circles, deal with HR etc etc.
And Microsoft gets the models for free (?)
Absent a major breakthrough all the major providers are just going to keep leapfrogging each other in the most expensive race to the bottom of all time.
Good for tech, but a horrible business and financial picture for these companies.
They’re absolutely going to get bailed out and socialize the losses somehow. They might just get a huge government contract instead of an explicit bailout, but they’ll weasel out of this one way or another and these huge circular deals are to ensure that.
I've had that uneasy feeling for a while now. Just look at Jensen and Nvidia -- they're trying to get their hooks into every major critical sector as they're able to (Nokia last month, Synopsys just recently). When chickens come home to roost, my guess is that they'll pull out the "we're too big to fail, so bailout pls" card.
Crazy times. If only we had regulators with more spine.
https://www.whitehouse.gov/presidential-actions/2025/08/demo...
Many retirement accounts/managers may already be channeling investment such that 401k accounts are broadly set up to absorb any losses… Could also just be this large piece of tin foil on my head.
I was an OpenAI fan from GPT 3 to 4, but then Claude pulled ahead. Now Gemini is great as well, especially at analyzing long documents or entire codebases. I use a combination of all three (OpenAI, Anthropic & Google) with absolutely zero loyalty.
I think the AGI true believers see it as a winner-takes-all market as soon as someone hits the magical AGI threshold, but I'm not convinced. It sounds like the nuclear lobby's claims that they would make electricity "too cheap to meter."
Investors in AI just don't realize AI is a commodity. The AI companies' lies aren't helping (we will not reach AGI in our lifetimes). The bubble will burst if investors figure this out before they successfully pivot (and they're trying damn hard to pivot).
There's a lot more than money at stake.
Long term, yes. But Wall Street does not think long term. Short or medium term, you just need to cash out to the next sucker in line before the bubble pops, and there are fortunes to be made!
Yes, companies like Google can catch up and overtake them, but a moat is merely making it hard and expensive.
99.999...% of companies can't dream of competing with OpenAI.
That’s not a bubble at all is it?
Genuine question: How is it possible for OpenAI to NOT successfully pre-train a model?
I understand it's very difficult, but they've already successfully done this and they have a ton of incredibly skilled, knowledgeable, well-paid employees.
I get that there's some randomness involved but it seems like they should be able to (at a minimum) just re-run the pre-training from 2024, yes?
Maybe the process is more ad-hoc (and less reproducible?) than I'm assuming? Is the newer data causing problems for the process that worked in 2024?
Any thoughts or ideas are appreciated, and apologies again if this was asked already!
The same way everyone else fails at it.
Change some hyperparameters to match the new hardware (more params), maybe implement the latest improvements from papers after they were validated in a smaller model run. Start training the big boy, loss looks good, 2 months and millions of dollars later loss plateaus, do the whole SFT/RL shebang, run benchmarks.
It's not much better than the previous model, very tiny improvements, oops.
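To make the "loss plateaus" step concrete, here is a minimal sketch of how a plateau check on a logged loss curve could look. This is purely illustrative: the function name `has_plateaued`, the window size, and the 1% threshold are all assumptions of mine, not anything OpenAI or any other lab has described.

```python
def has_plateaued(losses, window=3, min_rel_improvement=0.01):
    """Hypothetical plateau check (illustrative only).

    Compares the mean loss over the most recent `window` evaluations with
    the mean over the `window` evaluations before that, and reports a
    plateau when the relative improvement falls below the threshold.
    """
    if len(losses) < 2 * window:
        return False  # not enough history to compare two windows
    previous = sum(losses[-2 * window:-window]) / window
    recent = sum(losses[-window:]) / window
    relative_improvement = (previous - recent) / previous
    return relative_improvement < min_rel_improvement


# Made-up numbers: early in the run the loss is still dropping quickly...
early_run = [4.0, 3.5, 3.1, 2.8, 2.6, 2.5]
# ...while months later it barely moves between evaluations.
late_run = [2.50, 2.49, 2.49, 2.49, 2.48, 2.48]

print(has_plateaued(early_run))
print(has_plateaued(late_run))
```

The painful part in the scenario above is that a check like this only tells you the curve has flattened; it can't tell you whether the flattened model will be noticeably better than the last one until after post-training and benchmarks.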
I can totally see how they're able to pre-train models no problem, but are having trouble with the "noticeably better" part.
Thanks!
A company's ML researchers are constantly improving model architecture. When it's time to train the next model, the "best" architecture is totally different from the last one. So you have to train from scratch (mostly... you can keep some small stuff like the embeddings).
The implication here is that they screwed up bigly on the model architecture, and the end result was significantly worse than the mid-2024 model, so they didn't deploy it.
I guess "Start software Vnext off the current version (or something pretty close)" is such a baseline assumption of mine that it didn't occur to me that they'd be basically starting over each time.
Thanks for posting this!
There's maybe like a few hundred people in the industry who can truly do original work on fundamentally improving a bleeding-edge LLM like ChatGPT, and a whole bunch of people who can do work on ads and shopping. One doesn't seem to get in the way of the other.
Currently they are not #1 in any of the categories on LLM arena, and even on user numbers where they have dominated, Google is catching up, 650m monthly for Gemini, 800m for ChatGPT.
Also Google/Hassabis don't show much sign of slacking off (https://youtu.be/rq-2i1blAlU?t=860)
Funnily enough Google had a "Chat Bot Is a ‘Code Red’ for Google’s Search Business" thing back in 2022 but seem to have got it together https://www.nytimes.com/2022/12/21/technology/ai-chatgpt-goo...
I'd rather a product that exists with ads, over one that's disappeared.
The fact is, personal subscriptions don't cover the bills if you're going to keep a free tier. Ads do. I don't like it any more than you do, but I'm a realist about it.
My guess is that it's smaller than that. Only a few people in the world are capable of pushing into the unknown and breaking new ground and discoveries
The risk is straightforward: if OpenAI falls behind or can’t generate enough revenue to support these commitments, it would struggle to honor its long-term agreements. That failure would cascade. Oracle, for example, could be left with massive liabilities and no matching revenue stream, putting pressure on its ability to service the debt it already issued.
Given the scale and systemic importance of these projects — touching energy grids, semiconductor supply chains, and national competitiveness — it’s not hard to imagine a future where government intervention becomes necessary. Even though Altman insists he won’t seek a bailout, the incentives may shift if the alternative is a multi-company failure with national-security implications.
No matter what Sam Altman's future plans are, the success of those future plans is entirely dependent on him communicating now that there is a 0% chance those future plans will include a bailout.
1. Government will "partner" (read: foot the bill) for these super-strategic datacenters and investments promised by OpenAI.
2. The investments are not actually sound and fail, but it's the taxpayer that suffers.
3. Mr. Altman rides off into the sunset.
Sounds like a golden opportunity for GOOG to step over the corpse of OpenAI and take over for cents on the dollar all of the promises the now defunct ex-leader of AI made.
Skepticism is easy.
No, there's a lot of noise about this but these are just 'statements of intent'.
Oracle very intimately understands OpenAI's ability to pay.
They're not banking $50B in chips and then waking up naively one morning to find out OpenAI has no funding.
What will 'cascade' is maybe some sentiment, or analysts' expectations etc.
Some of it, yes, will be a problem - but at this point, the data centre buildout is not an OpenAI-driven bet - it's a horizontal bet across tech.
There's not that much risk in OpenAI not raising enough to expand as much as it wants.
Frankly - a CAPEX slowdown will hit US GDP growth and freak people out more than anything.
The cost of these data centers and ongoing inference is mostly the outrageous cost of GPUs, no?
I don't understand why the entire industry isn't looking to diversify the GPU constraint so that the hardware makers drop prices.
Why no industry initiative to break NVIDIA's stranglehold and next TSMC's?
Or are GPUs a small line item in the outrageous spend companies like OpenAI are committing to?
If they aren't developing in parallel an alternative architecture that can reach AGI, then when one or more companies develop such a new model, OpenAI are toast and all those juicy contracts are kaput.
Anthropic pulled something similar with 3.6 initially, with a preview that had massive token output and then a real release with barely half -- which significantly curtails certain use cases.
That said, to date, Gemini has outperformed GPT-5 and GPT-5.1 on any task I've thrown at them together. Too bad Gemini CLI is still barely useful and prone to the same infinite loop issues that have plagued it for over a year.
I think Google has genuinely released a preview of a model that leapfrogs all other models. I want to see if that is what actually makes it to production before I change anything major in my workflows.
it's contemporary vim vs emacs at this point
When I asked both ChatGPT 5.1 Extended Thinking and Gemini 3 Pro Preview High for best daily casual socks both responses were okay and had a lot of the same options, but while the ChatGPT response included pictures, specs scraped from the product pages and working links, the Gemini response had no links. After asking for links, Gemini gave me ONLY dead links.
That is a recurring experience, Gemini seems to be supremely lazy to its own detriment quite often.
A minute ago I asked for best CR2032 deal for Aqara sensors in Norway, and Gemini recommended the long discontinued IKEA option, because it didn't bother to check for updated information. ChatGPT on the other hand actually checked prices and stock status for all the options it gave me.
At least, thanks to the hype, RAM and SSDs are becoming more expensive, which eats up all the savings from using AI and the profits from increased productivity /s?
Yes, the ChatGPT experience is much better. No, Gemini doesn't need to make a better product to take market share.
I've never had the ChatGPT app. But my Android phone has the Gemini app. For free, I can do a lot with it. Granted, on my PC I do a lot more with all the models via paid API access - but on the phone the Gemini app is fine enough. I have nothing to gain by installing the ChatGPT app, even if it is objectively superior. Who wants to create another account?
And that'll be the case for most Android users. As a general hint: If someone uses ChatGPT but has no idea about gpt-4o vs gpt-5 vs gpt-5.1 etc, they'll do just fine with the Gemini app.
Now the Gemini app actually sucks in so many ways (it doesn't seem to save my chats). Google will fix all these issues, but can overtake ChatGPT even if they remain an inferior product.
It's Slack vs Teams all over again. Teams won by a large margin. And Teams still sucks!
My experience is that Gemini has significantly improved its UX and performs better on queries that require niche knowledge, think of some ancient gadgets that have been out of production for 4-5 decades. Gemini can produce reliable manuals, but ChatGPT hallucinates.
UX-wise ChatGPT is still superior, and for common queries it is still my go-to. But for hard queries, I am team Gemini and it hasn’t failed me once.
Opposite is true for a larger market. Gemini is great and available with one button click on most consumer phones. OpenAI will never crack most Android users by this logic of yours
https://one.google.com/about/#compare-plans
or cheaper/free
I am using Gemini 3 Pro, I rarely use Flash.
I find Gemini excels in multimodal areas over ChatGPT and Anthropic. For example, "identify and classify this image with metadata" or "OCR this document and output a similar structure in markdown".
It's amazing how different people have wildly varying experiences with the same product.
https://www.androidauthority.com/google-gemini-projects-2-36...
like it seems great, but then it's just bullshitting about what it can do or whatever