If training and inference just got 40x more efficient, but OpenAI and co. still have the same compute resources, once they’ve baked in all the DeepSeek improvements, we’re about to find out very quickly whether 40x the compute delivers 40x the performance / output quality, or if output quality has ceased to be compute-bound.
> If training and inference just got 40x more efficient
Did training and inference just get 40x more efficient, or just training? They trained a model with impressive outputs on a limited number of GPUs, but DeepSeek is still a big model that requires a lot of resources to run. Moreover, which costs more, training a model once or using it for inference across a hundred million people multiple times a day for a year? It was always the second one, and doing the training cheaper makes it even more so.
But this implies that we could use those same resources to train even bigger models, right? Except that you then have the same problem. You have a bigger model, maybe it's better, but if you've made inference cost linearly more because of the size and the size is now 40x bigger, you now need that much more compute for inference.
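I can run the largest model at 4 tokens per second on a 64GB card. Smaller models are _faster_ than Phi-4.
I've just switched to it for my local inference.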
Actually inference got more efficient as well, thanks to the multi-head latent attention algorithm that compresses the key-value cache to drastically reduce memory usage.
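A rough back-of-envelope of why compressing the key-value cache matters; the layer, head and latent sizes below are illustrative assumptions, not DeepSeek's exact configuration:

    # Per-token KV-cache memory: standard multi-head attention vs. an MLA-style
    # compressed latent cache. All sizes are illustrative assumptions.

    BYTES_PER_VALUE = 2   # fp16/bf16
    N_LAYERS = 60
    N_HEADS = 64
    HEAD_DIM = 128
    LATENT_DIM = 512      # assumed size of the compressed KV latent per token

    def mha_kv_bytes_per_token():
        # Standard attention caches a full key and value vector per head, per layer.
        return N_LAYERS * N_HEADS * HEAD_DIM * 2 * BYTES_PER_VALUE

    def mla_kv_bytes_per_token():
        # MLA caches one low-rank latent per token per layer and reconstructs
        # keys and values from it at attention time.
        return N_LAYERS * LATENT_DIM * BYTES_PER_VALUE

    print(f"standard KV cache: {mha_kv_bytes_per_token()/1e6:.2f} MB per token")  # ~2 MB
    print(f"MLA-style cache:   {mla_kv_bytes_per_token()/1e6:.2f} MB per token")  # ~0.06 MB

With these assumed sizes the cache shrinks by roughly 30x, which is the difference between a long context fitting on one GPU or not.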
In the long run (which in the AI world is probably ~1 year) this is very good for Nvidia, very good for the hyperscalers, and very good for anyone building AI applications.
The only thing it's not good for is the idea that OpenAI and/or Anthropic will eventually become profitable companies with market caps that exceed Apple's by orders of magnitude. Oh no, anyway.
Yes! I have had the exact same mental model. The biggest losers in this news are the groups building frontier models. They are the ones with huge valuations, but if the optimizations are even close to true, it's a massive threat to their business model. My feet are on the ground, but I do still believe that the world does not comprehend how much compute it can use... as compute gets cheaper we will use more of it. Ignoring equity pricing, this benefits all other parties.
Can you guys explain why this would be bad for the OpenAIs and Anthropics of the world?
Wasn't the story always outlined as: we build better and better models, then we eventually get to AGI; AGI works on building better and better models even faster, and we eventually get to super AGI, which can work on building better and better models even faster...
Isn't "super-optimization"(in the widest sense) what we expect to happen in the long run?
Yes, but I think most of the rout is caused by the fact that there really isn't anything protecting AI from being disrupted by a new player - these are fairly simple technologies compared to some of the other things tech companies build. That means OpenAI really doesn't have much ability to protect its market-leader status.
I don't really understand why the stock market has decided this affects nvidia's stock price though.
This article has good background, context, and explanations [1]. They skipped CUDA and instead used PTX, which is a lower-level instruction set, where they were able to implement more performant cross-chip comms to make up for the less-performant H800 chips.
[1]: https://stratechery.com/2025/deepseek-faq/
>If training and inference just got 40x more efficient
The jury is still out on how much improvement DeepSeek made in terms of training and inference compute efficiency, but personally I think 10x is probably the actual improvement that's been made.
But in business/engineering/manufacturing/etc., if you have 10x more efficiency, you're basically going to obliterate the competition.
>output quality has ceased to be compute-bound
You raised an interesting conjecture and it seems that it's very likely the case.
I know that it's not even a full two years since ChatGPT-4 was released, but it seems to be taking OpenAI a very long time to release ChatGPT-5. Is it because they're taking their own sweet time to release the software, not unlike GIMP, or because they genuinely cannot justify the improvement needed to jump from 4 to 5? This stagnation, however, has allowed others to catch up. Now, based on DeepSeek's claims, anyone can have their own ChatGPT-4 under their desk with Nvidia Project Digits mini PCs [1]. For running DeepSeek, 4 mini PC units will be more than enough at 4 PFLOPS and cost only USD 12K. Let's say on average one subscriber pays OpenAI USD 10 per month; for a 1,000-person organization that's USD 10K per month, so the investment pays for itself in little over a month, and no data ever leaves the organization since it's a private cloud!
For training a similar system to ChatGPT-4, based on DeepSeek's claims, a few million USD is more than enough. Apparently, OpenAI, SoftBank and Oracle just announced a USD 500 billion joint venture to push AI forward with the newly announced Stargate project, but that's 10,000x the money [2],[3]. The elephant-in-the-room question is: can they even get a 10x quality improvement over the existing ChatGPT-4? I seriously doubt it.
[1] NVIDIA Puts Grace Blackwell on Every Desk and at Every AI Developer’s Fingertips:
https://nvidianews.nvidia.com/news/nvidia-puts-grace-blackwe...
[2] Trump unveils $500bn Stargate AI project between OpenAI, Oracle and SoftBank:
https://www.theguardian.com/us-news/2025/jan/21/trump-ai-joi...
[3] Announcing The Stargate Project:
https://openai.com/index/announcing-the-stargate-project/
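They claim it's able to run models with 200B parameters on a single node and 400B when paired with another node.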
NVIDIA sells shovels to the gold rush. One miner (Liang Wenfeng), who has previously purchased at least 10,000 A100 shovels... has a "side project" where they figured out how to dig really well with a shovel and shared their secrets.
The gold rush, whether real or a bubble, is still there! NVIDIA will still sell every shovel they can manufacture, as soon as it is available in inventory.
Fortune 100 companies will still want the biggest toolshed to invent the next paradigm or to be the first to get to AGI.
Jevons paradox would imply that there's good reason to think that demand for shovels will increase. AI doesn't seem to be one of those things where society as a whole will say, "we have enough of that; we don't need any more".
(Many individual people are already saying that, but they aren't the people buying the GPUs for this in the first place. Steam engines weren't universally popular either when they were introduced to society.)
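For what it's worth, the Jevons argument turns on how price-elastic demand for "AI work" is. A toy constant-elasticity sketch, with purely illustrative numbers (not real market data):

    # Toy model of the Jevons argument: a 10x efficiency gain cuts the effective
    # price of a unit of "AI work" by 10x. Does total GPU demand rise or fall?
    # The elasticity values below are illustrative assumptions.

    def gpu_hours_demanded(efficiency_gain: float, elasticity: float) -> float:
        """Relative GPU-hours consumed after the efficiency gain (before = 1.0)."""
        work_demanded = efficiency_gain ** elasticity   # cheaper work -> more work bought
        return work_demanded / efficiency_gain          # ...but each unit needs fewer GPU-hours

    for elasticity in (0.5, 1.0, 1.5):
        print(f"elasticity {elasticity}: GPU demand x{gpu_hours_demanded(10.0, elasticity):.2f}")
    # elasticity 0.5 -> x0.32 (GPU demand falls)
    # elasticity 1.0 -> x1.00 (unchanged)
    # elasticity 1.5 -> x3.16 (Jevons: demand rises)

Whether the paradox applies here is exactly the question of which side of 1.0 that elasticity sits on.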
I also don't get how this is bearish for NVDA. Before this, small to mid-size companies would give up on fine-tuning their own model because OpenAI was just so much better and cheaper. Now DeepSeek's SOTA model gives them a much better baseline model to build on. Wouldn't more people want to RAG on top of DeepSeek? Or some startup's accountant would run the numbers and figure we can just inference the shit out of DeepSeek locally, and in the long run we still come out ahead of using the OpenAI API.
Either way that means a lot more NVDA hardware being sold. You still need CUDA, as ROCm is still not there yet. In fact NVDA needs to churn out more CUDA GPUs than ever.
Jevons was talking about coal as an input to commercial processes, for which there were other alternatives that competed on price (e.g. manual/animal labour). Whatever the process, it generated a return, it had utility, and it had scale.
I argue it doesn't apply to generative AI because its outputs are mostly no good, have no utility, or are good but only in limited commercial contexts.
In the first case, a machine that produces garbage faster and cheaper doesn't mean demand for the garbage will increase. And in the second case, there aren't enough buyers for high-quality computer-generated pictures of toilets to meaningfully boost demand for Nvidia's products.
The other thing is that if this pushes the envelope further on what AI models can do given a certain hardware budget, this might actually change minds. The pushback against generative AI today is that much of it is deployed in ways that are ultimately useless and annoying at best, and that in turn is because the capabilities of those models are vastly oversold (including internally in companies that ship products with them). But if a smarter model can actually e.g. reliably organize my email, that's a very different story.
When you get more marginal product from an input, it's expected you buy more of that input.
But at some point, if the marginal product gets high enough, the world needs not as many, because money spent on other inputs/factors pays off more.
This is a classic problem with extrapolation. Making people more efficient through the use of AI will tend to increase employment... until it doesn't and employment goes off a cliff. Getting more work done per unit of GPU will increase demand for GPUs ... until it doesn't, and GPU demand goes off the cliff.
It's always hard to tell where that cliff is, though.
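If every well-funded start-up can have a shot, then they buy more GPUs and the big players will need to buy even more to stay noticeably ahead.
Really? Has anyone made a useful, commercially successful product with it yet?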
What you are missing is that it turns out the gold isn’t actually gold. It’s bronze.
So earlier, the shovelers were willing to spend thousands of dollars for a single shovel because they were expecting to get much more valuable gold out the other end.
But now that it’s only bronze, they can’t spend that much money on their tools anymore to make their venture profitable. A lot of shovelers are gonna drop out of the race. And the ones that remain will not be willing to spend as much.
The fact that there isn’t that much money to be made in AI anymore means that whatever percentage of money would have gone to NVIDIA from the total money to be made in AI will now shrink dramatically.
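The gold is still gold. We just thought it was 10,000 feet down, which requires lots of shovels, when it was actually just under the surface.
Nvidia's market cap is based on extreme margins and absurd growth for 10 years. If either of those knobs gets turned down a little, there can be a MASSIVE hit to the valuation - which is what happened.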
The gold rush is over because pre-trained models don't improve as much anymore. The application layer has massive gains in cost-to-value performance. We also gain more trust from the consumer as models don't hallucinate as much. This is what DeepSeek R1 has shown us. As Ilya Sutskever said, pre-training is now over.
We now have very expensive Nvidia shovels that use a lot of power but do very little improvement to the models.
The thing with a gold rush is you often end up selling shovels after the gold has run out, but no one knows that until hindsight. There will probably be a couple of scares that the gold has run out first, too. And again, the difference is only visible in hindsight.
Can anyone comment on why Wenfeng shared his secret sauce? Other than publicity, there seem to be only downsides for him, as now everyone else with larger compute will just copy and improve?
Well, American investors seem to be shaking in their boots, and publicizing this attracts AI investment in China because it shows China can outcompete the US in spite of the restrictions.
Yeah but NVIDIA's amazing digging technique that could only be accomplished with NVIDIA shovels is now irrelevant. Meaning there are more people selling shovels for the gold rush
DeepSeek's stuff is actually more dependent on nVidia shovels. They implemented a bunch of assembly-level optimizations below the CUDA stack that allowed them to efficiently use the H800s they have, which are memory-bandwidth-gimped vs. the H100s they can't easily buy on the open market. That's cool, but doesn't run on any other GPUs.
Cue all of China rushing to Jensen to buy all the H800s they can before the embargo gets tightened, now that their peers have demonstrated that they're useful for something.
At least briefly, Jensen's customer audience increased.
90% of the comments in this thread make it clear that knowing about technology does not in any way qualify someone to think correctly about markets and equity valuations.
With the CrowdStrike outage last year it was incredible how many hidden security and kernel "experts" came crawling out of the woodwork, questioning why anything needs to run in the kernel and predicting the company's demise.
The crash is absolutely rational; the cascading effect highlights the missing moat for companies like OpenAI. Without a moat, no investor will provide these companies with the billions that fueled most of the demand. This demand was essential for NVIDIA to squeeze such companies with incredible profit margins.
NVIDIA was overvalued before, and this correction is entirely justified. The larger impact of DeepSeek is more challenging to grasp. While companies like Google and Meta could benefit in the long term from this development, they still overpaid for an excessive number of GPUs. The rise in their stock prices was assumed to be driven by the moat they were expected to develop themselves.
I was always skeptical of those valuations. LLM inference was highly likely to become commoditized in the future anyway.
It has been clear for a while that one of two things is true.
1) AI stuff isn't really worth trillions, in which case Nvidia is overvalued.
2) AI stuff is really worth trillions, in which case there will be no moat, because you can cross any moat for that amount of money, e.g. you could recreate CUDA from scratch for far less than a trillion dollars and in fact Nvidia didn't spend anywhere near that much to create it to begin with. Someone else, or many someones, will spend the money to cross the moat and get their share.
So Nvidia is overvalued on the fundamentals. But is it overvalued on the hype cycle? Lots of people riding the bubble because number goes up until it doesn't, and you can lose money (opportunity cost) by selling too early just like you can lose money by selling too late.
Then events like this make some people skittish that they're going to sell too late, and number doesn't go up that day.
The crash of NVIDIA is not about the moat of OpenAI, but about the fact that DeepSeek was able to cut training costs from billions to millions (and with even better performance). This means cheaper training, but it also proves that OpenAI was not at the cutting edge of what was possible in training algorithms, and that there are still huge gaps and disruptions possible in this area. So there is a lot less need to scale by pumping in more and more GPUs, and more reason to invest in research that can cut down the cost. More gaps mean more opportunity to cut costs and less of a need to buy GPUs to scale in terms of model quality.
For NVIDIA that means that all the GPUs of today are good enough for a long time and people will invest a lot less in them and a lot more in research like this to cut costs. (But I am sure they will be fine)
This is partially why Apple is the one that stands to gain more, and it showed.
Their "small models, on device" approach can only be perfected with something like DeepSeek, and they're not exposed to NVIDIA pricing, nor do they have to prove to investors that their approach is still valid.
I don't understand why this is not obvious to many people: tech and stock trading are two totally different things, so why on earth would a tech expert be expected to know trading at all? Imagine how ridiculous it would be if a computer science graduate also automatically got a finance degree from college even though they'd never taken a finance class.
The people developing statistical models that exercise the financial market at scale are the quants. They don't come from a finance-degree background.
I've noticed this phenomenon among the IT & tech VC crowd. They will launch podcasts and offer expert opinions and whatnot on just about every topic under the Sun, from cold fusion to COVID vaccines to the Ukraine war.
You wouldn’t see this in other folks, for example, a successful medical surgeon won’t offer much assertion about NVIDIA.
And the general tendency among audience is to assume that expertise can be carried across domains.
It's because software devs are smart and make a lot of money - a natural next step is to try and use their smarts to do something with that money. Hence stocks.
Tech people are allowed to quickly learn a domain enough to build the software that powers it, bringing in insights from other domains they've been across.
Just don't allow them to then comment on that domain with any degree of insight.
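You say DeepSeek should decrease Nvidia demand. Wall Street agreed today.
I say DeepSeek should increase Nvidia's demand due to Jevon's Paradox.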
No, nvidia's demand and importance might reduce in the long term.
We are forgetting that China has a whole hardware ecosystem. Now we learn that building SOTA models does not need SOTA hardware in massive quantities from Nvidia. So the crash in the market implicitly could mean that the (hardware) monopoly of American companies is not going to last more than a few years. The hardware moat is not as deep as the West thought.
Once China brings scale like it did to batteries, EVs, solar, infrastructure, drones (etc) they will be able to run and train their models on their own hardware. Probably some time away but less time than what Wall Street thought.
This is actually more about Nvidia than about OpenAI. OpenAI owns the end interface and it will be generally safe (maybe at a smaller valuation). In the long term Nvidia is more replaceable than you think it is. Inference is going to dominate the market -- it's going to be Cerebras, Groq, AMD, Intel, Nvidia, Google TPUs, Chinese TPUs, etc.
On the training side, there will be less demand for Nvidia GPUs as Meta, Google, Microsoft etc. extract efficiencies with the GPUs they already have, given the embarrassing success of DeepSeek. Now, China might have been another insatiable market for Nvidia, but the export controls have ensured that it won't be.
The increase in efficiency is usually accompanied by commoditization as stuff gets cheaper to develop, which is very bad news for Nvidia.
If you don't need the super-high-end chips, then Nvidia loses its biggest moat and its ability to monopolize the tech; CUDA isn't enough.
> I say DeepSeek should increase Nvidia’s demand due to Jevon’s Paradox.
If their claims were true, DeepSeek would increase the demand for GPUs. It's so obvious that I don't know why we even need a name to describe this scenario (I guess Jevons paradox just sounds cool).
The only issue is whether it would make a competitor to Nvidia viable. My bet is no, but the market seems to have bet yes.
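I've now seen this referenced two dozen times today, which is well up from the 0 times I've seen it over the past year.
Is there some recent article referencing it that everyone is regurgitating?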
> DeepSeek should increase Nvidia’s demand due to Jevon’s Paradox.
How exactly? From what I've read, the full model can run on MacBook M1 sort of hardware just fine. And this is their first release; I'd expect it to get more efficient, and maybe domain-specific models can be run on much lower-grade hardware, Raspberry Pi sort of thing.
I agree, but in the short/medium term I think it will slow down, because companies will now prefer to invest in research to optimize (training) costs rather than in those very expensive GPUs. Only when the scientific community reaches the edge of what is possible in terms of optimization will it go back to pumping GPUs like today. (Although small actors will continue to buy GPUs since they do not have the best talent to compete.)
The other way is certainly also true. Your short piece is rational, but lacks insight into the inference and training dynamics of unconstrained ML adoption.
The rate of ML progress is spectacularly compute-constrained today. Every step in today's scaling program is set up to de-risk the next scale-up, because the opportunity cost of compute is so high. If the opportunity cost of compute were not so high, you could skip the 1B to 8B scale-ups and grid-search data mixes and hyperparameters.
The market/concentration risk premium drove most of the volatility today. If it was truly value driven, then this should have happened 6 months ago when DeepSeek released V2 that had the vast majority of cost optimizations.
Cloud data center CapEx is backstopped by their growth outlook driven by the technology, not by GPU manufacturers. Dollars will shift just as quickly (like how Meta literally tore down a half-built data center in 2023 and restarted it to meet new designs).
I think it's entirely possible that one categorically can't think correctly about markets and equity valuations since they are vibes-based. Post hoc, sure, but not ahead of time.
Most people don't care about the fundamentals of equity valuations is the crux of it. If they can make money via derivatives, who cares about the underlying valuations? I mean just look at GME for one example, it's been mostly a squeeze driven play between speculators. And then you have the massive dispersion trade that's been happening on the SP500 over the last year+. And when most people invest in index funds, and index funds are weighted mostly by market cap, value investing has been essentially dead for a while now.
"Briefly stated, the Gell-Mann Amnesia effect is as follows. You open the newspaper to an article on some subject you know well. In Murray's case, physics. In mine, show business. You read the article and see the journalist has absolutely no understanding of either the facts or the issues. Often, the article is so wrong it actually presents the story backward—reversing cause and effect. I call these the "wet streets cause rain" stories. Paper's full of them.
In any case, you read with exasperation or amusement the multiple errors in a story, and then turn the page to national or international affairs, and read as if the rest of the newspaper was somehow more accurate about Palestine than the baloney you just read. You turn the page, and forget what you know."
– Michael Crichton (1942-2008)
Because your comment was posted 9 hours ago, I have no idea what view you think is wrong. Could you explain what the incorrect view is and — ideally — what’s wrong with it?
That means the remaining 10% are similarly under the illusion that Apple or AMD could "just write" a CUDA alternative and compete on the merits. You don't want either of those people spending their money on datacenter bets.
10 years ago people said OpenCL would break CUDA's moat, 5 years ago people said ASICs would beat CUDA, and now we're arguing that older Nvidia GPUs will make CUDA obsolete. I have spent the past decade reading delusional eulogies for Nvidia, and I still find people adamant they're doomed despite being unable to name a real CUDA alternative.
Did ASICs beat CUDA out in crypto coin mining? Not the benchmark I really care about, but if things slow down in AI (they probably won’t) ASICs could probably take over some of it.
Equity valuations of future earnings from AI hardware changed dramatically in the last day. The belief that demand for NVIDIA's product is insatiable for the near future has been dented, and the concern that energy is the biggest bottleneck might no longer apply.
Lots to figure out on this information but the playbook radically changed.
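Here's an experienced person on how challenging it is to implement this in the tech world: https://news.ycombinator.com/item?id=42658998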
Let's not forget that this also doesn't make you an expert in history and the evolution of society, as we can all agree that a part of the HN public still believes in “meritocracy” only and thinks that DEI programs are useless.
I find it interesting because the DeepSeek stuff, while very cool, doesn't seem to invalidate the idea that more compute would translate to even _higher_ capabilities?
It's amazing what they did with a limited budget, but instead of the takeaway being "we don't need that much compute to achieve X", it could also be, "These new results show that we can achieve even 1000*X with our currently planned compute buildout"
But perhaps the idea is more like: "We already have more AI capabilities than we know how to integrate into the economy for the time being" and if that's the hypothesis, then the availability of something this cheap would change the equation somewhat and possibly justify investing less money in more compute.
Probably not. If the price of Nvidia is dropping, it's because investors see a world where Nvidia hardware is less valuable, probably because it will be used less.
You can't do the distill/magnify cycle like you do with AlphaGo. LLMs have basically stalled in their base capabilities, pre-training is basically over at this point, so the new arms race will be over marginal capability gains and (mostly) making them cheaper and cheaper.
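Are you sure? People are saying that there's an analogous cycle where you use o1-style reasoning to produce better inputs to the next training round.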
But inference time scaling, right?
A weak model can pretend to be a stronger model if you let it cook for a long time. But right now it looks like models as strong as what we have aren't going to be very useful even if you let them run for a long, long time. Basic logic problems still tank o3 if they're not a kind that it's seen before.
Basically, there doesn't seem to be a use case for big data centers that run small models for long periods of time, they are in a danger zone of both not doing anything interesting and taking way too long to do it.
The AI war is going to turn into a price war, by my estimations. The models will be around as strong as the ones we have, perhaps with one more crank of quality. Then comes the empty, meaningless battle of just providing that service for as close to free as possible.
If Openai's agents panned out we might be having another conversation. But they didn't, and it wasn't even close.
This is probably it. There's not much left in the AI game
Your implication is that we have unlimited compute and therefore know that LLMs are stalled.
Have you considered that compute might be the reason why LLMs are stalled at the moment?
What made LLMs possible in the first place? Right, compute! Transformer Model is 8 years old, technically GPT4 could have been released 5 years ago. What stopped it? Simple, the compute being way too low.
Nvidia has improved compute by 1000x in the past 8 years but what if training GPT5 takes 6-12 months for 1 run based on what OpenAI tries to do?
What we see right now is that pre-training has reached the limits of Hopper and Big Tech is waiting for Blackwell. Blackwell will easily be 10x faster in cluster training (don't look on chip performance only) and since Big Tech intends to build 10x larger GPU clusters then they will have 100x compute systems.
Let's see then how it turns out.
The limit on training is time. If you want to make something new and improve then you should limit training time because nobody will wait 5-6 months for results anymore.
It was fine for OpenAI years ago to take months to years for new frontier models. But today the expectations are higher.
There is a reason why Blackwell is fully sold out for the year. AI research is totally starved for compute.
The best thing for Nvidia is also that while AI research companies compete with each other, they all try to get Nvidia AI HW.
We don't know, for example, what a larger model can do with the new techniques DeepSeek is using for improving/refining it. It's possible the new models on their own failed to show progress, but a combination of techniques will enable that barrier to be crossed.
We also don't know what the next discovery/breakthrough will be like. The reward for getting smarter AI is still huge and so the investment will likely remain huge for some time. If anything DeepSeek is showing us that there is still progress to be made.
The stock market is not the economy, Wall Street is not Main Street. You need to look at this more macroscopically if you want to understand this.
Basically: China tech sector just made a big splash, traders who witnessed this think other traders will sell because maybe US tech sector wasn't as hot, so they sell as other traders also think that and sell.
The fall will come to rest once stocks have fallen enough that traders stop thinking other traders will sell.
Investors holding for the long haul will see this fall as stocks going on sale and proceed to buy because they think other investors will buy.
Meanwhile in the real world, on Main Street, nothing has really changed.
Bogleheads meanwhile are just starting the day with their coffee, no damns given to the machinations of the stock market because it's Monday and there's work to be done.
Is it really related to China's tech sector as such, though? If this is true then OpenAI, Google, or even companies many magnitudes smaller can just easily replicate similar methods in their processes and provide models which are just as good or better. However, they'll need far fewer Nvidia GPUs and other hardware to do that than when training their current models.
> doesn't seem invalidate that more compute wouldn't translate to even _higher_ capabilities?
That's how I understand it.
And since their current goal seems to be 'AGI', and their current plan for achieving it seems to be scaling LLMs (network-depth-wise and, at inference time, prompt-wise), I don't see why it wouldn't hold.
The biggest discussion I have been having on this is the implications of DeepSeek for, say, the RoI of an H100. Will a sudden spike in available GPUs and a reduction in demand (from more efficient GPU usage) dramatically shock the cost per hour to rent a GPU? This, I think, is the critical value for measuring the investment case for Blackwell now.
The price for an H100 per hour has gone from a peak of $8.42 to about $1.80.
An H100 consumes 700W; let's say $0.10 per kWh?
An H100 costs around $30,000.
Given DeepSeek, can the price drop further, given that a much larger supply of available GPUs (MI300X, H200s, H800s, etc.) can now be proven to be unlocked?
Now that LLMs have effectively become a commodity, with a significant price floor, is this new value still above what is profitable for the card?
Given that the new Blackwell is $70,000, are there sufficient applications that enable customers to get an RoI on the new card?
I am curious about this, as I think I am currently ignorant of the types of applications that businesses can use to outweigh the costs. I predict the cost per hour of the GPU will drop such that it isn't such a no-brainer investment compared to previously, especially if it is now possible to unlock potential from much older platforms running at lower electricity rates.
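A quick sketch of that RoI arithmetic using the numbers above; the utilization and overhead figures are illustrative assumptions, not measurements:

    # Back-of-envelope payback period for renting out one H100 at today's rates.
    # Rental rate, power draw, electricity price and card cost are taken from the
    # comment above; utilization and overhead are assumed.

    CARD_COST = 30_000.0     # USD per H100
    RENTAL_RATE = 1.80       # USD per GPU-hour (down from a peak of ~$8.42)
    POWER_KW = 0.7           # 700 W board power
    ELECTRICITY = 0.10       # USD per kWh
    UTILIZATION = 0.7        # assumed fraction of hours actually rented
    OVERHEAD = 1.5           # assumed multiplier for cooling, hosting, networking

    hours = 24 * 365
    revenue = RENTAL_RATE * hours * UTILIZATION
    opex = POWER_KW * ELECTRICITY * hours * OVERHEAD
    print(f"annual revenue: ${revenue:,.0f}")     # ~ $11,000
    print(f"annual opex:    ${opex:,.0f}")        # ~ $900
    print(f"payback period: {CARD_COST / (revenue - opex):.1f} years")  # ~ 3 years

At the old $8.42/hour rate the card paid for itself in well under a year; at $1.80 it takes roughly three, which is why the RoI on a $70,000 Blackwell is suddenly a live question.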
Why is there this implicit assumption that more efficient training/inference will reduce GPU demand? It seems more likely - based on historical precedent in the computing industry - that demand will expand to fill the available hardware.
We can do more inference and more training on fewer GPUs. That doesn’t mean we need to stop buying GPUs. Unless people think we’re already doing the most training/inference we’ll ever need to do…
Historically most compute went to running games in people's homes, because companies didn't see a need to run that much analytics. I don't see why that wouldn't happen now as well; there is a limit to how much value you can get out of this, since they aren't AGI yet.
Over the long run maybe, but for the next 2 years the market will struggle to find a use for all these possible extra GPUs. There is no real consumer demand for AI products and lots of backlash whenever it is implemented, e.g. that Coca-Cola ad. It's going to be a big hit to demand in the short to medium term as the hyperscalers cut back/reassess.
The part of this that doesn’t jibe with me is the fact that they also released this incredibly detailed technical report on their architecture and training strategy. The paper is well-written and has a lot of specifics. Exactly the opposite of what you would do if you had truly made an advancement of world-altering magnitude. All this says to me is that the models themselves have very little intrinsic value / are highly fungible. The true value lies in the software interfaces to the models, and the ability to make it easy to plug your data into the models.
My guess is the consumer market will ultimately be won by 2-3 players that make the best app / interface and leverage some kind of network effect, and enterprise market will just be captured by the people who have the enterprise data, I.e. MSFT, AMZN, GOOG. Depending on just how impactful AI can be for consumers, this could upend Apple if a full mobile hardware+OS redesign is able to create a step change in seamlessness of UI. That seems to me to be the biggest unknown now - how will hardware and devices adapt?
NVDA will still do quite well because as others have noted, if it’s cheaper to train, the balance will just shift toward deploying more edge devices for inference, which is necessary to realize the value built up in the bubble anyway. Some day the compute will become more fungible but the momentum behind the nvidia ecosystem is way too strong right now.
What has changed is the perception that people like OpenAI/MSFT would have an edge on the competition because of their huge datacenters full of NVDA hardware. That is no longer true. People now believe that you can build very capable AI applications for far less money. So the perception is that the big guys no longer have an edge.
Tesla had already proven that to be wrong. Tesla's Hardware 3 is a 6-year-old design, and it does amazingly well on less than 300 watts. And that was mostly trained on an 8k cluster.
I mean, I think they still do have an edge - ChatGPT is a great app and has strong consumer recognition already, very hard to displace... and MSFT has a major installed base of enterprise customers who cannot readily switch cloud / productivity suite providers. So I guess they still have an edge; it's just more of a traditional edge.
> The part of this that doesn’t jibe with me is the fact that they also released this incredibly detailed technical report on their architecture and training strategy. The paper is well-written and has a lot of specifics. Exactly the opposite of what you would do if you had truly made an advancement of world-altering magnitude.
I disagree completely with this sentiment. This was in fact the trend for a century or more (see inventions ranging from the polio vaccine to "Attention Is All You Need" by Vaswani et al.) before "Open"AI became the biggest player on the market and Sam Altman tried to bag all the gains for himself. Hopefully, we can reverse course on this trend and go back to when world-changing innovations were shared openly so they could actually change the world.
Exactly. There's a strong case for being open about advancements in AI. Secretive companies like Microsoft, OpenAI, and others are undercut by DeepSeek and any other company on the globe that wants to build on what they've published. Politically, there are more reasons why China should not become the global center of AI and fewer reasons why the US should remain the center of it. Therefore, an approach that enables AI institutions worldwide makes more sense for China at this stage. The EU, for example, now has even less reason to form a dependency on OpenAI and Nvidia, which works to the advantage of China and Chinese AI companies.
I’m not arguing for/against the altruistic ideal of sharing technological advancements with society, I’m just saying that having a great model architecture is really not a defensible value proposition for a business. Maybe more accurate to say publishing everything in detail indicates that it’s likely not a defensible advancement, not that it isn’t significant.
I always thought AMZN is the winner since I looked into Bedrock. When I saw Claude on there it added a fuck yeah, and now the best models being open just takes it to another level.
AWS’s usual moat doesn’t really apply here. AWS is Hotel California — if your business and data are in AWS, the cost of moving any data-intensive portion out of AWS is absurd due to egress fees. But LLM inference is not data-transfer intensive at all — a relatively small number of bytes/tokens go to the model, it does a lot of compute, and a relatively small number of tokens come back. So a business that’s stuck in AWS can cost-effectively outsource its LLM inference to a competitor without any substantial egress fees.
RAG is kind of an exception, but RAG still splits the database part from the inference part, and the inference part is what needs lots of inference-time compute. AWS may still have a strong moat for the compute needed to build an embedding database in the first place.
Simple, cheap, low-compute inference on large amounts of data is another exception, but this use will strongly favor the “cheap” part, which means there may not be as much money in it for AWS. No one is about to do o3-style inference on each of 1M old business records.
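A rough sanity check on just how small that transfer is; the per-GB egress price, bytes-per-token and traffic volumes below are illustrative assumptions:

    # What would AWS internet egress fees be on the inference tokens themselves?
    # All figures are illustrative assumptions, not billing data.

    EGRESS_PER_GB = 0.09          # USD per GB, assumed typical internet egress rate
    BYTES_PER_TOKEN = 4           # rough average for English text
    TOKENS_PER_REQUEST = 2_000    # prompt + completion, assumed
    REQUESTS_PER_MONTH = 10_000_000

    gigabytes = BYTES_PER_TOKEN * TOKENS_PER_REQUEST * REQUESTS_PER_MONTH / 1e9
    print(f"token traffic: {gigabytes:,.0f} GB/month")          # ~80 GB
    print(f"egress cost:   ${gigabytes * EGRESS_PER_GB:,.2f}")  # a few dollars

Ten million requests a month comes to tens of gigabytes of token traffic and single-digit dollars of egress, which is rounding error next to the GPU bill.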
You are not taking into account why people are willing to pay exceedingly high prices for GPUs now and that the underlying reason may have been taken away.
Build trust by releasing your inferior product for free and as open as possible. Get attention, then release your superior product behind paywall. Name recognition is incredibly important within and outside of China.
Keep in mind, they’re still competing with Baidu, Tencent and other AI labs.
I feel like this is a symptom of our broken economic system that has allowed too much cash to be trapped in the markets, forever making mostly imaginary numbers go up while the middle class gets squeezed and the poor continue to suffer.
A fundamental feature of a capitalist system is that you can use money to make more money. That's great for growing wealth. But you have to be careful; it's like a sound system at a concert. When you install it, everybody benefits from being able to hear the band. But it is easy to cause an ear-splitting feedback loop if you don't keep the singers a safe distance from the speakers. Unfortunately, the only way people have to quantify how good a concert sounds is by loudness, and because the awful screeching of a feedback loop is about the loudest thing possible, we've been just holding the microphone at the speaker for close to 50 years and telling ourselves that everybody is enjoying the music.
It is the job of the government, because nobody else can do it, to prevent the runaway feedback loop that is a fundamental flaw of capitalism, and our government has been entirely derelict in their duty. This has caused market distortions that go beyond the stock market. The housing market is also suffering for example. There is way too much money at the top looking for anything that can create a return, and when something looks promising it gets inflated to ridiculous levels, far beyond what is helpful for a company trying to expand their business. There's so much money most of it has to be dumb money.
TINA (there is no alternative). Government forced everyone to save for retirement this way. There's way too much capital trapped in SPY, and that's going to create distortions in price discovery and these abrupt corrections in individual stocks.
Really it should be adjusted for global (or US) total market cap. Market cap tends to go up faster than inflation, so even if you adjust for inflation, it will still be skewed toward modern companies.
Nvidia has gotten lucky repeatedly. The GPUs were great for PC gaming and they were the top dog. The crypto boom was such an unexpected win for them partly because Intel killed off their competition by acquiring it. Then the AI boom is also a direct result of Intel killing off their competition but the acquisition is too far removed to credit it to that event.
Unlike the crypto boom though, two factors make me think the AI boom is bound to fade quickly.
Unlike crypto, there is no mathematical lower bound on computation, and if you look at technology's history we can tell the models are going to get better/smaller/faster over time, reducing our reliance on the GPU.
Crypto was fringe but AI is fundamental to every software stack and every company. There is way too much money in this to just let Nvidia take it all. One way or another the reliance on it will be reduced
> the models are going to get better/smaller/faster overtime reducing our reliance on the GPU
Yes, because we've seen that with other software. I no longer want a GPU for my computer because I play games from the 90s and the CPU has grown powerful enough to suffice... except that's not the case at all. Software grew in complexity and quality with available compute resources and we have no reason to think "AI" will be any different.
Are you satisfied with today's models and their inaccuracies and hallucinations? Why do you think we will solve those problems without more HW?
Because that's what history shows us. Back in the 90s, MPEG-1/2 took dedicated hardware expansion cards to handle the encoding because software was just too damn slow. Eventually, CPUs caught up, and dedicated instructions were added to the CPU to make software encoding multiple times faster than real-time. Then, H.264 came along and CPUs were slow for encoding again. Special instructions were added to the CPU again, and software encoding is multiple times faster again. We're now at H.265 and 8K video where encoding is slow on CPU. Can you guess what the next step will be?
Not all software is written badly where it becomes bloatware. Some people still squeeze everything they can, admittedly, the numbers are becoming smaller. Just like the quote, "why would I spend money to optimize Windows when hardware keeps improving" does seem to be group think now. If only more people gave a shit about their code vs meeting some bonus accomplishment
Honestly, you'd be shocked at how much gaming you can get done on the integrated gpus that are just shoved in these days. Sure, you won't be playing the most graphically demanding things, but think of platforms like the Switch, or games like Stardew. You can easily go without a dedicated GPU and still have a plethora of games.
And as for AI, there's probably so much room for improvement on the software side that it will probably be the case that the smarter, more performant AIs will not necessarily have to be on the top of the line hardware.
I think the point was not that we won't still use a lot of hardware, it's that it won't necessarily always be Nvidia. Nvidia got lucky when both crypto and AI arrived because it had the best available ready-made thing to do the job, but it's not like it's the best possible thing. Crypto eventually got its ASICs that made GPUs uncompetitive after all.
The AAA games industry is struggling (e.g. look at the profit warnings, share price drops and studio closures) specifically because people are doing that en masse.
But those 90s games are not old - retro has become a movement within gaming, and there is a whole cottage industry of "indie" games building on that aesthetic because it is cheap and fun.
Money isn’t fringe, and the target for crypto is all transactions, rather than the existing model where you pay between two and 3.5% to a card company or other middleman.
Credit card companies averaged over 22,000 transactions per second in 2023 without ever having to raise the fee. How many is crypto even capable of processing? Processing without the fee going up? What fraud protection guarantees are offered to the parties of crypto transactions?
It's amazing what they did with a limited budget, but instead of the takeaway being "we don't need that much compute to achieve X", it could also be, "These new results show that we can achieve even 1000*X with our currently planned compute buildout"
But perhaps the idea is more like: "We already have more AI capabilities than we know how to integrate into the economy for the time being" and if that's the hypothesis, then the availability of something this cheap would change the equation somewhat and possibly justify investing less money in more compute.
You can't do the distill/amplify cycle like you can with AlphaGo. LLMs have basically stalled in their base capabilities, and pre-training is basically over at this point, so the new arms race will be over marginal capability gains and (mostly) making models cheaper and cheaper.
But inference-time scaling, right?
A weak model can pretend to be a stronger model if you let it cook for a long time. But right now it looks like models as strong as the ones we have aren't going to be very useful even if you let them run for a long, long time. Basic logic problems still tank o3 if they're not of a kind it has seen before.
Basically, there doesn't seem to be a use case for big data centers running small models for long periods of time; they're in a danger zone of both not doing anything interesting and taking way too long to do it.
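For what it's worth, a minimal sketch of what "letting a weak model cook" usually means in practice is best-of-N sampling against a scorer; generate() and score() below are hypothetical stand-ins, not any vendor's API.

```python
# Minimal sketch of one common form of inference-time scaling:
# sample N candidate answers from a weak model and keep the best-scoring one.
# generate() and score() are hypothetical stand-ins, not a real API.
import random

def generate(prompt: str, seed: int) -> str:
    """Placeholder for one sampled completion from a small model."""
    random.seed(seed)
    return f"candidate answer {random.randint(0, 999)}"

def score(prompt: str, answer: str) -> float:
    """Placeholder for a verifier / reward-model score."""
    return random.random()

def best_of_n(prompt: str, n: int = 64) -> str:
    """Spend roughly n times the inference compute to pick a better answer."""
    candidates = [generate(prompt, seed=i) for i in range(n)]
    return max(candidates, key=lambda a: score(prompt, a))

# The extra compute only helps if the scorer can tell good answers from bad
# ones, which is exactly where unfamiliar logic problems fall apart.
print(best_of_n("Is the following argument valid? ..."))
```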
The AI war is going to turn into a price war, by my estimation. The models will stay around as strong as the ones we have, perhaps with one more crank of quality. Then comes the empty, meaningless battle of providing that service for as close to free as possible.
If OpenAI's agents had panned out, we might be having another conversation. But they didn't, and it wasn't even close.
This is probably it. There's not much left in the AI game
Have you considered that compute might be the reason why LLMs are stalled at the moment?
What made LLMs possible in the first place? Right, compute! The Transformer architecture is eight years old; technically GPT-4 could have been released five years ago. What stopped it? Simple: the available compute was far too low.
Nvidia has improved compute by roughly 1000x in the past eight years, but what if training GPT-5 takes 6-12 months for one run, based on what OpenAI is trying to do?
What we see right now is that pre-training has reached the limits of Hopper, and Big Tech is waiting for Blackwell. Blackwell will easily be 10x faster in cluster training (don't look at per-chip performance alone), and since Big Tech intends to build 10x larger GPU clusters, they will end up with systems that have 100x the compute.
Let's see then how it turns out.
The limit on training is time. If you want to build something new and improve it, you have to limit training time, because nobody will wait 5-6 months for results anymore.
It was fine years ago for OpenAI to take months or years to ship new frontier models. But today the expectations are higher.
There is a reason Blackwell is fully sold out for the year: AI research is totally starved for compute.
The best thing for Nvidia is also that while the AI research companies compete with each other, they all try to get Nvidia AI hardware.
We also don't know what the next discovery/breakthrough will be like. The reward for getting smarter AI is still huge and so the investment will likely remain huge for some time. If anything DeepSeek is showing us that there is still progress to be made.
Are you sure? People are saying there's an analogous cycle where you use o1-style reasoning to produce better inputs for the next training round.
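Roughly, the cycle being described looks like the sketch below: generate long reasoning traces, keep only the ones a verifier accepts, and use them as training data for the next round. solve_with_reasoning() and check_answer() are hypothetical stand-ins, assuming a task where answers can be checked.

```python
# Sketch of the bootstrap loop described above: use an o1-style reasoning
# setup to generate long solutions, keep only the ones that pass a check,
# and feed them back as training data for the next round.
# solve_with_reasoning() and check_answer() are hypothetical stand-ins,
# assuming a task with verifiable answers (math, code with tests, etc.).

def solve_with_reasoning(problem: str) -> tuple[str, str]:
    """Placeholder: returns (chain_of_thought, final_answer)."""
    return ("step 1 ... step 2 ...", "42")

def check_answer(problem: str, answer: str) -> bool:
    """Placeholder verifier, e.g. an exact-match grader or a unit test."""
    return answer == "42"

def build_next_training_set(problems: list[str]) -> list[dict]:
    """Keep only traces whose final answers verify."""
    dataset = []
    for p in problems:
        chain, answer = solve_with_reasoning(p)
        if check_answer(p, answer):
            dataset.append({"prompt": p, "completion": chain + "\n" + answer})
    return dataset

print(len(build_next_training_set(["What is 6 * 7?"])))
```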
Basically: the Chinese tech sector just made a big splash, and traders who saw it think other traders will sell because maybe the US tech sector isn't as hot as assumed, so they sell, and other traders think the same thing and sell too.
The fall will come to rest once stocks have fallen enough that traders stop thinking other traders will sell.
Investors holding for the long haul will see this fall as stocks going on sale and proceed to buy because they think other investors will buy.
Meanwhile in the real world, on Main Street, nothing has really changed.
Bogleheads meanwhile are just starting the day with their coffee, no damns given to the machinations of the stock market because it's Monday and there's work to be done.
The Magnificent Seven are the only thing propping up the whole US economy.
If they go down, you go down.
That's how I understand it.
And since their current goal seems to be 'AGI', and their current plan for achieving it seems to be scaling LLMs (in network depth, and at inference time in prompt length), I don't see why it wouldn't hold.
The price of an H100 per hour has gone from a peak of $8.42 to about $1.80.
An H100 consumes 700 W; let's say electricity costs $0.10 per kWh.
An H100 costs around $30,000.
Given DeepSeek, can the price drop further, now that a much larger supply of available GPUs (MI300X, H200s, H800s, etc.) can be shown to be usable?
Now that LLMs have effectively become a commodity with a significant price floor, does the value they generate still stay ahead of what is profitable for the card?
Given that the new Blackwell is $70,000, are there enough applications for customers to get an ROI on the new card?
I'm curious about this because I think I'm currently ignorant of the kinds of applications businesses can run whose value outweighs the costs. I predict the cost per hour of a GPU will drop to the point where buying one isn't the no-brainer investment it used to be, especially if it's now possible to unlock potential from much older platforms running at lower electricity rates.
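A rough back-of-envelope using the figures above (rental rate, power draw, and card price come from this comment; utilization and electricity price are assumptions):

```python
# Back-of-envelope payback for renting out one H100, using the figures above.
# Rental rate, power draw, and purchase price come from this comment;
# utilization and electricity price are assumptions, and cooling, networking,
# hosting, and depreciation are ignored.
CARD_PRICE = 30_000      # USD, H100 purchase price
RENTAL_RATE = 1.80       # USD per GPU-hour (down from a ~$8.42 peak)
POWER_KW = 0.700         # 700 W draw
ELECTRICITY = 0.10       # USD per kWh (assumed)
UTILIZATION = 0.70       # fraction of hours actually rented (assumed)

hours_per_year = 24 * 365
revenue = RENTAL_RATE * hours_per_year * UTILIZATION
# Simplification: assume the card draws full power around the clock.
power_cost = POWER_KW * ELECTRICITY * hours_per_year
gross_margin = revenue - power_cost
payback_years = CARD_PRICE / gross_margin

print(f"annual revenue: ${revenue:,.0f}")
print(f"annual power:   ${power_cost:,.0f}")
print(f"payback period: {payback_years:.1f} years")
```

Under these assumptions the card pays for itself in roughly three years at ~$1.80/hour, versus well under a year at the old ~$8.42 peak, and that is before cooling, networking, hosting, and depreciation, which is why the $70,000 Blackwell question matters.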
We can do more inference and more training on fewer GPUs. That doesn’t mean we need to stop buying GPUs. Unless people think we’re already doing the most training/inference we’ll ever need to do…
“640KB ought to be enough for anybody.”
Inference demand might increase but you could easily believe that there’s substantial inelasticity currently.
My guess is the consumer market will ultimately be won by the 2-3 players that make the best app/interface and leverage some kind of network effect, and the enterprise market will simply be captured by the people who already have the enterprise data, i.e. MSFT, AMZN, GOOG. Depending on just how impactful AI can be for consumers, this could upend Apple if a full mobile hardware+OS redesign can create a step change in the seamlessness of the UI. That seems to me to be the biggest unknown now: how will hardware and devices adapt?
NVDA will still do quite well because, as others have noted, if it's cheaper to train, the balance will just shift toward deploying more edge devices for inference, which is necessary to realize the value built up in the bubble anyway. Some day compute will become more fungible, but the momentum behind the Nvidia ecosystem is far too strong right now.
Tesla has already proven that to be wrong. Tesla's Hardware 3 is a six-year-old design, and it does amazingly well on less than 300 watts. And that was mostly trained on an 8k-GPU cluster.
I think what really happened is day to day trading noise. Nothing fundamentally changed, but traders believed other people believed it would.
I disagree completely with this sentiment. This was in fact the trend for a century or more (see inventions ranging from the polio vaccine to "Attention Is All You Need" by Vaswani et al.) before "Open"AI became the biggest player on the market and Sam Altman tried to bag all the gains for himself. Hopefully we can reverse course on this trend and go back to a world where world-changing innovations are shared openly so they can actually change the world.
AMZN: no horse picked, we host anything
MSFT: OpenAI
GOOGLE: Google AI
AMZN is in the strongest position.
RAG is kind of an exception, but RAG still splits the database part from the inference part, and the inference part is what needs lots of inference-time compute. AWS may still have a strong moat for the compute needed to build an embedding database in the first place.
Simple, cheap, low-compute inference on large amounts of data is another exception, but this use will strongly favor the “cheap” part, which means there may not be as much money in it for AWS. No one is about to do o3-style inference on each of 1M old business records.
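For illustration, a minimal sketch of that split, assuming hypothetical embed() and llm() stand-ins rather than any specific AWS or vendor API: the embedding database is built offline, and the online path is cheap retrieval plus one compute-hungry inference call.

```python
# Sketch of the split described above: an offline job builds the embedding
# database (where a cloud provider's compute may still matter), while the
# online path is cheap retrieval plus a single expensive inference call.
# embed(), llm(), and the in-memory index are hypothetical stand-ins.
import math

def embed(text: str) -> list[float]:
    """Placeholder embedding; a real system would call an embedding model."""
    return [float(ord(c) % 7) for c in text[:8].ljust(8)]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

def llm(prompt: str) -> str:
    """Placeholder for the actual inference call, the compute-hungry part."""
    return "stub answer"

# Offline: build the embedding database over the document corpus.
corpus = ["invoice 1042 ...", "support ticket about login ...", "Q3 revenue memo ..."]
index = [(doc, embed(doc)) for doc in corpus]

# Online: retrieve the closest documents, then make one inference call.
def answer(question: str, k: int = 2) -> str:
    q = embed(question)
    top = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)[:k]
    context = "\n".join(doc for doc, _ in top)
    return llm(f"Context:\n{context}\n\nQuestion: {question}")

print(answer("What was Q3 revenue?"))
```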
Keep in mind, they’re still competing with Baidu, Tencent and other AI labs.
A fundamental feature of a capitalist system is that you can use money to make more money. That's great for growing wealth, but you have to be careful; it's like a sound system at a concert. When you install it, everybody benefits from being able to hear the band, but it is easy to cause an ear-splitting feedback loop if you don't keep the singers a safe distance from the speakers. Unfortunately, the only way people have to quantify how good a concert sounds is loudness, and because the awful screeching of a feedback loop is about the loudest thing possible, we've been holding the microphone right up to the speaker for close to 50 years and telling ourselves that everybody is enjoying the music.
It is the job of the government, because nobody else can do it, to prevent the runaway feedback loop that is a fundamental flaw of capitalism, and our government has been entirely derelict in their duty. This has caused market distortions that go beyond the stock market. The housing market is also suffering for example. There is way too much money at the top looking for anything that can create a return, and when something looks promising it gets inflated to ridiculous levels, far beyond what is helpful for a company trying to expand their business. There's so much money most of it has to be dumb money.
https://www.bloomberg.com/opinion/articles/2025-01-27/deepse...
Unlike the crypto boom, though, two factors make me think this AI-driven reliance on Nvidia was bound to go away quickly.
Unlike crypto, there is no mathematical lower bound on the computation required, and if you look at the history of technology, we can tell the models are going to get better/smaller/faster over time, reducing our reliance on the GPU.
Crypto was fringe, but AI is fundamental to every software stack and every company. There is way too much money in this to just let Nvidia take it all; one way or another, the reliance on it will be reduced.
Yes, because we've seen that with other software. I no longer want a GPU for my computer because I play games from the 90s and the CPU has grown powerful enough to suffice... except that's not the case at all. Software grew in complexity and quality with available compute resources and we have no reason to think "AI" will be any different.
Are you satisfied with today's models and their inaccuracies and hallucinations? Why do you think we will solve those problems without more HW?
Not all software is written so badly that it becomes bloatware. Some people still squeeze out everything they can; admittedly, their numbers are shrinking. The attitude in the quote "why would I spend money to optimize Windows when hardware keeps improving" does seem to be groupthink now. If only more people cared about their code instead of just meeting some bonus target.
And as for AI, there's so much room for improvement on the software side that the smarter, more performant AIs probably won't need to run on top-of-the-line hardware.
The AAA games industry is struggling (e.g., look at the profit warnings, share-price drops, and studio closures) specifically because people are doing that en masse.
But those 90s games are not obsolete: retro has become a movement within gaming, and there is a whole cottage industry of "indie" games built around that aesthetic because it is cheap and fun.
This is what I feel as well, but is there any scientific evidence for it?