I’m genuinely confused about why people think DeepSeek’s results will mean fewer GPUs being needed in the future. DeepSeek won’t be top dog forever. At some point, all their big competitors will figure out how they created their model, copy the approach, and get the same efficiencies. After that, why wouldn’t every competitor add more compute to go beyond DeepSeek’s capabilities and each other’s? Is there any experimental evidence that 10X or 100X the compute DeepSeek used for training wouldn’t produce a much more advanced model?
Cisco kept making and selling network hardware, and probably (citation needed) sold more from 2000-2006 than 1994-2000, but the stock trade was over. The web did become a serious thing, but only once people got broadband at home.
The Nvidia valuation was getting pretty shaky. Lots of FAANGs with deep pockets started investing in their own hardware, and it got good enough to start beating Nvidia. Intel and AMD are still out there and under pressure to capture at least some of the market. Then this came along and potentially upended the game, bringing costs down by orders of magnitude. It might not be true, and it might even drive up sales long-term, but for now the NVDA trade was always a short-term thing.
Most of the biggest Nvidia clients are valued on speculation about future revenue from their closed models (secret sauce). DeepSeek is fully open source, so those revenue expectations crashed and investors are having second thoughts about throwing more money at companies like OpenAI. And that hits Nvidia's expected sales growth for the next few years.
Dark fibre was eventually used, but it took many years, and it was bought cheaply by companies like Google and Cloudflare.
[1] https://en.wikipedia.org/wiki/Wavelength-division_multiplexi...
[2] https://en.wikipedia.org/wiki/Dark_fibre
[3] https://arxiv.org/html/2412.19437v1 (Section 3.2.1, DualPipe and Computation-Communication Overlap)
no joke, the hype around DWDM is why I got into networking -- waves are the future, man!
* DeepSeek appears to be credible evidence that there may be clever optimizations to achieve higher model quality with fewer GPU cycles than previously thought. Basically, if you're making scarce oil derricks during a gasoline shortage and your stock price has been bid way up on the expectation of insatiable future gas demand, a more gas-efficient engine design is going to be adverse to your valuation. Especially if it's free and easy to implement.
* DeepSeek's weights are open source under a permissive license. Much of OpenAI's (and similar companies') current revenue comes from AI startups and other companies buying usage hours of proprietary leading-edge models (e.g. o3) as cloud services through an API and reselling the output in their own applications targeting various verticals. If some of those companies start using a free open source model like DeepSeek (or its future descendants/competitors) for some of their offerings, that'll reduce the income and war chest of some of today's biggest GPU buyers. Lower current revenue lowers valuations, meaning the equity OpenAI et al. use to buy GPUs will be devalued.
It's not just hardware, though: you can't run CUDA on non-Nvidia hardware, which as I understand it is a major moat for Nvidia. I'd love to hear rebuttals on this, though, because GPU programming is something I've only dabbled in.
From what I've read, most of the investments by FAANGs/startups in building specialised hardware have been in the inference space.
NVDA has been going up for the last 10 years (with 2022 being the only exception).
AI today is better than anyone could have hoped for, and I don’t see any reason not to expect further advances.
I hope for an AI that can actually reason and doesn't bullshit its users though
Why do you except Google and Anthropic?
I agree with this in the sense that no model will be top dog forever. However, it's important to note their contributions to open source. They're raising the bottom bar, and that is important.
Many small companies, which would never have thought about training models in house, can now do it.
I think this will only boost the AI hardware market.
DeepSeek's cheaper LLM services + providing open models for other hosts to offer
=> overall prices for using LLM services will fall due to competition (lower prices + more hosts entering the market); AI users won't pay so much for LLM services
=> LLM hosts/providers won't be able to project such high revenues or even purchase as many GPUs (and will receive less capital investment to buy GPUs since revenues per dollar invested are lower)
=> demand for and prices of Nvidia cards will fall
On the basis of this possible logic, portfolio managers and algorithms project lower growth/revenue for Nvidia and sell off its stock, setting off the usual chain reaction as other managers notice the downward price action and follow suit in order to stop further losses.
So, noting that your long-term investment ideas seem plausible, what do you think is the immediate short-term impact on this kind of spend? Do you think Nvidia will sell more or fewer units in the next reporting interval? Because that's what most people are reacting to.
It would not surprise me if there are plenty of willing buyers looking to buy the dip and sell on the inevitable upward swing.
I am not a direct investor. I have no idea what my pension fund did, if anything.
I'd actually be interested to see those who hold this view explain their logic here further. It's quite an interesting, if unorthodox, take, because we would ordinarily think that algorithmic brilliance is not a moat, even when the code is not open source.