Posted by u/prng2021 7 months ago
Ask HN: Confused about how DeepSeek hurts Nvidia
I’m genuinely confused about why people think DeepSeek’s results will mean fewer GPUs being needed in the future. DeepSeek won’t be top dog forever. At some point, all their big competitors will figure out how they created their model, copy the approach, and get the same efficiencies. After that, why wouldn’t every competitor add more compute to go beyond DeepSeek’s capabilities and each other’s? Is there any experimental evidence out there that having 10X or 100X the compute DeepSeek used for training wouldn’t result in a much more advanced model?
dehrmann · 7 months ago
It's a bit like "the Cisco moment" (and lots of people have been observing this). The company made the hardware needed to build out networks. The web looked like it was going to be the next big thing, and people couldn't get enough of CSCO. The web didn't pan out the way people hoped (or as quickly), and CSCO fell hard.

Cisco kept making and selling network hardware, and probably (citation needed) sold more from 2000-2006 than 1994-2000, but the stock trade was over. The web did become a serious thing, but only once people got broadband at home.

The case for Nvidia's valuation was already getting pretty weak. Lots of FAANGs with deep pockets started to invest in their own hardware, and it got good enough to start beating Nvidia. Intel and AMD are still out there and under pressure to capture at least some of the market. Then this came along and potentially upended the game, bringing costs down by orders of magnitude. It might not be true, and it might even drive up sales long-term, but for now, the NVDA trade was always a short-term thing.

alecco · 7 months ago
It's not a Cisco moment, more like a "Wavelength-division multiplexing" [1] moment of 2000, where the fiber-optic craze built out far more capacity than was needed and crashed, leaving a lot of "Dark Fibre" [2] lying around. DeepSeek found a way to squeeze more out of the same hardware; heck, a key point is how they do more within the bandwidth bottleneck with what amounts to BI-DIRECTIONAL MULTIPLEXING [3] :) (a rough sketch of the general comm/compute-overlap idea is below the links)

Most of the biggest Nvidia clients are valued on speculation of future revenue from their closed models (the secret sauce). DeepSeek's weights are openly released under a permissive license, so those revenue expectations crashed and investors are having second thoughts about throwing more money at companies like OpenAI. And this hits the expected sales growth of Nvidia for the next few years.

The dark fibre did eventually get used, but it took many years, and it was bought up cheap by companies like Google and Cloudflare.

[1] https://en.wikipedia.org/wiki/Wavelength-division_multiplexi...

[2] https://en.wikipedia.org/wiki/Dark_fibre

[3] https://arxiv.org/html/2412.19437v1 (3.2.1 DualPipe and Computation-Communication Overlap)
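
For anyone who hasn't seen what "overlapping communication with computation" means in practice, here is a minimal, generic sketch using two CUDA streams. It is not DeepSeek's DualPipe (which, per the paper [3], overlaps cross-GPU all-to-all and pipeline traffic with forward/backward compute chunks); it just shows the basic idea of keeping the copy engines and the SMs busy at the same time. The kernel and buffer names are made up for illustration.

    // Generic comm/compute overlap sketch (not DeepSeek's DualPipe):
    // stage the next batch onto the GPU while a kernel chews on the current one.
    #include <cuda_runtime.h>
    #include <cstdio>

    __global__ void busywork(float *x, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) x[i] = x[i] * 2.0f + 1.0f;   // stand-in for "computation"
    }

    int main() {
        const int n = 1 << 20;
        const size_t bytes = n * sizeof(float);

        float *h_next, *d_next, *d_current;
        cudaMallocHost(&h_next, bytes);   // pinned host memory, needed for async copies
        cudaMalloc(&d_next, bytes);
        cudaMalloc(&d_current, bytes);
        cudaMemset(d_current, 0, bytes);

        cudaStream_t copyStream, computeStream;
        cudaStreamCreate(&copyStream);
        cudaStreamCreate(&computeStream);

        // "Communication" (host->device copy) on one stream...
        cudaMemcpyAsync(d_next, h_next, bytes, cudaMemcpyHostToDevice, copyStream);
        // ...while "computation" runs on the other; the two overlap in time.
        busywork<<<(n + 255) / 256, 256, 0, computeStream>>>(d_current, n);

        cudaStreamSynchronize(copyStream);
        cudaStreamSynchronize(computeStream);
        printf("copy and compute overlapped\n");

        cudaFree(d_next); cudaFree(d_current); cudaFreeHost(h_next);
        cudaStreamDestroy(copyStream); cudaStreamDestroy(computeStream);
        return 0;
    }

Same principle at a much larger scale: hiding expert-parallel and pipeline communication behind compute means the interconnect bottleneck hurts less per unit of useful work.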

savorypiano · 7 months ago
The key thing is that it's actually both (a Cisco moment and a WDM moment), and the market is still in wait-and-see mode even after a 20% dump. AI end applications aren't growing fast enough, and compute costs may be crashing down.
red-iron-pine · 7 months ago
> more like a "Wavelength-division multiplexing"

no joke, the hype around DWDM is why I got into networking -- waves are the future, man!

mrandish · 7 months ago
While I agree with your points, I think a larger factor is that Nvidia's valuation has been driven higher by the seemingly insatiable demand for data center AI GPUs from large companies investing far in advance (and excess) of near-term revenue. In recent months, signs have emerged that leading-edge models like o3 require significantly more GPU cycles for each additional increment of quality. This would tend to push GPU demand growth rates even higher.

* DeepSeek appears to be credible evidence that there may be clever optimizations to achieve higher model quality with fewer GPU cycles than previously thought. Basically, if you're making scarce oil derricks in a gasoline shortage and your stock price has been bid way up on the expectation of insatiable future gas demand, a more gas-efficient engine design is going to be bad for your valuation. Especially if it's free and easy to implement.

* DeepSeek's weights are open source under a permissive license. Much of OpenAI's (and similar companies') current revenue is from AI startups and other companies buying usage of proprietary leading-edge models (e.g. o3) as cloud services through an API and reselling the output in their own applications targeting various verticals. If some of those companies start using a free open-source model like DeepSeek (or its future descendants/competitors) for some of their offerings, that'll reduce the income and war chest of some of today's biggest GPU buyers. Lower current revenue lowers valuations, meaning the equity OpenAI et al use to buy GPUs will be devalued.

apeescape · 7 months ago
>Lots of FAANGs with deep pockets started to invest in their own hardware, and it got good enough to start beating Nvidia.

It's not just hardware, though: you can't run CUDA on non-Nvidia hardware, which in my understanding is a major moat for Nvidia. I'd love to hear rebuttals on this, because GPU programming is something I've only dabbled in.

shikharbhardwaj · 7 months ago
Agreed on CUDA being a big moat; Nvidia has spent a big chunk of the last 2 decades building this ecosystem and is now reaping dividends from it.

From what I've read, most of the investment by FAANGs/startups in building specialised hardware has been in the inference space.
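
To make the "moat" concrete, here is a tiny illustrative sketch (my own, not from either comment above; the file name and numbers are made up). The raw kernel has near-mechanical ports to HIP or SYCL, but the call into cuBLAS is the kind of NVIDIA-library dependency that real codebases accumulate by the thousands, and that's where switching costs pile up.

    // Illustrative only: a trivial kernel plus one cuBLAS call.
    // Build with: nvcc moat_sketch.cu -lcublas
    #include <cuda_runtime.h>
    #include <cublas_v2.h>
    #include <cstdio>

    __global__ void scale(float *x, float a, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) x[i] *= a;   // plain CUDA C++: portable in principle (HIP/SYCL have analogues)
    }

    int main() {
        const int n = 1024;
        float *x, *y;
        cudaMallocManaged(&x, n * sizeof(float));
        cudaMallocManaged(&y, n * sizeof(float));
        for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

        scale<<<(n + 255) / 256, 256>>>(x, 3.0f, n);   // x becomes 3.0 everywhere

        // y = 0.5 * x + y via cuBLAS -- this is the vendor-library dependency.
        cublasHandle_t handle;
        cublasCreate(&handle);
        const float alpha = 0.5f;
        cublasSaxpy(handle, n, &alpha, x, 1, y, 1);
        cudaDeviceSynchronize();

        printf("y[0] = %.1f\n", y[0]);   // expect 3.5
        cublasDestroy(handle);
        cudaFree(x); cudaFree(y);
        return 0;
    }

Multiply that single cuBLAS call by cuDNN, NCCL, TensorRT, and years of kernels tuned to NVIDIA's memory hierarchy, and the ecosystem argument is really about migration cost, not raw silicon.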

kadushka · 7 months ago
> NVDA trade was always a short-term thing

NVDA has been going up for the last 10 years (with 2022 being the only exception).

AI today is better than anyone could hope for, and I don’t see any reasons to not expect further advances.

Vampiero · 7 months ago
> AI today is better than anyone could hope for,

I hope for an AI that can actually reason and doesn't bullshit its users though

ein0p · 7 months ago
It doesn't. Inference is still expensive, and demand for it is high, as evidenced by Anthropic's frequent "we're out of quota" messages and DeepSeek's crap-out under load last night.

On the training side, right now only the top-flight labs can conduct serious, ambitious research, and even they don't do as much research as they'd like. Witness Meta effectively training the exact same architecture on similar data mixtures for the past couple of years. More or less the same situation is happening across the board: compute bandwidth (and therefore the ability to experiment) is scarce. What this means is that inference will remain quite expensive for the foreseeable future, especially multimodal and long-context inference. Believe it or not, even Google is compute constrained. When I was there, some days I couldn't even get a handful of TPUs to do my job; everything was allocated to training Gemini.

Even if it gets a lot cheaper to train models, you could just train larger, more capable models, do more architectural/efficiency research, and iterate faster, with tremendous payback in the long run. NVIDIA is the only viable seller of shovels for this gold rush for everyone but Google and Anthropic. Bypassing the gatekeepers and making capable AI models available to more people makes their product more valuable.
amgreg · 7 months ago
> NVIDIA is the only viable seller of shovels for this gold rush for everyone but Google and Anthropic.

Why do you except Google and Anthropic?

ashoeafoot · 7 months ago
Google makes its own hardware; they are vertically integrated. Don't know about Anthropic.
mike_hearn · 7 months ago
Anthropic are the only (?) heavy users of Amazon's chips. Or maybe they aren't heavy users. It's hard to say, they use NVIDIA too. Amazon is a big investor.
accrual · 7 months ago
> DeepSeek won’t be top dog forever.

I agree with this in the sense that no model will be top dog forever. However, it's worth noting their contributions to open source. They're raising the floor, and that matters.

red-iron-pine · 7 months ago
yeah the FOSS angle is big. it's not just good, it's good and it's out there for anyone.
billconan · 7 months ago
I can't understand it either. Won't a cheaper model make people use AI more? For example, the current $200 ChatGPT plan is too expensive for me, but make it $4 and I'd become a customer.

Many small companies, which would never think about training models in-house, could now do it.

I'd expect this to only boost the AI hardware market.

creamyhorror · 7 months ago
Analysing the market and competitive situation:

DeepSeek's cheaper LLM services + open models that other hosts can serve

=> overall prices for using LLM services will fall due to competition (lower prices + more hosts entering the market); AI users won't pay so much for LLM services

=> LLM hosts/providers won't be able to project such high revenues or even purchase as many GPUs (and will receive less capital investment to buy GPUs since revenues per dollar invested are lower)

=> demand for and prices of Nvidia cards will fall

On the basis of this possible logic, portfolio managers and algorithms project lower growth/revenue for Nvidia and sell off its stock, setting off the usual chain reaction as other managers notice the downward price action and follow suit in order to stop further losses.

ggm · 7 months ago
Market reactions are usually about immediacy. You don't see people stonking in or out of something for an outcome five years out: it's immediacy that makes a wave happen.

So, noting that your long-term investment ideas seem plausible, what do you think is the immediate short-term impact on this kind of spend? Do you think Nvidia will sell more or fewer units in the next reporting interval? Because that's what most people are reacting to.

It would not surprise me if there are plenty of willing buyers, looking to buy in a dip and sell on the inevitable upward swing.

I am not a direct investor. I have no idea what my pension fund did, if anything.

nsoonhui · 7 months ago
There are people willing to say that DeepSeek has such a great team -- so great, in fact -- that they could keep crushing the competition indefinitely, despite having open-sourced their models to some extent, if not entirely.

I'm actually interested to see those who hold this view explain their logic further. It's quite an interesting, if unorthodox, take, because we would normally think that algorithmic brilliance is not a moat, even when the code is not open source.
