The title of the article is "DeepSeek R2 launch stalled as CEO balks at progress", but the body says the launch stalled because of a lack of GPU capacity due to export restrictions, not because of a lack of progress. The body does not even mention the word "progress".
I can't imagine demand would be greater for R2 than for R1 unless it was a major leap ahead. Maybe R2 is going to be a larger/less performant/more expensive model?
Deepseek could deploy in a US or EU datacenter ... but that would be admitting defeat.
>June 26 (Reuters) - Chinese AI startup DeepSeek has not yet determined the timing of the release of its R2 model as CEO Liang Wenfeng is not satisfied with its performance,
>Over the past several months, DeepSeek's engineers have been working to refine R2 until Liang gives the green light for release, according to The Information.
But yes, it is strange how the majority of the article is about lack of GPUs.
I am pretty sure that The Information has no access to, or sources at, DeepSeek. At most they are basing their article on selective random internet chatter among those who follow Chinese AI.
It's not about people wanting to keep it in moats.
It's about China being expansionist, actively preparing to invade Taiwan, and generally becoming an increasing military threat that does not respect the national integrity of other states.
The US is fine with other countries having AI if those countries "play nice" with others. Nobody is limiting GPUs in France or Thailand.
This is very specific to China's behavior and stated goals.
But DeepSeek doesn't actually need to host inference if they open-source it, right? I don't see why these companies even bother to host inference. DeepSeek doesn't need the outreach (everyone knows about them), and the huge demand for SOTA models will force Western companies to host them anyway.
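To illustrate how little the original lab has to be involved once the weights are public, here is a minimal sketch of self-hosting an open-weight checkpoint with vLLM. The specific (smaller, distilled) checkpoint and the sampling settings are just examples I'm assuming for illustration, not a recommendation.

```python
# Minimal sketch: anyone with suitable GPUs can serve an open-weight model
# themselves, with no involvement from the original lab.
from vllm import LLM, SamplingParams

# A smaller distilled checkpoint is used purely as an example; the
# full-size releases need far more VRAM.
llm = LLM(model="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B")
params = SamplingParams(temperature=0.6, max_tokens=512)

outputs = llm.generate(
    ["Summarize why open weights reduce dependence on the original vendor."],
    params,
)
print(outputs[0].outputs[0].text)
```

vLLM can also expose the same model behind an OpenAI-compatible HTTP server, which is essentially what third-party hosts do at scale.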
Releasing the model has paid off handsomely with name recognition and making a significant geopolitical and cultural statement.
But will they keep releasing the weights or do an OpenAI and come up with a reason they can't release them anymore?
At the end of the day, even if they release the weights, they probably want to make money and leverage the brand by hosting the model API and the consumer mobile app.
I am a bit sceptical about whether this whole thing is true at all. This article links to another, which happens to be behind a paywall. 'GPU export sanctions are working' is a message that a lot of US administration people and investors want to hear, so I think there's a good chance that unsubstantiated speculation and wishful thinking is being presented as fact here.
Given that DeepSeek is used by the Chinese military, I doubt that it would be a reasonable move for them to host in the U.S., because the capability is about more than profit.
The lack of GPU capacity sounds like bullshit though, and it's unsourced. It's not like you can't offer it as a secondary thing, sort of like o3, or even just as an option for turning on the reasoning.
They just recently released the R1-0528 model, which was a massive upgrade over the original R1 and is roughly on par with the current best proprietary Western models. Let them take their time on R2.
At this point the only models I use are o3/o3-pro and R1-0528. The OpenAI model is better at handling data and drawing inferences, whereas the DeepSeek model is better at handling text as a thing in itself -- i.e. for all writing and editing tasks.
With this combo, I have no reason to use Claude/Gemini for anything.
People don't realize how good the new DeepSeek model is.
My experience with R1-0528 for Python code generation was awful. But I was using a context length of 100k tokens, so that might be why. It scores decently on the LMArena code leaderboard, where context lengths are short.
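For reference, the setup I mean looks roughly like the sketch below. DeepSeek documents an OpenAI-compatible endpoint, so the client code is generic; the base URL and model name follow their published docs, but verify them (and the current context limits) before relying on any of this.

```python
# Rough sketch of code generation against an OpenAI-compatible reasoning
# endpoint. Base URL and model name are taken from DeepSeek's public API
# docs but should be double-checked; the prompt is a toy example.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.deepseek.com",  # assumed OpenAI-compatible endpoint
    api_key="YOUR_API_KEY",
)

# Very long prompts (e.g. ~100k tokens of repository context) are where the
# output quality seemed to fall apart for me; trimming the context to only
# the relevant files helps.
resp = client.chat.completions.create(
    model="deepseek-reasoner",  # R1-series reasoning model, per their docs
    messages=[
        {"role": "system", "content": "You are a careful Python code generator."},
        {"role": "user", "content": "Write a function that parses ISO 8601 timestamps."},
    ],
    max_tokens=2048,
)
print(resp.choices[0].message.content)
```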
A lemon would be on par with the best Western models for the majority of use cases, because most queries do not require "state of the art" intelligence to answer. That is what the benchmarks show.
For anything that requires "AI level of intelligence", the difference is vast.
So Nvidia stock is going to crash hard when the Chinese inevitably produce their own competitive chip. Though I’m baffled by the fact they don’t just license and pump out billions of AMD chips. Nvidia is ahead, but not that far ahead.
My consumer AMD card (7900 XTX) outperforms the 15x more expensive Nvidia server chip (L40S) that I was using.
China doesn't want Taiwan for the chip making plants, but because they consider its existence to be an ongoing armed rebellion against the "rightful" rulers. Getting the fabs intact would be nice, but it's not the main objective.
The USA doesn't want to lose Taiwan because of the chip making plants, and a little bit because it is beneficial to surround their geopolitical enemies with a giant ring of allies.
A problem they face in building their own capacity is that ASML isn't allowed to export their newest machines to China. The US has even pressured them to stop servicing some machines already in China. They've been working on getting their own ASML competitor for decades, but so far unsuccessfully.
The US would intervene militarily to stop China from taking control of TSMC, if it didn't pressure Taiwan into destroying the plants itself first. So I don't think taking Taiwan is a viable path to leading in silicon; at best it lowers US capacity, and given the current gap in GPUs it's not clear how helpful even that would be to China. All in all, I don't think China views taking Taiwan as beneficial in the AI race at all.
China will not go to war in or over Taiwan short of the USA doing its usual narcissistic, psychopathic thing of instigating, orchestrating, and agitating for aggression. It seems, though, that some parts of the world have started understanding how to defuse and counter the narcissistic, psychopathic, abusive cabal that controls the USA and is constantly agitating for war, destruction, and world domination, driven by some schizophrenic, messianic, self-fulfilling prophecies.
I wonder how different things would be if the CPU and GPU supply chain was more distributed globally: if we were at a point where we'd have models (edit: of hardware, my bad on the wording) developed and produced in the EU, as well as other parts of the world.
Maybe then we wouldn't be beholden to the whims of Nvidia (a sore spot when it comes to buying their cards and what they cost, versus what Intel is trying to do with their Pro cards but with inevitably worse software support, plus import costs on top), or those of a particular government. I wonder if we'll ever live in such a world.
> if we were at a point where we'd have models developed and produced in the EU, as well as other parts of the world.
But we already have models being developed and produced outside of the US, both in Asia and in Europe. Sure, it would be cool to see more from South America and Africa, but the playing field is not just in the US anymore; particularly when it comes to open weights (which seem more of a "world benefit" than closed APIs), the US is lagging far behind.
> The Information reported on Thursday, citing two people with knowledge of the situation.
I miss the old days of journalism, when they might feel inclined to let the reader know that their source for the indirect source is almost entirely funded by the fortune generated by a man who worked slavishly to become a close friend of the boss of one of DeepSeek’s main competitors (Meta).
Feel bad for anyone who gets their news from The Information and doesn’t have this key bit of context.
I don't think it's well known that The Information has Facebook/Meta conflicts of interest. I didn't know myself until fairly recently. You can talk to a lot of people about this sort of stuff without anyone pointing it out.
Absolutely. I think the FB CoI within American venture capital, where they have fully infested the LP and GP ranks of most funds, is a much bigger and more important story. It really helped me understand that we need to work really hard to keep the rest of the world’s capital markets free and open — a major focus for me these days.
You never know which stories The Information won't run, or which "negative" articles are actually deflections. Similarly, you never know which amazing startups remain shut out of funding, and a lot of entrepreneurs have no idea how much back-channel collusion goes on in creating the funding rounds and "overnight successes" they're told to idolize.
A random dude on HN such as me shouldn’t be the source of this knowledge. Hope someone takes up the cause, but we live in a time of astounding cowardice.
I miss the old days of HN commenters, when they might feel inclined to let the reader know who they are talking about without making us solve a six-step enigma.
Honestly, AI progress suffers because of these export restrictions. An open-source model that can compete with Gemini 2.5 Pro and o3 is good for the world, and good for AI.
Your views on this question are going to differ a lot depending on the probability you assign to a conflict with China in the next five years. I feel like that number should be offered up for scrutiny before a discussion on the cost vs benefits of export controls even starts.
I'm not American. Ever since I've been old enough to understand the world, the only country constantly at war everywhere is America. An all-powerful American AI is scarier to me than an open source Chinese one
> probability you assign to a conflict with China in the next five years. I feel like that number should be offered up for scrutiny before a discussion
Might as well talk about the probability of a conflict with South Africa. China might not be the best country to live in, nor the country that takes best care of its own citizens, but they seem non-violent towards other sovereign nations (so far), although of course there is a lot of posturing. Of the current "world powers", they seem to be the least violent.
Are you saying that large-model capabilities would make a substantial difference in a military conflict within the next five years? Because we aren’t seeing any signs of that in, say, the Ukraine war.
> Honestly, AI progress suffers because of these export restrictions. An open-source model that can compete with Gemini 2.5 Pro and o3 is good for the world, and good for AI.
DeepSeek is not a charity; they were spun out of one of the largest quant hedge funds in China and are no different in spirit from a typical Wall Street fund. They don't spend billions to give the world something open and free just because it is good.
Once the model is capable of generating a decent amount of revenue, or once there is conclusive evidence that staying closed would lead to much higher profit, it will be closed.
DeepSeek refuses to acknowledge Tiananmen Square. I don’t want to use a model that’s known to heavily censor historical data. What else is it denying or lying about that’s going to affect how I use it?
(In before “whatabout”: maybe US-made models do the same, but I’ve yet to hear reports of any anti-US information that they’re censoring.)
Human progress that benefits everyone being stalled by the few and powerful who want to keep their moats. Sad world we live in.
Surely it would be cheaper and easier for the CCP to develop their own chipmaking capacity than going to war in the Taiwan strait?
Llama (v4 notwithstanding) and Gemma (particularly v3) aren't my idea of lagging far behind...
HGX 8x Nvidia H100 cluster for sale.
https://imgur.com/a/r6tBkN3
You can buy whatever you want. Export controls are basically fiction. Trying to stop global trade is like trying to stop a river with your bare hands.
And on who you would support in such a conflict! ;)