rajhlinux commented on Furiosa: 3.5x efficiency over H100s   furiosa.ai/blog/introduci... · Posted by u/written-beyond
roughly · a month ago
I am of the opinion that Nvidia has hit the wall with its current architecture in the same way Intel historically did with its various architectures - the current generation's power and cooling requirements demand the construction of entirely new datacenters with different architectures, which is going to blow out the economics of inference (GPU + datacenter + power plant + nuclear fusion research division + lobbying for datacenter land + water rights + ...).

The story with Intel around those times was usually that AMD or Cyrix or ARM or Apple or someone else would come around with a new architecture that was a clear generational jump past Intel's and, most importantly, seemed to break the thermal and power ceilings of the Intel generation (at which point Intel typically fired its chip design group, hired everyone from AMD or whoever, and came out with Core or whatever). Nvidia effectively has no competition, or hasn't had any - nobody has actually broken the CUDA moat, so neither Intel nor AMD nor anyone else is really competing for the datacenter space, and Nvidia hasn't faced any real competitive pressure against things like multi-kilowatt power draws for the Blackwells.

The reason this matters is that LLMs are incredibly nifty, often useful tools that are not AGI and also seem to be hitting a scaling wall, and the only way to make the economics of, e.g., a Blackwell-powered datacenter work is to assume the entire economy is going to run on it, as opposed to some useful tools and some improved interfaces. Otherwise the investment numbers just don't make sense - the gap between what we see on the ground (how LLMs are actually used, and the real but limited value they add) and the full cost of providing that service with a brand new single-purpose "AI datacenter" is just too great.

So this is a press release, but any time I see something that looks like an actual new hardware architecture for inference - especially one that doesn't require building a new building or solving nuclear fusion - I'll take it as a good sign. I like LLMs, I've gotten a lot of value out of them, but nothing about the industry's finances adds up right now.

rajhlinux · 21 days ago
Have you seen the specs? The consumer RTX 5090 is faster and cheaper than the Furiosa RNGD Gen 2. You'd have to be mad stupid to buy something that performs worse and costs five times as much.
rajhlinux commented on Furiosa: 3.5x efficiency over H100s   furiosa.ai/blog/introduci... · Posted by u/written-beyond
rajhlinux · 21 days ago
They just declined Meta's $800 million offer. What are they smoking? I just looked at the specs, and nothing about the Furiosa RNGD Gen 2 card is special compared to the RTX 5090. Sure, it has more SRAM, but that is not a deciding factor. The same goes for power consumption - data centers get incentives on power.

If each Furiosa RNGD Gen 2 card costs $10k while an RTX 5090 costs $2k, and the RTX 5090 has better performance for LLMs, you have to be mad stupid, have a personal grudge against Nvidia, or just want to burn cash for no good reason to rack up your data centers with Furiosa.

The value of their company is going to diminish, and their next offer won't grow to $1.5 billion - it will actually be less than $800 million, since every year Nvidia, Intel, and other AI hardware startups introduce better and faster cards.

If Furiosa cards magically became cheaper than Nvidia's comparable hardware, Furiosa might be worth a quarter billion dollars. I highly doubt that will ever happen, because building AI compute on cutting-edge lithography is hella expensive and involves heavy politics.

rajhlinux commented on All Sources of DirectX 12 Documentation   asawicki.info/news_1794_a... · Posted by u/ibobev
rajhlinux · 2 months ago
No wonder cutting-edge LLMs themselves are so confused by the DX12 APIs... I guess I should stick with CUDA, which has such good documentation and support for everything, all in one place.

The only reason I need to use DX12 is that the swapchain uses DXGI textures. I hope Microsoft gives NVIDIA enough priority and leverage to let CUDA-native textures feed the swapchain directly on Windows.

DX12 functions ported directly into CUDA would be a big plus: the programming stack would sit right next to the NVIDIA hardware, and you could do everything in one language without messing with two-language interoperability.
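In the meantime, D3D12 resources can already be pulled into CUDA through the external-memory interop APIs, so at least the data stays on the GPU. Here's a minimal sketch - the RGBA8 format, the byte size, and the surrounding setup are my assumptions, and since swapchain back buffers can't be created shareable, in practice you'd share an intermediate texture and copy it into the back buffer:

    // Sketch: import a shareable D3D12 texture into CUDA so a kernel can
    // read/write it directly (no host round trip). Error checking omitted.
    #include <cuda_runtime.h>
    #include <d3d12.h>

    cudaMipmappedArray_t importD3D12Texture(ID3D12Device* device,
                                            ID3D12Resource* texture, // created with shared heap flags
                                            UINT width, UINT height)
    {
        // 1. Create an NT handle for the resource that CUDA can import.
        HANDLE sharedHandle = nullptr;
        device->CreateSharedHandle(texture, nullptr, GENERIC_ALL, nullptr, &sharedHandle);

        // 2. Import the D3D12 resource as CUDA external memory.
        cudaExternalMemoryHandleDesc memDesc = {};
        memDesc.type = cudaExternalMemoryHandleTypeD3D12Resource;
        memDesc.handle.win32.handle = sharedHandle;
        memDesc.size = size_t(width) * height * 4;   // assumes RGBA8; real size comes from the allocation info
        memDesc.flags = cudaExternalMemoryDedicated;
        cudaExternalMemory_t extMem = nullptr;
        cudaImportExternalMemory(&extMem, &memDesc);

        // 3. Map it as a CUDA array that kernels can sample or write via surfaces.
        cudaExternalMemoryMipmappedArrayDesc arrDesc = {};
        arrDesc.extent = make_cudaExtent(width, height, 0);
        arrDesc.formatDesc = cudaCreateChannelDesc<uchar4>();
        arrDesc.numLevels = 1;
        arrDesc.flags = cudaArraySurfaceLoadStore;
        cudaMipmappedArray_t mipArray = nullptr;
        cudaExternalMemoryGetMappedMipmappedArray(&mipArray, extMem, &arrDesc);
        return mipArray;
    }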

rajhlinux commented on DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via RL   arxiv.org/abs/2501.12948... · Posted by u/gradus_ad
leetharris · a year ago
I think there's likely lots of potential culprits. If the race is to make a machine god, states will pay countless billions for an advantage. Money won't mean anything once you enslave the machine god.

https://wccftech.com/nvidia-asks-super-micro-computer-smci-t...

rajhlinux · a year ago
Facts, them Chinese VCs will throw money to win.
rajhlinux commented on DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via RL   arxiv.org/abs/2501.12948... · Posted by u/gradus_ad
mrbungie · a year ago
Then the question becomes: who sold the GPUs to them? They are supposedly scarce, and every player in the field is trying to get hold of as many as they can - before anyone else, in fact.

Something about the accusations here makes little sense.

rajhlinux · a year ago
Man, they say China is the most populated country in the world; I'm sure they've got loopholes to grab a few thousand H100s.

They probably also trained the “copied” models by outsourcing it.

But who cares, it’s free and it works great.

rajhlinux commented on DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via RL   arxiv.org/abs/2501.12948... · Posted by u/gradus_ad
leetharris · a year ago
If we're going to play that card, couldn't we also use the "Chinese CEO has every reason to lie and say they did something 100x more efficient than the Americans" card?

I'm not even saying they did it maliciously, but maybe just to avoid scrutiny on GPUs they aren't technically supposed to have? I'm thinking out loud, not accusing anyone of anything.

rajhlinux · a year ago
Bro, did you use Deepseek? That shyt is better than ChatGPT. No cards being thrown here.
rajhlinux commented on DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via RL   arxiv.org/abs/2501.12948... · Posted by u/gradus_ad
leetharris · a year ago
CEO of Scale said Deepseek is lying and actually has a 50k GPU cluster. He said they lied in the paper because technically they aren't supposed to have them due to export laws.

I feel like this is very likely. They obviously made some great breakthroughs, but I doubt they were able to train on so much less hardware.

rajhlinux · a year ago
DeepSeek is indeed better than Mistral and ChatGPT. It has a tad more common sense. There is no way they did this on the "cheap". I'm sure they use loads of Nvidia GPUs, unless they are using custom-made hardware acceleration (that would be cool and easy to do).

As OP said, they are lying because of export laws - they aren't allowed to play with Nvidia GPUs.

However, I support the DeepSeek projects; I'm here in the US and able to benefit from them. Hopefully they headquarter in the States if they want the US chip sanctions lifted, since the company is China-based.

But as of now, DeepSeek takes the lead in LLMs - it's my go-to LLM.

Sam Altman should be worried, seriously. DeepSeek is legit better than ChatGPT's latest models.
