I personally find it "sort of" funny how nvidia is caught between what they need for gamers and what they need for AI, and how they're pulling every trick to keep their gamer cards from eating into their "AI enthusiast" card sales.
Absurd pricing, ridiculous VRAM offerings; I'm sure they're trying very hard to find a way to stop AI workloads like SD or LLMs from running on their gamer cards at this point.
It's reached a point where not only has ATI/AMD essentially caught up to them in rasterization, but they're frankly a better offer at every price point against the 4XXX generation for pure gaming, with only DLSS and brand recognition keeping nvidia ahead.
AMD is still lagging on software though. I wish they would get their act together, but their drivers are just not anywhere near as stable or compatible as Nvidia's. And then there's Intel, which makes you appreciate the state of AMD's...
The 4090 wins in every benchmark for 1/3rd the price. Why would anybody buy this card? Is 8 GB more VRAM and lower power consumption really worth that much when the performance is so lackluster?
You know, at first I was thinking the same thing, but after having two 3090s raise the temperature of my bedroom to 86 degrees in the middle of the night last week while fine-tuning an LLM, I could see the draw of 64GB (across two cards) putting out 400 watts less total heat than a pair of 4090s in my work space.
They claim PCIe 4.0 makes it irrelevant, but that doesn't really hold up; more likely they just want to charge a fortune for their high-memory options.
It is if you can pool the memory. It's easier than having to split the model up in software (though that's a somewhat solved problem), and from what I know it allows higher GPU utilization on both cards since they don't have to wait for data to pass back and forth.
It is barely relevant to big players, but it's extremely valuable for small players: distributing your workload across multiple GPUs manually is not that simple, and there are a lot of more interesting/important problems to solve than shoving your model onto a GPU.
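For a sense of what "manually" means here, this is roughly the shape of it in PyTorch - just a sketch, with made-up layer sizes and a naive two-way split, not anything from the article:

    # Rough sketch of splitting a model across two GPUs by hand (layer sizes
    # and the two-way split are made up for illustration).
    import torch
    import torch.nn as nn

    class TwoGpuModel(nn.Module):
        def __init__(self):
            super().__init__()
            # First half of the network lives on GPU 0, second half on GPU 1.
            self.part1 = nn.Sequential(nn.Linear(4096, 4096), nn.ReLU()).to("cuda:0")
            self.part2 = nn.Sequential(nn.Linear(4096, 4096), nn.ReLU()).to("cuda:1")

        def forward(self, x):
            x = self.part1(x.to("cuda:0"))
            # Activations cross PCIe here on every forward pass; one GPU idles
            # while the other works, which is the utilization hit mentioned above.
            x = x.to("cuda:1")
            return self.part2(x)

    model = TwoGpuModel()
    out = model(torch.randn(8, 4096))

Tools like Hugging Face Accelerate (device_map="auto") or DeepSpeed automate the placement, which is the "somewhat solved" part, but without NVLink every forward pass still pays for that cross-device copy.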
Why do they have this naming… it's just insane. RTX 5000 Ada… do they just put letters in front or names at the back these days? So confusing, and the consumer cards will also be RTX 50xx.
Was about to post something questioning the "forever" part, because my memory only starts to link Nvidia generations with scientists somewhere around Kepler - and that's despite having followed GPU tech a lot more closely in the years before. But according to Wikipedia it goes back to the days of the Riva TNT: the wiki seems undecided about the Fahrenheit-ness of earlier generations, but I'd consider that close enough for "forever".
I believe the issue with Lovelace is that you may find less-than-PG results typing that into a search engine. Hence using Ada primarily in the marketing.
I think the complaint is more that the consumer cards are 4xxx while this is 5000, both on the same architecture.
Yes but they historically do not put the architecture name in the product name like that. They also never use the naming scheme of their consumer graphics cards on their workstation line. They've always had a different system. This card breaks both of those conventions. Workstation cards are supposed to be Quadro. But it looks like they've rebranded the line as "Nvidia RTX". I can only assume that was an intentional move to make their lineup more confusing.
NVLink is essential for training large neural networks, which is where Nvidia now earns the majority of its revenue. Sales of their more expensive GPUs would suffer if they put NVLink in the cheaper ones.
The issue is that this has caused severe shortages. The only new card with NVLink is at the very highest end, and when I tried to get a quote recently I was told there was a 13-month shipping delay. If I don't need NVLink, it's just a few months.
At this pricing level, with this amount of RAM, I suspect a lot of use cases will be with ML and GenAI. Benchmarks for these use cases would have been interesting.
It is 20-30% slower than a 4090 or H100 in compute; the only improvement is slightly more RAM. This card is not for ML (on purpose) - it is for more enterprise-ish tasks: advanced video streaming/rendering, virtualization, etc.
It's for some ML tasks. Just not large language models.
If you're making, say, an ML-based on-premise CCTV system and you need to run several large ResNets at the same time? And you don't want to go rack-mounted, as some sites don't have a data centre? And you want the longer lifecycle and guaranteed spare parts availability of an enterprise product line? This could be the card for you.
Admittedly it's a rip-off, but the Workstation/Quadro line always has been.
Honestly I'm not sure how healthy the workstation market is right now - with the rise of work-from-home and hybrid working, I don't see many people using huge desktops any more. And when Adobe puts a powerful generative AI feature into Photoshop, they don't expect users to upgrade to powerful GPUs - they run it in the cloud, so it works for users with puny GPUs and Adobe can get that sweet sweet recurring revenue.
To be honest, the best benchmark you can run is your own training code. Everything else is a guess.
When I tested the A6000 against the H100, there wasn’t that big of a boost from the newer card. Perhaps GPU operations weren’t the bottleneck in that case.
> To be honest, the best benchmark you can run is your own training code. Everything else is a guess.
Yes, but the point of a review with benchmarks is that it's expensive and time-consuming for a customer to acquire the hardware just to run their own benchmarks.
Stable Diffusion and various LLMs are available pretty easily.
A simple benchmark - this version of Stable Diffusion or this LLM, run with these settings, took this long to produce an image / gave this many tokens per second - would be a nice comparison, and with access to all the hardware you're in a good position to do it.
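Even something as rough as this would do (the model id, prompt, and generation settings below are placeholders, not a suggestion of what you actually ran):

    # Crude tokens/sec measurement for an LLM on a single GPU.
    import time
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "mistralai/Mistral-7B-v0.1"  # placeholder model
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16
    ).to("cuda:0")

    inputs = tok("Explain NVLink in one paragraph.", return_tensors="pt").to("cuda:0")
    start = time.time()
    out = model.generate(**inputs, max_new_tokens=256, do_sample=False)
    elapsed = time.time() - start

    new_tokens = out.shape[-1] - inputs["input_ids"].shape[-1]
    print(f"{new_tokens / elapsed:.1f} tokens/sec")

Print that number for this card next to a 4090 and an A6000 and the comparison basically writes itself.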
If 300M people in the US, let alone 7B in the world, need even a 4060 to run basic business workloads, NVidia is sitting pretty.
"fewer", not "less".
(Not saying you shouldn't game on a workstation, but a workstation card will be worse at it especially for its price)
I guess what's unique with Ada is that they're using her first name? Though most official sources call it Ada Lovelace in full.
They keep removing features