mk_stjames · a month ago
Whenever I see press on these new 'rack-scale' systems, the first thing I think is something along the lines of: "man, I hope the BIOS, OSes, and whatnot supporting these racks are robust and documented/open-sourced enough that 40 years from now, when you can buy an entire rack system for $500, some kid in a garage will be able to boot it and run code on it".
criemen · a month ago
What's the power hookup needed to boot just one rack? I'd imagine it's more than a single house gets anywhere in a residential area.
embedding-shape · a month ago
Hopefully in 40 years we'll all be running miniature cold fusion power or something, so we can avoid burning the planet to the ground.
MisterTea · a month ago
Depends on the residence. I have personally seen a large house in Brooklyn with dual 200 amp 120/208 volt three-phase services (two meters, each feeding a panel). I have also seen someone set up an old SGI rack-scale Origin 3000 system in their garage; I think they even had an electrician upgrade their service to accommodate it.
wmf · a month ago
170 kW
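
A rough back-of-the-envelope check on the residential question, assuming the 170 kW figure above and nominal US service ratings (the dual three-phase setup is the Brooklyn house described a few comments up):

```python
import math

# Can a residential electrical service boot one ~170 kW rack?
rack_kw = 170.0

# Typical US single-family service: 200 A split-phase at 240 V.
split_phase_kw = 200 * 240 / 1000                  # 48 kW

# Dual 200 A three-phase 120/208 V services (two meters, two panels):
# P = sqrt(3) * V_line * I per service.
per_service_kw = math.sqrt(3) * 208 * 200 / 1000   # ~72 kW each
dual_service_kw = 2 * per_service_kw               # ~144 kW

print(f"standard 200 A house service: {split_phase_kw:.0f} kW")
print(f"dual 3-phase house services:  {dual_service_kw:.0f} kW")
print(f"one rack:                     {rack_kw:.0f} kW")
# Even the unusually beefy dual-service house falls ~26 kW short of one rack.
```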
wmf · a month ago
The firmware is UEFI and Vera should have good upstream support. The GPU driver is proprietary though, so you'll have to dig up the last supported version from 2036.

wmf · a month ago
The blog post has more technical details and fewer quotes from customers: https://developer.nvidia.com/blog/inside-the-nvidia-rubin-pl...
mrandish · a month ago
That link was somewhat clearer, thanks.

As a software guy who follows chip evolution more at a macro level (new design + process node enabling better cores/tiles/units/clocks + new architecture enabling better caches, buses, I/O == better IPC, bandwidth, latency, and throughput at a given budget of cost, watts, heat, and space), I've yet to find anything grounded in those macro-but-concrete specs that gives a sense of Rubin's likely lift over the prior generation.

Edit: I found something a bit closer after scrolling down on a sub-link from the page you linked (https://developer.nvidia.com/blog/inside-the-nvidia-rubin-pl...).
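
For what it's worth, here's a crude sketch of that spec-ratio heuristic in Python. All the numbers are placeholders, not published figures; treat it as a template to fill in once concrete Rubin specs land:

```python
# A crude first-order estimator of generational lift from macro specs.
# All numbers below are PLACEHOLDERS, not published figures; swap in real
# specs (compute units, clocks, memory bandwidth) once NVIDIA posts them.

def estimated_lift(new, old, bandwidth_bound=True):
    """Naive spec-ratio model: compute-bound work scales with units * clock;
    bandwidth-bound work scales with memory bandwidth."""
    compute_ratio = (new["units"] * new["clock_ghz"]) / (old["units"] * old["clock_ghz"])
    bw_ratio = new["mem_bw"] / old["mem_bw"]
    return bw_ratio if bandwidth_bound else compute_ratio

blackwell = {"units": 1.0, "clock_ghz": 1.0, "mem_bw": 1.0}  # normalized baseline
rubin     = {"units": 1.6, "clock_ghz": 1.1, "mem_bw": 1.6}  # hypothetical ratios

print(f"compute-bound lift:   {estimated_lift(rubin, blackwell, False):.2f}x")
print(f"bandwidth-bound lift: {estimated_lift(rubin, blackwell, True):.2f}x")
```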

alecco · a month ago
For dev info we'll need to wait for GTC 2026 (March 16–19). CES is just hype.
wmf · a month ago
They're intentionally drip-feeding information over time until the actual release.
codyb · a month ago
If their new platform reduces inference token cost by 10x, does that play well or badly with the recently extended GPU depreciation schedules companies have been adopting to reduce projected cost outlays?

For context, my understanding is that companies have recently stretched their expected GPU depreciation schedules from 3 years to as long as 6, which has huge impacts on projected expenditures.

I wonder what the generational step was from the previous platform to Blackwell. Is this one slower, which might indicate the longer depreciation cycle is warranted, or faster?
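
A minimal sketch of why the schedule change matters so much, using straight-line depreciation and a made-up fleet cost (the $10B figure is purely illustrative):

```python
# Straight-line depreciation: why stretching 3 years to 6 flatters earnings.
# Fleet cost is a made-up illustrative number, not any company's actual spend.
fleet_cost_usd = 10_000_000_000  # hypothetical $10B GPU fleet

for useful_life_years in (3, 6):
    annual_expense = fleet_cost_usd / useful_life_years
    print(f"{useful_life_years}-year schedule: "
          f"${annual_expense / 1e9:.2f}B depreciation expense per year")

# 3-year: $3.33B/yr; 6-year: $1.67B/yr. Same cash out the door, but the
# longer schedule cuts reported annual expense in half. A 10x-cheaper
# successor arriving mid-schedule pressures that economic-life assumption.
```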

drexlspivey · a month ago
No way you throw away Blackwell GPUs after just 3 years. Google still runs 8-year-old TPUs at 100% utilization. Why would you depreciate them in just 3 years?
ryanmcgarvey · a month ago
The conversation around GPU lifecycles seems to be conflating the various shear rates within the data center. My layman understanding is that the old 3-year replacement cycle had more to do with some component, not necessarily the memory or the processor, failing in half of the units by year 3, at which point GPUs were cheap enough and advancing fast enough that it was more cost-effective to upgrade than to fix. That calculus changes completely when the GPU and the HBM are orders of magnitude more expensive than the rest of the system. I suspect we will see repairs done on the various brittle bits of the system while the actual core expensive components continue to operate much longer than 3 years.
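
A toy model of that repair-vs-replace calculus; all dollar figures and the failure rate are illustrative assumptions, not real BOM or reliability data:

```python
# Toy repair-vs-replace model: when the GPU + HBM dominate system cost,
# fixing the cheap brittle parts beats replacing the whole system.
gpu_hbm_cost = 30_000             # hypothetical: the expensive, long-lived core
other_parts_cost = 5_000          # hypothetical: NICs, fans, PSUs, boards
annual_failure_rate_other = 0.15  # assume 15%/yr of the cheap parts fail

def expected_repair_cost(years: int) -> float:
    """Expected spend keeping one system alive, repairing only cheap parts."""
    return years * annual_failure_rate_other * other_parts_cost

for years in (3, 6):
    repair = expected_repair_cost(years)
    replace = gpu_hbm_cost + other_parts_cost  # buying a whole new system
    print(f"{years} yrs: expected repairs ${repair:,.0f} "
          f"vs full replacement ${replace:,.0f}")
# Even 6 years of repairs (~$4.5k) is a fraction of a ~$35k replacement,
# so the economics favor running the core silicon well past 3 years.
```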
UltraSane · a month ago
Companies are playing games with GPU depreciation.
cmxch · a month ago
The only thing learned from structured finance was to lock regular people out.
causal · a month ago
Unsure why you were downvoted; I'm curious to understand this comment. I presume you mean playing finance and accounting games?
m3kw9 · a month ago
But the number of tokens required for quality generation may increase just as much very soon.
codyb · a month ago
Yea, definitely a good point. It's going to be interesting to see how it plays out. I don't have the expertise to answer the question myself.
Animats · a month ago
Their own CPU, too - 88 ARM cores.

So it's an all-NVIDIA solution - CPU, interconnects, AI GPUs.

tibbydudeza · a month ago
Afaik MediaTek helped them with the CPU part.
TSiege · a month ago
Extreme Codesign Across NVIDIA Vera CPU, Rubin GPU, NVLink 6 Switch, ConnectX-9 SuperNIC, BlueField-4 DPU and Spectrum-6 Ethernet Switch Slashes Training Time and Inference Token Generation Cost

Technical details available here https://developer.nvidia.com/blog/inside-the-nvidia-rubin-pl...

Groxx · a month ago
... it took a couple of searches to figure out that "extreme codesign" wasn't actually code-signing, but "co-design", like "stuff that was designed to work together"
utopiah · a month ago
Even << "co-design" like "stuff that was designed to work together" >> sounds strange to me. Typically when I read about co-design, it's stuff that was designed together, by more than one party.
pyuser583 · a month ago
Me too. Good style says to avoid creating words with hyphens - it's Un-American. But clarity matters more than rules.
gilrain · a month ago
Is there any American style guide that insists hyphens be avoided even when a closed compound would cause ambiguity? I follow Chicago, but I imagine other style guides also already emphasise clarity.
mortehu · a month ago
Wouldn't "code sign" be two words in English? And "code signing" rather than "code sign"?
alfalfasprout · a month ago
Same, I was so confused.
exacube · a month ago
Does anyone know how well this 5x petaflop improvement translates to real-world performance?

I know that memory bandwidth tends to be a big limiting factor, but I'm trying to understand how that factors into overall perf compared to Blackwell.
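
One common way to reason about this is the roofline model: attainable throughput is the minimum of the compute peak and memory bandwidth times arithmetic intensity. A minimal sketch with illustrative numbers (not actual Rubin or Blackwell specs):

```python
# Roofline sketch: peak petaflops only help if the workload's arithmetic
# intensity is high enough that memory bandwidth isn't the bottleneck.
# The peak/bandwidth numbers below are illustrative, not real specs.

def attainable_pflops(peak_pflops: float, mem_bw_tbs: float,
                      flops_per_byte: float) -> float:
    """Classic roofline: min(compute peak, bandwidth * arithmetic intensity)."""
    bandwidth_bound = mem_bw_tbs * flops_per_byte / 1000  # TB/s * FLOP/B -> PFLOPS
    return min(peak_pflops, bandwidth_bound)

peak, bw = 50.0, 20.0  # hypothetical: 50 PFLOPS peak, 20 TB/s HBM
for intensity in (10, 100, 1000, 5000):  # FLOP performed per byte moved
    print(f"intensity {intensity:5d} FLOP/B -> "
          f"{attainable_pflops(peak, bw, intensity):6.1f} PFLOPS")
# Low-intensity phases (e.g. memory-bound decode in inference) sit on the
# bandwidth line, not the 5x compute headline; high-intensity GEMMs see peak.
```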

2OEH8eoCRo0 · a month ago
Rebuild all the data centers!
metalliqaz · a month ago
lol, they haven't even started building half the Blackwell datacenters yet