Abishek_Muthian · 8 months ago
I'm looking at my Jetson Nano in the corner, which is fulfilling its post-retirement role as a paperweight because Nvidia abandoned it within 4 years.

The Nvidia Jetson Nano, an SBC for "AI", debuted with an already-aging custom Ubuntu 18.04, and when 18.04 went EOL, Nvidia abandoned it completely, with no further updates to its proprietary JetPack or drivers; without them, the whole machine learning stack (CUDA, PyTorch, etc.) became useless.

I'll never buy an SBC from Nvidia unless all the software support is upstreamed to the Linux kernel.

lolinder · 8 months ago
This is a very important point.

In general, Nvidia's relationship with Linux has been... complicated. On the one hand, at least they offer drivers for it. On the other, I have found few more reliable ways to irreparably break a Linux installation than trying to install or upgrade those drivers. They don't seem to treat it as a first-class citizen, more just tolerate it to the bare minimum required to claim it works.

dotancohen · 8 months ago

  > Nvidia's relationship with Linux has been... complicated.
For those unfamiliar with Linus Torvalds' two-word opinion of Nvidia:

https://youtube.com/watch?v=OF_5EKNX0Eg

stabbles · 8 months ago
Now that the majority of their revenue is from data centers instead of Windows gaming PCs, you'd think their relationship with Linux would improve, or already has.
FuriouslyAdrift · 8 months ago
The Digits device runs the same Nvidia DGX OS (Nvidia's custom Ubuntu distro) that they run on their cloud infra.
vladslav · 8 months ago
I've had a similar experience: my Xavier NX stopped working after the last update and now it's just collecting dust. To be honest, I've found Nvidia SBCs to be more of a hassle than they're worth.
busterarm · 8 months ago
Xavier AGX owner here to report the same.
aseipp · 8 months ago
The Orin series and later use UEFI, and you can apparently run upstream, non-GPU-enabled kernels on them; there's a user guide page documenting it. So I think it's gotten a lot better, but it's sort of moot, because the non-GPU caveat exists because the JetPack Linux fork has a specific 'nvgpu' driver used for Tegra devices that hasn't been unforked from that tree. So you can buy better alternatives unless you're explicitly doing the robotics + AI edge inference stuff.

But the impression I get from this device is that it's closer in spirit to the Grace Hopper/datacenter designs than to the Tegra designs, due to the naming, the design (DGX style), and the software (DGX OS?) that goes on their workstation/server designs. They are also UEFI, and in those scenarios you can (I believe?) use the upstream Linux kernel with the open-source Nvidia driver on whatever distro you like. In that case, this would be a much more "familiar" machine with a much more ordinary Linux experience. But who knows. Maybe GH200/GB200 need custom patches, too.

Time will tell, but if this is a good GPU paired with a good ARM Cortex design, and it works more like a traditional Linux box than the Jetson series, it may be a great local AI inference machine.

moondev · 8 months ago
AGX also has UEFI firmware which allows you to install ESXi. Then you can install any generic EFI arm64 iso in a VM with no problems, including windows.
halJordan · 8 months ago
It runs their DGX OS, and Jensen specifically said it would be a full part of their HW stack.
startupsfail · 8 months ago
If this is DGX OS, then yes, this is what you’ll find installed on their 4-card workstations.

This is more like a micro-DGX, then, for $3k.

yoyohello13 · 8 months ago
And unless there is some expanded maintenance going on, 22.04 is EOL in 2 years. In my experience, vendors are not as on top of security patches as upstream. We will see, but given NVIDIA's closed ecosystem, I don't have high hopes that this will be supported long term.
saidinesh5 · 8 months ago
Is there any recent, powerful SBC with fully upstream kernel support?

I can only think of raspberry pi...

sliken · 8 months ago
rk3588 is pretty close. I believe it's usable today, just missing a few corner cases with HDMI or some such; I believe the last patches are either pending or already applied to an RC.
shadowpho · 8 months ago
Radxa, but that’s N100, aka x64.
msh · 8 months ago
The Odroid H series. But that packs an x86 CPU.
nickpsecurity · 8 months ago
If its stack still works, you might be able to sell or donate it to a student experimenting. They can still learn quite a few things with it. Maybe even use it for something.
sangnoir · 8 months ago
Using outdated TensorFlow (v1 from 2018) or outdated PyTorch makes learning harder than it needs to be, considering most resources online use much newer versions of the frameworks. If you're learning the fundamentals, working from first principles and creating the building blocks yourself, then it adds to the experience. However, most people just want to build different types of nets, and it's hard to do that when the code won't work for you.
tcdent · 8 months ago
If you're expecting this device to stay relevant for 4 years you are not the target demographic.

Compute is evolving way too rapidly to be setting-and-forgetting anything at the moment.

tempoponet · 8 months ago
Today I'm using 2x 3090s, which are over 4 years old at this point and still very usable. To get 48GB of VRAM I would need 3x 5070 Ti - still over $2k.

In 4 years, you'll be able to combine 2 of these to get 256GB of unified memory. I expect that to have many uses and still be in a favorable form factor and price.
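
Rough math behind that comparison, as a sketch; the per-card prices and the 16GB figure for the 5070 Ti are assumptions based on announced MSRPs, not numbers from the thread:

  # Back-of-the-envelope VRAM/price totals; card specs and prices here are assumptions.
  setups = {
      "2x RTX 3090 (used)": (2, 24, 700),    # units, GB each, assumed ~$700 used price
      "3x RTX 5070 Ti":     (3, 16, 749),    # assumed 16GB / $749 MSRP per card
      "2x Project Digits":  (2, 128, 3000),  # 128GB unified memory, $3k each
  }
  for name, (units, gb, usd) in setups.items():
      print(f"{name}: {units * gb} GB for ~${units * usd:,}")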

mrybczyn · 8 months ago
Eh? By all indications compute is now evolving SLOWER than ever. Moore's Law is dead, Dennard scaling is over, the latest fab nodes are evolutionary rather than revolutionary.

This isn't the 80s when compute doubled every 9 months, mostly on clock scaling.

Karupan · 8 months ago
I feel this is bigger than the 5x series GPUs. Given the craze around AI/LLMs, this can also potentially eat into Apple’s slice of the enthusiast AI dev segment once the M4 Max/Ultra Mac minis are released. I sure wished I held some Nvidia stocks, they seem to be doing everything right in the last few years!
rbanffy · 8 months ago
This is something every company should make sure they have: an onboarding path.

Xeon Phi failed for a number of reasons, but one place it didn't need to fail was the availability of software optimised for it. Now we have Xeons and EPYCs, and MI300Cs with lots of efficient cores, but we could have been writing software tailored for those for 10 years now. Extracting performance from them would be a solved problem at this point. The same applies to Itanium - the very first thing Intel should have made sure of was good Linux support. They could have had it before the first silicon was released. Itanium was well supported for a while, but it's long dead by now.

Similarly, Sun failed with SPARC, which also didn't have an easy onboarding path after they gave up on workstations. They did some things right: OpenSolaris ensured the OS remained relevant (it still is, even if a bit niche), and looking the other way on x86 Solaris helped people learn and train on it. Oracle Cloud could, at least, offer it on cloud instances. Would be nice.

Now we see IBM doing the same - there is no reasonable entry level POWER machine that can compete in performance with a workstation-class x86. There is a small half-rack machine that can be mounted on a deskside case, and that's it. I don't know of any company that's planning to deploy new systems on AIX (much less IBMi, which is also POWER), or even for Linux on POWER, because it's just too easy to build it on other, competing platforms. You can get AIX, IBMi and even IBMz cloud instances from IBM cloud, but it's not easy (and I never found a "from-zero-to-ssh-or-5250-or-3270" tutorial for them). I wonder if it's even possible. You can get Linux on Z instances, but there doesn't seem to be a way to get Linux on POWER. At least not from them (several HPC research labs still offer those).

nimish · 8 months ago
1000% all these ai hardware companies will fail if they don't have this. You must have a cheap way to experiment and develop. Even if you want to only sell a $30000 datacenter card you still need a very low cost way to play.

Sad to see that big companies like Intel and AMD don't understand this, but they've never come to terms with the fact that software killed the hardware star.

AtlasBarfed · 8 months ago
It really mystifies me that Intel, AMD, and other hardware companies (obviously Nvidia in this case) don't either have a consortium or each have their own in-house Linux distribution with excellent support.

Windows has always been a barrier to hardware feature adoption for Intel. You had to wait 2 to 3 years, sometimes longer, for Windows to get around to providing hardware support.

Any OS optimizations in Windows had to go through Microsoft. So say you added some instructions, custom silicon or whatever to speed up enterprise databases, or provided high-speed networking that needed some special kernel features, etc. - there was always Microsoft in the way.

Not just in the foot-dragging communication; it's a getting-the-tech-people-aligned problem.

Microsoft would look at every single change and decide whether or not it would challenge their monopoly, whether or not it was in their business interest, whether or not it kept you, the hardware vendor, in a subservient role.

p_ing · 8 months ago
Raptor Computing provides POWER9 workstations. They're not cheap, still use last-gen hardware (DDR4/PCIe 4 ... and POWER9 itself) but they're out there.

https://www.raptorcs.com/content/base/products.html

UncleOxidant · 8 months ago
There were Phi cards, but they were pricey and power hungry (at the time, now current GPU cards probably meet or exceed the Phi card's power consumption) for plugging into your home PC. A few years back there was a big fire sale on Phi cards - you could pick one up for like $200. But by then nobody cared.
sheepscreek · 8 months ago
The developers they are referring to aren’t just enthusiasts; they are also developers who were purchasing SuperMicro and Lambda PCs to develop models for their employers. Many enterprises will buy these for local development because it frees up the highly expensive enterprise-level chip for commercial use.

This is a genius move. I am more baffled by the insane form factor that can pack this much power inside a Mac Mini-esque body. For just $6000, two of these can run 400B+ models locally. That is absolutely bonkers. Imagine running ChatGPT on your desktop. You couldn’t dream about this stuff even 1 year ago. What a time to be alive!

HarHarVeryFunny · 8 months ago
The 1 PetaFLOP and 200B-parameter model capacity specs are for FP4 (4-bit floating point), which means inference, not training/development. It'd still be a decent personal development machine, but not for that size of model.
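
A quick sanity check on why the precision matters for fitting a model that size in 128GB (rough weight-only numbers, ignoring KV cache and activations):

  # Approximate weight memory for a 200B-parameter model at different precisions.
  params = 200e9
  for name, bytes_per_param in [("FP16", 2), ("FP8", 1), ("FP4", 0.5)]:
      print(f"{name}: {params * bytes_per_param / 1e9:.0f} GB of weights")
  # FP16 ~400 GB, FP8 ~200 GB, FP4 ~100 GB -- only FP4 fits under 128 GB.
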
numba888 · 8 months ago
This looks like a bigger brother of Orin AGX, which has 64GB of RAM and runs smaller LLMs. The question will be power and performance vs 5090. We know price is 1.5x
stogot · 8 months ago
How does it run 400B models across two? I didn’t see that in the article
dagmx · 8 months ago
I think the enthusiast side of things is a negligible part of the market.

That said, enthusiasts do help drive a lot of the improvements to the tech stack so if they start using this, it’ll entrench NVIDIA even more.

Karupan · 8 months ago
I’m not so sure it’s negligible. My anecdotal experience is that since Apple Silicon chips were found to be “ok” enough to run inference with MLX, more non-technical people in my circle have asked me how they can run LLMs on their macs.

Surely a smaller market than gamers or datacenters, though.

qwertox · 8 months ago
You could have said the same about gamers buying expensive hardware in the 00's. It's what made Nvidia big.
gr3ml1n · 8 months ago
AMD thought the enthusiast side of things was a negligible side of the market.
epolanski · 8 months ago
If this is gonna be widely used by ML engineers, in biopharma, etc., and they land $1,000 margins at half a million sales, that's half a billion in revenue, with potential to grow.
option · 8 months ago
today’s enthusiast, grad student, or hacker is tomorrow’s startup founder, CEO, CTO, or 10x contributor at a large tech company
VikingCoder · 8 months ago
If I were NVidia, I would be throwing everything I could at making entertainment experiences that need one of these to run...

I mean, this is awfully close to being "Her" in a box, right?

computably · 8 months ago
Yeah, it's more about preempting competitors from attracting any ecosystem development than the revenue itself.
bloomingkales · 8 months ago
Jensen did say in a recent interview, paraphrasing, “they are trying to kill my company”.

Those Macs with unified memory are a threat he is immediately addressing. Jensen is a wartime CEO from the looks of it; he's not joking.

No wonder AMD is staying out of the high-end space, since NVIDIA is going head-on with Apple (and AMD is not in the business of competing with Apple).

T-A · 8 months ago
From https://www.tomshardware.com/pc-components/cpus/amds-beastly...

The fire-breathing 120W Zen 5-powered flagship Ryzen AI Max+ 395 comes packing 16 CPU cores and 32 threads paired with 40 RDNA 3.5 (Radeon 8060S) integrated graphics cores (CUs), but perhaps more importantly, it supports up to 128GB of memory that is shared among the CPU, GPU, and XDNA 2 NPU AI engines. The memory can also be carved up to a distinct pool dedicated to the GPU only, thus delivering an astounding 256 GB/s of memory throughput that unlocks incredible performance in memory capacity-constrained AI workloads (details below). AMD says this delivers groundbreaking capabilities for thin-and-light laptops and mini workstations, particularly in AI workloads. The company also shared plenty of gaming and content creation benchmarks.

[...]

AMD also shared some rather impressive results showing a Llama 70B Nemotron LLM AI model running on both the Ryzen AI Max+ 395 with 128GB of total system RAM (32GB for the CPU, 96GB allocated to the GPU) and a desktop Nvidia GeForce RTX 4090 with 24GB of VRAM (details of the setups in the slide below). AMD says the AI Max+ 395 delivers up to 2.2X the tokens/second performance of the desktop RTX 4090 card, but the company didn’t share time-to-first-token benchmarks.

Perhaps more importantly, AMD claims to do this at an 87% lower TDP than the 450W RTX 4090, with the AI Max+ running at a mere 55W. That implies that systems built on this platform will have exceptional power efficiency metrics in AI workloads.

nomel · 8 months ago
> since NVIDIA is going head on with Apple

I think this is a race that Apple doesn't know it's part of. Apple has something that happens to work well for AI, as a side effect of having a nice GPU with lots of fast shared memory. It's not marketed for inference.

JoshTko · 8 months ago
Which interview was this?
hkgjjgjfjfjfjf · 8 months ago
You missed the Ryzen AI Max+ Pro 395 product announcement
llm_trw · 8 months ago
From the people I talk to, the enthusiast market is Nvidia 4090/3090-saturated, because people want to do their fine-tunes (and also porn) in their off time. The Venn diagram of users who post about diffusion models and LLMs running at home is pretty much a circle.
dist-epoch · 8 months ago
Not your weights, not your waifu
Tostino · 8 months ago
Yeah, I really don't think the overlap is as big as you imagine. At least in /r/localllama and the Discord servers I frequent, the vast majority of users are primarily interested in one or the other, and may just dabble with other things. Obviously this is just my observation... I could be totally misreading things.
numba888 · 8 months ago
> I sure wished I held some Nvidia stocks, they seem to be doing everything right in the last few years!

They were propelled by the unexpected LLM boom. But plan 'A' was robotics, in which Nvidia invested a lot over decades. I think its time is about to come, with Tesla's humanoids at $20-30k and Chinese ones already selling for $16k.

qwertox · 8 months ago
This is somewhat similar to what GeForce was to gamers back in the day, but for AI enthusiasts. Sure, the price is much higher, but at least it's a completely integrated solution.
Karupan · 8 months ago
Yep that's what I'm thinking as well. I was going to buy a 5090 mainly to play around with LLM code generation, but this is a worthy option for roughly the same price as building a new PC with a 5090.
trhway · 8 months ago
>enthusiast AI dev segment

I think it isn't about enthusiasts. To me it looks like Huang/NVDA is pushing a small revolution further, using the opening provided by the AI wave - up until now the GPU was an add-on to the general computing core, onto which that core offloaded some computing. With AI, that offloaded computing becomes de facto the main computing, and Huang/NVDA is turning the tables by making the CPU just a small add-on to the GPU, with some general computing offloaded to that CPU.

The CPU being located that "close", with unified memory, would stimulate parallelization of a lot of general computing so that it runs on the GPU, very fast, instead of on the CPU. Take a classic of enterprise computing - databases, the SQL ones: a lot of what these databases do (if not, with some work, everything) can be executed on the GPU with a significant performance gain vs. the CPU. Why isn't it happening today? Load/unload onto the GPU eats into performance, the complexity of having only some operations offloaded to the GPU is very high in dev effort, etc. Streamlined development on a platform with unified memory will change that. That way Huang/NVDA may pull the rug out from under CPU-first platforms like AMD/INTC and own both new AI computing and a significant share of classic enterprise computing.
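
As a toy illustration of the kind of offload being described - a minimal sketch assuming CuPy and a CUDA-capable GPU; a real SQL engine is of course far more than a filter and a sum:

  # "SELECT SUM(amount) WHERE region = 3" done once on the CPU and once on the GPU.
  import numpy as np
  import cupy as cp  # assumes CuPy is installed and a CUDA GPU is present

  rng = np.random.default_rng(0)
  amount = rng.random(10_000_000, dtype=np.float32)          # toy column of values
  region = rng.integers(0, 10, 10_000_000, dtype=np.int32)   # toy column of region ids

  cpu_total = amount[region == 3].sum()                      # CPU version

  # GPU version: today this needs explicit host-to-device copies, which is exactly
  # the load/unload overhead mentioned above; unified memory removes that step.
  g_amount, g_region = cp.asarray(amount), cp.asarray(region)
  gpu_total = float(g_amount[g_region == 3].sum())

  print(cpu_total, gpu_total)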

tatersolid · 8 months ago
> these databases can be executed on GPU with a significant performance gain vs. CPU

No, they can’t. GPU databases are niche products with severe limitations.

GPUs are fast at massively parallel math problems; they aren't useful for all tasks.

tarsinge · 8 months ago
> I sure wished I held some Nvidia stocks

I’m so tired of this recent obsession with the stock market. Now that retail is deeply invested, it is tainting everything, like here on a technology forum. I don’t remember people mentioning Apple stock every time Steve Jobs made an announcement in past decades. Nowadays it seems everyone is invested in Nvidia and just wants the stock to go up, and every product announcement is a means to that end. I really hope we get a crash so that we can get back to a more sane relationship with companies and their products.

lioeters · 8 months ago
> hope we get a crash

That's the best time to buy. ;)

paxys · 8 months ago
“Bigger” in what sense? For AI? Sure, because this is an AI product. The 5x series are gaming cards.
a________d · 8 months ago
Not expecting this to compete with the 5x series in terms of gaming, but it's interesting to note that the increase in gaming performance Jensen was speaking about with Blackwell was largely related to inferenced frames generated by the tensor cores.

I wonder how it would go as a productivity/tinkering/gaming rig? Could a GPU potentially be stacked in the same way an additional Digit can?

Karupan · 8 months ago
Bigger in the sense of the announcements.
AuryGlenz · 8 months ago
Eh. Gaming cards, but also significantly faster. If the model fits in the VRAM the 5090 is a much better buy.
GaryNumanVevo · 8 months ago
I bet $100k on NVIDIA stocks ~7 years ago, just recently closed out a bunch of them
axegon_ · 8 months ago
> they seem to be doing everything right in the last few years

About that... Not like there isn't a lot to be desired from the linux drivers: I'm running a K80 and M40 in a workstation at home and the thought of having to ever touch the drivers, now that the system is operational, terrifies me. It is by far the biggest "don't fix it if it ain't broke" thing in my life.

sliken · 8 months ago
Use a filesystem that snapshots AND do a complete backup.
mycall · 8 months ago
Buy a second system which you can touch?
technofiend · 8 months ago
Will there really be a Mac mini with Max or Ultra CPUs? This feels like somewhat of an overlap with the Mac Studio.
adolph · 8 months ago
There will undoubtedly be a Mac Studio (and Mac Pro?) bump to M4 at some point. Benchmarks [0] reflect how memory bandwidth and core count [1] compare to processor improvements. Granted, YMMV depending on your workload.

0. https://www.macstadium.com/blog/m4-mac-mini-review

1. https://www.apple.com/mac/compare/?modelList=Mac-mini-M4,Mac...

wslh · 8 months ago
The Nvidia price (USD 3k) is closer to a top Mac mini, but I trust Apple more than Nvidia for end-to-end support from hardware to apps. Not an Apple fanboy, but a user/dev, and I don't think we realize what Apple really achieved, industrially speaking. The M1 was launched in late 2020.
croes · 8 months ago
Did they say anything about power consumption?

Apple M chips are pretty efficient.

behringer · 8 months ago
Not only that, but it should help free up the gpus for the gamers.
puppymaster · 8 months ago
It eats into all of NVDA's consumer-facing clients, no? I can see why OpenAI etc. are looking for alternative hardware solutions to train their next model.
iKevinShah · 8 months ago
I can confirm this is the case (for me).
informal007 · 8 months ago
I would like to have a Mac as my personal computer and Digits as a service to run LLMs.
csomar · 8 months ago
Am I the only one disappointed by these? They cost roughly half the price of a MacBook Pro and offer, hmm... half the capacity in RAM. Sure, speed matters in AI, but what do I do with speed when I can't load a 70B model?

On the other hand, with a $5000 MacBook Pro, I can easily load a 70B model and have a "full" MacBook Pro as a plus. I am not sure I fully understand the value of these cards for someone who wants to run personal AI models.

gnabgib · 8 months ago
Are you, perhaps, commenting on the wrong thread? Project Digits is a $3k 128GB computer... the best your $5K MBP can have for RAM is... 128GB.
rictic · 8 months ago
Hm? They have 128GB of RAM. Macbook Pros cap out at 128GB as well. Will be interesting to see how a Project Digits machine performs in terms of inference speed.
macawfish · 8 months ago
Then buy two and stack them!

Also, I'm unfamiliar with Macs - is there really a MacBook Pro with 256GB of RAM?

maniroo · 8 months ago
Bro, we can connect two Project Digits as well. I was only looking at the M4 MacBook because of the 128GB unified memory. Now this beast can cook better LLMs at just $3K, with a 4TB SSD too. An M4 Max MacBook (128GB unified RAM and 4TB storage) is $5,999. So, no more Apple for me. I will just get the Digits, and can create a workstation as well.
doctorpangloss · 8 months ago
What slice?

Also, macOS devices are not very good inference solutions. They are just believed to be by diehards.

I don't think Digits will perform well either.

If NVIDIA wanted you to have good performance on a budget, it would ship NVLink on the 5090.

Karupan · 8 months ago
They are perfectly fine for certain people. I can run Qwen2.5-Coder 14B on my M2 Max MacBook Pro with 32GB at ~16 tok/sec. At least in my circle, people are budget conscious and would prefer using existing devices rather than pay for subscriptions where possible.

And we know why they won't ship NVLink anymore on prosumer GPUs: they control almost the entire segment and why give more away for free? Good for the company and investors, bad for us consumers.

YetAnotherNick · 8 months ago
> Also, macOS devices are not very good inference solutions

They are good for single-batch inference and have very good tok/sec/user. ollama works perfectly on a Mac.
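
For what it's worth, scripting against a local model really is low-friction; a minimal sketch assuming an ollama server on its default port and an already-pulled model (the model name here is just an example):

  # Minimal call to a local ollama server's REST API (default http://localhost:11434).
  import json
  import urllib.request

  payload = {
      "model": "qwen2.5-coder:14b",   # example; substitute whatever model you have pulled
      "prompt": "Write a Python one-liner that reverses a string.",
      "stream": False,
  }
  req = urllib.request.Request(
      "http://localhost:11434/api/generate",
      data=json.dumps(payload).encode("utf-8"),
      headers={"Content-Type": "application/json"},
  )
  with urllib.request.urlopen(req) as resp:
      print(json.loads(resp.read())["response"])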

narrator · 8 months ago
Nvidia releases a Linux desktop supercomputer that's better price/performance wise than anything Wintel is doing and their whole new software stack will only run on WSL2. They aren't porting to Win32. Wow, it may actually be the year of Linux on the Desktop.
sliken · 8 months ago
Not sure how to judge better price/perf. I wouldn't expect 20 Neoverse N2 cores to do particularly well vs 16 Zen 5 cores. The GPU side looks promising, but they aren't mentioning memory bandwidth, configuration, spec, or performance.

Did see vague claims of "starting at $3k", max 4TB NVMe, and max 128GB RAM.

I'd expect AMD Strix Halo (AI Max+ 395) to be reasonably competitive.

skavi · 8 months ago
It’s actually “10 Arm Cortex-X925 and 10 Cortex-A725” [0]. These are much newer cores and have a reasonable chance of being competitive.

[0]: https://newsroom.arm.com/blog/arm-nvidia-project-digits-high...

z4y5f3 · 8 months ago
NVIDIA is likely citing 1 PFLOPS at FP4 sparse (they did this for GB200), so that is 128 TFLOPS BF16 dense, or 2/3 of what an RTX 4090 is capable of. I would put the memory bandwidth at 546 GB/s, using the same 512-bit LPDDR5X-8533 as the Apple M4 Max.
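
The arithmetic behind those two estimates, spelled out (the sparsity/precision halvings and the memory configuration are assumptions, not published specs):

  # FLOPS: walk the marketing number down from FP4 sparse to BF16 dense.
  fp4_sparse = 1000.0              # TFLOPS, the "1 PFLOP" headline figure
  fp4_dense = fp4_sparse / 2       # 2:4 structured sparsity typically doubles the quoted rate
  fp8_dense = fp4_dense / 2        # each step up in precision halves throughput
  bf16_dense = fp8_dense / 2
  print(bf16_dense)                # ~125 TFLOPS, in line with the ~128 estimate above

  # Bandwidth: 512-bit bus at 8533 MT/s.
  print(512 / 8 * 8533 / 1000)     # ~546 GB/s
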
bee_rider · 8 months ago
Seems more like a workstation. So that's just a continuation of the last couple of decades of Unix on the workstation, right?
throw310822 · 8 months ago
They should write an AI-centered OS for it, allowing people to easily write AI-heavy applications. Then you'd have the Amiga of 2025.
pjmlp · 8 months ago
Because NVidia naturally doesn't want to pay for Windows licenses.

NVidia works closely with Microsoft to develop their cards; all major features come first in DirectX before landing on Vulkan and OpenGL as NVidia extensions, and eventually become standard after other vendors follow up with similar extensions.

CamperBob2 · 8 months ago
Where does it say they won't be supporting Win32?
narrator · 8 months ago
Here he says that in order for the cloud and the PC to be compatible, he's going to only support WSL2, the Windows subsystem for Linux which is a Linux API on top of Windows.

Here's a link to the part of the keynote where he says this:

https://youtu.be/MC7L_EWylb0?t=7259

diggan · 8 months ago
> their whole new software stack will only run on WSL2. They aren't porting to Win32

Wait, what do you mean exactly? Isn't WSL2 just a VM essentially? Don't you mean it'll run on Linux (which you also can run on WSL2)?

Or will it really only work with WSL2? I was excited as I thought it was just a Linux Workstation, but if WSL2 gets involved/is required somehow, then I need to run the other direction.

awestroke · 8 months ago
No, nobody will run Windows on this. It's meant to run NVIDIA's own flavor of Ubuntu with a patched kernel.
hx8 · 8 months ago
Yes, WSL2 is essentially a highly integrated VM. I think it's a bit of a joke to call Ubuntu WSL2, because it seems like most Ubuntu installs are either VMs for Windows PCs or on Azure Cloud.
rvz · 8 months ago
> Wow, it may actually be the year of Linux on the Desktop.

?

Yeah starting at $3,000. Surely a cheap desktop computer to buy for someone who just wants to surf the web and send email /s.

There is a reason why it is for "enthusiasts" and not for the general wider consumer or typical PC buyer.

Topfi · 8 months ago
I see the most direct competitor in the Mac Studio, though of course we will have to wait for reviews to gauge how fair that comparison is. The Studio does have a fairly large niche as a solid workstation, though, so I could see this being successful.

For general desktop use, as you described, nearly any piece of modern hardware, from a RasPI, to most modern smartphones with a dock, could realistically serve most people well.

The thing is, you need to serve both low-end use cases like browsing and high-end dev work via workstations, because even for the "average user" there is often one specific program on which they need to rely and which has limited support outside the OS they have grown up with. Of course, there will be some programs, like desktop Microsoft Office, that will never be ported, but still, Digits could open the door to some devs working natively on Linux.

A solid, compact, high-performance, yet low power workstation with a fully supported Linux desktop out of the box could bridge that gap, similar to how I have seen some developers adopt macOS over Linux and Windows since the release of the Studio and Max MacBooks.

Again, we have yet to see independent testing, but I would be surprised if anything of this size, simplicity, efficiency and performance was possible in any hardware configuration currently on the market.

yjftsjthsd-h · 8 months ago
> Surely a cheap desktop computer to buy for someone who just wants to surf the web and send email /s.

That end of the market is occupied by Chromebooks... AKA a different GNU/Linux.

fooker · 8 months ago
The typical PC buyer is an enthusiast now.
immibis · 8 months ago
Never underestimate the open source world's power to create a crappy desktop experience.
tokai · 8 months ago
You're like 15 years out of date.
derbaum · 8 months ago
I'm a bit surprised by the number of comments comparing the cost to (often cheap) cloud solutions. Nvidia's value proposition is completely different in my opinion. Say I have a startup in the EU that handles personal data or some company secrets and wants to use an LLM to analyse it (like using RAG). Having that data never leave your basement sure can be worth more than $3000 if performance is not a bottleneck.
lolinder · 8 months ago
Heck, I'm willing to pay $3000 for one of these to get a good model that runs my requests locally. It's probably just my stupid ape brain trying to do finance, but I'm infinitely more likely to run dumb experiments with LLMs on hardware I own than I am while paying per token (to the point where I currently spend way more time with small local llamas than with Claude), and even though I don't do anything sensitive I'm still leery of shipping all my data to one of these companies.

This isn't competing with cloud, it's competing with Mac Minis and beefy GPUs. And $3000 is a very attractive price point in that market.

logankeenan · 8 months ago
Have you been to the localLlama subreddit? It’s a great resource for running models locally. It’s what got me started.

https://www.reddit.com/r/LocalLLaMA/

ynniv · 8 months ago
I'm pretty frugal, but my first thought is to get two to run 405B models. Building out 128GB of VRAM isn't easy, and will likely cost twice this.
sensesp · 8 months ago
100% I see many SMEs not willing to send their data to some cloud black box.
jckahn · 8 months ago
Exactly this. I would happily give $3k to NVIDIA to avoid giving 1 cent to OpenAI/Anthropic.
originalvichy · 8 months ago
Even for established companies this is great. A tech company can have a few of these locally hosted and users can poll the company LLM with sensitive data.
diggan · 8 months ago
The price seems relatively competitive even compared to other local alternatives like "build your own PC". I'd definitely buy one of this (or even two if it works really well) for developing/training/using models that currently run on cobbled together hardware I got left after upgrading my desktop.
627467 · 8 months ago
> Having that data never leave your basement sure can be worth more than $3000 if performance is not a bottleneck

I get what you're saying, but there are also regulations (and your own business interest) that expect data redundancy/protection, which keeping everything on-site doesn't seem to cover.

btbuildem · 8 months ago
Yeah that's cheaper than many prosumer GPUs on the market right now
a_bonobo · 8 months ago
There's a market not described here: bioinformatics.

The owner of the market, Illumina, already ships their own bespoke hardware chips in servers called DRAGEN for faster analysis of thousands of genomes. Their main market for this product is in personalised medicine, as genome sequencing in humans is becoming common.

Other companies like Oxford Nanopore use on-board GPUs to call bases (i.e., from raw electric signal coming off the sequencer to A, T, G, C) but it's not working as well as it could due to size and power constraints. I feel like this could be a huge game changer for someone like ONT, especially with cooler stuff like adaptive sequencing.

Other avenues of bioinformatics, such as most day-to-day analysis software, are still very CPU- and RAM-heavy.

evandijk70 · 8 months ago
This is, at least for now, a relatively small market. Illumina acquired the company manufacturing these chips for $100M. Analysis of a genome in the cloud generally costs below $10 on general-purpose hardware.

It is of course possible that these chips enable analyses that are currently not possible or are prohibited by cost, but at least for now the limiting factor for genomics will not be compute, but the cost of sequencing (currently $400-500 per genome).

mocheeze · 8 months ago
Doesn't seem like Illumina actually cares much about security: https://arstechnica.com/security/2025/01/widely-used-dna-seq...
mycall · 8 months ago
The bigger picture is that OpenAI o3/o4.. plus specialized models will blow open the doors to genome tagging and discovery, but that is still 1 to 3 years away for ASI to kick in.
nzach · 8 months ago
While I kinda agree with you, I don't think we will ever find a meaningful way to throw genome sequencing data at LLMs. It's simply too much data.

I worked on a project some years ago where we were using data from genome sequencing of a bacterium. Every sequenced sample was around 3GB of data, and the sample size was pretty small, with only about 100 samples to study.

I think the real revolution will happen because code generation through LLMs will allow biologists to write 'good enough' code to transform, process and analyze data. Today to do any meaningful work with genome data you need a pretty competent bioinformatician, and they are a rare breed. Removing this bottleneck is what will allow us to move faster in this field.
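
As a hypothetical example of the 'good enough' throwaway code meant here - per-record GC content from a FASTA file using only the standard library (the file name is made up):

  # Toy FASTA parser: GC content per record, the kind of one-off script an LLM can draft.
  from collections import defaultdict

  def gc_content(path):
      seqs = defaultdict(str)
      name = None
      with open(path) as fh:
          for line in fh:
              line = line.strip()
              if line.startswith(">"):          # header line starts a new record
                  name = line[1:].split()[0]
              elif name:
                  seqs[name] += line.upper()
      return {n: (s.count("G") + s.count("C")) / len(s) for n, s in seqs.items() if s}

  print(gc_content("sample_reads.fasta"))       # hypothetical input file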

newsclues · 8 months ago
Is this for research labs, health clinics, or people's homes?
a_bonobo · 8 months ago
ONT sells its smallest MinION to regular people, too. But Illumina's and ONT's main market is universities, followed by large hospitals
mfld · 8 months ago
Small nitpick: Illumina is the owner of the sequencing market, but not really of the bioinformatics market.
neom · 8 months ago
In case you're curious, I googled. It runs this thing called "DGX OS":

"DGX OS 6 Features The following are the key features of DGX OS Release 6:

Based on Ubuntu 22.04 with the latest long-term Linux kernel version 5.15 for the recent hardware and security updates and updates to software packages, such as Python and GCC.

Includes the NVIDIA-optimized Linux kernel, which supports GPU Direct Storage (GDS) without additional patches.

Provides access to all NVIDIA GPU driver branches and CUDA toolkit versions.

Uses the Ubuntu OFED by default with the option to install NVIDIA OFED for additional features.

Supports Secure Boot (requires Ubuntu OFED).

Supports DGX H100/H200."

AtlasBarfed · 8 months ago
"Nvidia-optimized" meaning non-public patches? A non-upgradable operating system, like what happens when you upgrade a system with a binary-blob Nvidia driver?
wmf · 8 months ago
You can upgrade to a newer release of DGX OS.
yoyohello13 · 8 months ago
I wonder what kind of spyware is loaded onto DGX OS. Oh, sorry I mean telemetry.
ZeroTalent · 8 months ago
Cybersecurity analysts check and monitor these things daily, and they are pretty easy to catch. Likely nothing malicious, as history shows.
thunkshift1 · 8 months ago
Correct, highly concerning.. this is totally not the case with existing os’s and products
treprinum · 8 months ago
Nvidia just did what Intel/AMD should have done to threaten the CUDA ecosystem - release a "cheap" 128GB local inference appliance/GPU. Well done Nvidia; it looks bleak for any Intel/AMD AI efforts in the future.
mft_ · 8 months ago
I think you nailed it. Any basic SWOT analysis of NVidia’s position would surely have to consider something like this from a competitor - either Apple, who is already nibbling around the edges of this space, or AMD/Intel who could/should? be.

It’s obviously not guaranteed to go this route, but an LLM (or similar) on every desk and in every home is a plausible vision of the future.

iszomer · 8 months ago
Nvidia also brought Mediatek into the spotlight..

mrtksn · 8 months ago
Okay, so this is not a peripheral that you connect to your computer to run specialized tasks, this is a full computer running Linux.

It's a garden hermit. Imagine a future where everyone has one of these (not exactly this version, but some future version): it lives with you, it learns with you, and unlike cloud-based SaaS AI you can teach it things immediately and diverge from the average to your advantage.

Topfi · 8 months ago
I'd love to own one, but doubt this will go beyond a very specific niche. Despite there being advantages, very few still operate their own Plex server over subscriptions to streaming services, and on the local front, I feel that the progress of hardware, alongside findings that smaller models can handle a variety of tasks quite well, will mean a high performance, local workstation of this type will have niche appeal at most.
mrtksn · 8 months ago
I have this feeling that at some point it will be very advantageous to have a personal AI, because when you use something that everyone else can use, the output of that something becomes very low value.

Maybe it will still make sense to have your personal AI in some data center, but on the other hand, there is the trend of governments and mega-corps regulating what you can do with your computer. Try going beyond the basics, try to do something fun and edge-case - it is very likely that your general-availability AI will refuse to help you.

When it is your own property, you get the chance to overcome restrictions and develop the thing beyond the average.

As a result, having something that can do things that no other else can do and not having restrictions on what you can do with this thing can become the ultimate superpower.

noduerme · 8 months ago
"garden hermit" is a very interesting and evocative phrase. Where is that from?
mrtksn · 8 months ago
It's a real thing: https://en.wikipedia.org/wiki/Garden_hermit

In the past, in Europe, some wealthy people used to look after a scholar living on their premises so they could ask them questions, etc.