homarp · a month ago
see https://news.ycombinator.com/item?id=45923188 for HipKittens discussion
sorenjan · a month ago
See also this post about the same work: HipKittens: Fast and furious AMD kernels [0], with comments from George Hotz and AMD employees.

[0] https://news.ycombinator.com/item?id=45923188

DeathArrow · a month ago
I think many people have tried making AMD GPUs go brrr for the mass of developers, but no one has succeeded.

I don't get why AMD doesn't solve their own software issues. They have a lot of money now, so not having the money to pay for developers is no excuse.

And data center GPUs are not the worst of it. Using GPU compute for things like running inference at home is a much, much better experience with Nvidia. My five-year-old RTX 3090 is better than any consumer GPU AMD has released to date, at least for experimenting with ML and AI.

cyberax · a month ago
I recently switched from an NVidia card (5090) to a couple of AMD cards (R9700 32GB) for my inference server.

I must say it's been a completely positive experience. The mainline Fedora kernel just worked, without any need to mess with DKMS. I just forwarded the /dev/dri/* devices to my containers, and everything worked fine with ROCm.

I needed to grab a different image (-rocm instead of -cuda) for Ollama and change the Whisper build type for Storyteller. And that was it! (A quick sanity check is sketched below.) On the host, nvtop works fine to visualize the GPU state, and VAAPI provides accelerated encoding for ffmpeg.

Honestly, it's been an absolutely pleasant experience compared to getting NVidia CUDA to work.
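A minimal sketch of that sanity check, assuming a ROCm build of PyTorch is installed in the container (ROCm wheels reuse the torch.cuda API, so nothing else changes) and that the required device nodes are forwarded in (the /dev/dri/* devices mentioned above, typically plus /dev/kfd):

    # Minimal sketch: confirm the forwarded AMD GPUs are visible to ROCm.
    # Assumes a ROCm build of PyTorch inside the container.
    import torch

    print(torch.cuda.is_available())          # True once the ROCm runtime sees the GPUs
    print(torch.cuda.device_count())          # e.g. 2 for a pair of R9700s
    for i in range(torch.cuda.device_count()):
        print(torch.cuda.get_device_name(i))  # reports the AMD device names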

logicchains · a month ago
> Now they have a lot of money so not having money to pay for developers is not an excuse.

NVidia is the exception to the rule when it comes to hardware companies paying competitive salaries for software engineers. I imagine AMD is still permeated by the attitude that software "isn't real work" and doesn't deserve more compensation, and that kind of inertia is very hard to overcome.

jacobgorm · a month ago
And the developer experience is horrible when working with AMD. They don’t even accept driver crash bug reports.
donaldihunter · a month ago
People say that as if the Nvidia experience is better. Nvidia also has a horrible developer experience.
Nathanba · a month ago
I just saw that Nvidia even maintains their own fork of Unreal Engine. AMD isn't even competing.
moomin · a month ago
nVidia has been deeply involved on the software side, first with gaming, forever. It's written into their DNA. Even when ATI/AMD could outperform them in raw hardware, nVidia worked well with every last game and worked with individual developers, even writing some of their code for them.
skeptrune · a month ago
I appreciate that there are people in academia working on this problem, but it seems like something AMD would have to fix internally if they were serious.
amelius · a month ago
I personally prefer the hardware companies making just hardware.

Keeps the incentives pure.

I'm even willing to accept a 20% performance hit for this requirement, should someone bring that up.

jack_tripper · a month ago
>I personally prefer the hardware companies making just hardware. Keeps the incentives pure.

That's self-contradictory. Their incentive is to sell more HW and at higher prices using whatever shady practices they can get away with, software or no software. There's nothing pure about that; it's just business. High-end chips aren't commodity HW like lawnmowers; they can't function without the right SW.

And this isn't the 90's anymore, when Hercules or S3 would only make the silicon and system integrators would write the drivers for it, which were basically MS-DOS calls reading and writing registers over the PCI bus, written by devs working from a 300-page manual. Those days are long gone. Modern silicon is orders of magnitude more complex, to the point that nobody besides the manufacturer could write drivers that extract the most performance out of it.

>I'm even willing to accept a 20% performance hit for this requirement, should someone bring that up.

I'm also willing to accept arbitrary numbers I make up, as a tradeoff, but the market does not work like that.

andruby · a month ago
Unfortunately hardware can’t exist anymore without software. Everything non-trivial needs firmware or microcode.

And depending on others to write firmware for your hardware, I don’t think that’s a recipe for success.

matt-p · a month ago
That means 25% more datacentre/grid capacity (1 / 0.8 = 1.25). Genuinely, I think most companies are not happy to fund that in order to save marginally in other areas.
ngcc_hk · a month ago
Is Apple a HW or a SW company... or is that the wrong question? Why does a company have to be a HW or a SW one?

If Nvidia dominates because of CUDA, why can it do that but AMD shouldn't?

tester756 · a month ago
You alone are... a pretty small market niche, I'd say.
musebox35 · a month ago
In certain contexts 20% is a lot of bucks; leaving that on the table would be very wasteful ;-)
_zoltan_ · a month ago
20% performance on a 10GW DC? Suuuuuuure....
aabhay · a month ago
Except that this same team built a similarly named software package for Nvidia GPUs as well. It’s bright researchers doing what they do best if you ask me.
sigmoid10 · a month ago
Except that this other package also only came out last year and has contributed zero to Nvidia's current status. If AMD ever wants to be taken seriously in this market, they will need to start making their own software good instead of relying on "open source" in the mistaken belief that someone else will fix their bad code for free. Nvidia spent more than a decade hiring top talent and getting their proprietary software environment right before they really took off. And some of the older ML researchers here will certainly remember it wasn't pain-free either. But they didn't just turn the ship around, they turned it into a nuclear aircraft carrier that dominates the entire world.
_zoltan_ · a month ago
Honestly they should be hired by NVIDIA or AMD.
reactordev · a month ago
Fully agree. They punted 10 years ago and are now playing catch-up. They have the hardware but can't manage to unlock its full potential because they don't know how to write the firmware that would.
Ecko123 · a month ago
AFAIK, they are already doing this at various levels, including working with tinycorp.
colordrops · a month ago
It's insane to me that AMD is not spending billions and billions trying to fix their software. Nvidia is the most valuable company in the world and AMD is the only one poised to compete.
aabhay · a month ago
They are, but the problem is that shifting an organization whose lifeblood is yearly hardware refreshes and chip innovation towards a ship-daily software culture is challenging. And software doesn’t “make money” the way hardware does so it can get deprioritized by executives. And vendors are lining up to write and even open source lots of software for your platform in exchange for pricing, preference, priority (great on paper but bad for long term quality). And your competitors will get ahead of you if you miss even a single hardware trend/innovation.
keyringlight · a month ago
There was a podcast episode linked here a while ago about how the software industry in Japan never took off the way it did in America, and it reached a similar conclusion. According to the host, the product being sold was hardware, and software was a means to fulfill and then conclude the contract. After that you want the customer to buy the new model, primarily for the hardware, with the software coming along for the ride.

It should be obvious by now, though, that there's a symbiosis between software and hardware, and that support timescales are longer. Another angle is that it's about more than just AMD's own software developers: it's also the developers building products for AMD's customers, who in turn buy AMD hardware if everyone works together to make those products run well. It's that second group of developers AMD needs to engage with, in a way that makes their efforts welcome.

nikanj · a month ago
Hardware is a profit center, software is a cost center, and they get treated accordingly
david-gpu · a month ago
I worked at a number of GPU vendors, and it felt like Nvidia was the only one that treated software as an asset worth investing in rather than as a cost center. Massively different culture.
LarsDu88 · a month ago
From this writeup it does sound like the AMD GPU's architecture makes it a bit harder to optimize. It also seems like the AMD approach may scale better in the long run: 8 chiplets rather than 2 for the Nvidia offering, along with all the associated cache and memory locality woes.

The future will probably see more chiplets rather than fewer, so I wonder if dealing with the complexity here will pay more dividends in the long run.

WithinReason · a month ago
AMD doesn't need warp specialisation for high performance while Nvidia does, which simplifies programming for AMD.
boxerab · a month ago
This is a great project, but the bigger question is: why isn't AMD doing this themselves? It continues to boggle my mind how much they don't seem to get the importance of a mature software stack when it is so obviously the key to the success of team red. A stack that can be used for EVERY card they produce, like CUDA, not just a select few. I used to believe that AMD the underdog would catch up some day, but I've more or less given up on them.
alex1138 · a month ago
It's not my favorite internet meme but I'm tickled to see "go brr" on a website/university like Stanford
microtonal · a month ago
They already "went brr" when they announced ThunderKittens a year ago: https://hazyresearch.stanford.edu/blog/2024-05-12-tk
Skunkleton · a month ago
This meme is tired. Let it rest, boss.
rightbyte · a month ago
Usually a sign that it is not cool anymore and the kids need to make something new up.