There have already been several ChatGPT outages caused by a lack of compute capacity. Azure literally cannot buy Nvidia GPUs fast enough to satisfy customer demand. They are forced to buy alternatives; there's no real decision to make.
Similar to the old Slashdot days, HN is still plagued with just outright anti-Microsoft people under any circumstance, even in cases where there is no justification.
A more recent thing is just pretending Microsoft (and especially Azure) doesn’t even exist.
If pressed, sure, people will admit that these may not be entirely imaginary entities, but if listing technologies or platforms then “oops” they’ll just forget.
The best example I saw was a “poster” of cloud big data technologies. Started with Amazon S3, went through Google Bigtable, and then in the corners had companies so small that their own marketing page is the only search result. No mention of Azure anywhere.
There were dozens of logos on that page representing companies with annual revenues smaller than what one of my customers spent on a single Azure Storage Account by accident.
The talking heads on CNBC would probably mention their synergy of "software, drivers, APIs, firmware, the resources," but as a user of their end results: once bitten, twice shy.
Thanks, so I think you're saying that there are two related things happening that are eliminating NVIDIA's CUDA moat: (1) Developers are mostly using libraries instead of writing "to the metal" (CUDA), and (2) many popular libraries have added support for AMD GPUs. Does that capture it?
Now that AI has made Nvidia a trillion-dollar company, AMD has finally woken up and realized what it should have ten years ago: that it needs to invest more in the software side of things. There is movement now, and you can actually do some AI things on AMD hardware, but it will take a long time for them to catch up to Nvidia.
You seem to have forgotten where AMD was 10 years ago. The company was circling the drain, but it should have invested heavily in the software stack for hardware it could barely afford to design? Brilliant strategy. Now's the time to invest, because they finally have the resources. The AI train is not going anywhere anytime soon.
It's also a matter of price. Nvidia sells their top GPUs at insane premiums. For Microsoft, which is such a huge player, the major AI vendor, and a large owner of OpenAI, there is no sense in cornering themselves into Nvidia dependence.
They benefit from competition, not from bending to one vendor.
That's the joke. It still is. ROCm is nowhere near production-ready, and if MS thinks devs will want to waste their time on random errors, then good luck with that business. AMD cards are also super expensive, so it's not clear what their competitiveness is supposed to be.
The MI300 is a massive GPU compared to even the new H200.
It was designed as a combined CPU/GPU for supercomputers, with shared memory. But then the AI craze hit, so AMD spun a variant into a pure GPU AI accelerator real quick, which they could actually pull off because the GPU silicon is modular.
...So that's why it costs a fortune. It's really a jury-rigged HPC product.
Microsoft can develop their own software. We are talking about a company that places billions in orders to Nvidia; they can afford to develop the tools, and obviously AMD will throw millions and millions in support to court such a customer.
Microsoft has the know-how from hardware and software (drivers, APIs, firmware) and the resources, and it's a major AMD customer in cloud and consumer devices.
Do people think Microsoft and AMD will watch Nvidia corner the market while Microsoft writes cheques for whatever Nvidia demands?
It's like people forget a major economic rule: when margins are high you will attract competition.
https://www.oracle.com/news/announcement/oracle-cloud-infras...
Intel is taking a slightly different approach, and is going for "PyTorch compatible."
You will hear endless negative anecdotes about ROCm/OpenVINO, but they both do seem to be getting better with each update.
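As a rough illustration of what "PyTorch compatible" means for the end user, the same script can run on whichever accelerator is present just by changing the device. A hedged sketch, assuming standard PyTorch; the `"xpu"` device is Intel's backend in recent PyTorch builds, and ROCm builds of PyTorch reuse the `"cuda"` device name, which is why library-level code often works on AMD unchanged:

```python
import torch

def pick_device():
    """Pick the best available accelerator, falling back to CPU."""
    if torch.cuda.is_available():  # NVIDIA, or AMD via ROCm builds
        return torch.device("cuda")
    # Intel GPUs expose an "xpu" backend in recent PyTorch versions.
    if getattr(torch, "xpu", None) is not None and torch.xpu.is_available():
        return torch.device("xpu")
    return torch.device("cpu")

device = pick_device()
model = torch.nn.Linear(4, 1).to(device)
x = torch.randn(8, 4, device=device)
loss = model(x).pow(2).mean()
loss.backward()  # same training step regardless of vendor
print(device.type)
```

Nothing in the model or training step names a vendor; that is the compatibility target Intel (and AMD, via ROCm) are chasing.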
(hello!)
https://twitter.com/sama/status/1724626002595471740
ROCm has also made a lot of advances in recent times.
https://www.databricks.com/blog/training-llms-scale-amd-mi25...