Global Memory Shortage Crisis: Market Analysis

"However, this is not just a cyclical shortage driven by a mismatch in supply and demand, but a potentially permanent, strategic reallocation of the world’s silicon wafer capacity. [...] This is a zero-sum game: every wafer allocated to an HBM stack for an Nvidia GPU is a wafer denied to the LPDDR5X module of a mid-range smartphone or the SSD of a consumer laptop."

I wonder if this will result in writing more memory-efficient software? The trend for the last couple of decades has been that nearly all consumer software outside of gaming has moved to browsers or browser-based runtimes like Electron. There's been a vicious cycle of heavier software -> more RAM -> heavier software but if this RAM shortage is permanent, the cycle can't continue.

Apple and Google seemed to be working on local AI models as well. Will they have to scale that back due to lack of RAM on the devices? Or perhaps they think users will pay the premium for more RAM if it means they get AI?

Or is this all a temporary problem due to OpenAI's buying something like 40% of the wafers?

0xbadcafebee · 3 months ago

> I wonder if this will result in writing more memory-efficient software?

If the consumer market can't get cheap RAM anymore, the natural result is a pivot back to server-heavy technology (where all the RAM is anyway) with things like server-side rendering and thin clients. Developers are far too lazy to suddenly become efficient programmers and there's plenty of network bandwidth.

PostOnce · 3 months ago

Developers would prefer to write good software, the challenge and the craftsmanship are a draw.

However, the customers do not care and will not pay more so the business cannot justify it most of the time.

Who will pay twice (or five times) as much for software written in C instead of Python? Not many.

ganoushoreilly · 3 months ago

This is by design. Rent your computer.. don't buy! Use Geforce Now!

zozbot234 · 3 months ago

There's plenty of scope for local AI models to become more efficient, too. MoE doesn't need too much RAM: only the parameters for experts that are active at any given time truly need to be in memory, the rest can be in read-only storage and be fetched on demand. If you're doing CPU inference this can even be managed automatically by mmap, whereas loading params into VRAM must currently be managed as part of running an inference step. (This is where GPU drivers/shader languages/programming models could also see some improvement, TBH)

dummydummy1234 · 3 months ago

But aren't the experts chosen on a token by token basis, which means bandwidth limitations?

imtringued · 3 months ago

MoE works exactly the opposite way you described. MoE means that each inference pass reads a subset of the parameters, which means that you can run a bigger model with the same memory bandwidth and achieve the same number of tokens per second. This means you're using more memory in the end.

duped · 3 months ago

It's not a zero sum game because silicon wafers are not a finite resource. Industry can and will produce more.

tokioyoyo · 3 months ago

If industry has a bit of fear that the demand will slow down by the time they can output meaningful amount of chips, then probably not. Time will show.

hackable_sand · 3 months ago

Neither are paperclips.

AtlasBarfed · 2 months ago

I'm waiting for the good AI powers software.... Any day now.

Ideally, llm should be able to provide the capability to translate from memory inefficient languages to memory efficient languages, and maybe even optimize underlying algorithms in memory use for this.

But I'm not going to hold my breath

krupan · 3 months ago

This is a temporary problem driven by the AI bubble. It's going to hurt until the bubble pops, but when that happens other things are going to hurt

treebeard901 · 3 months ago

Code the AI will produce will solve the memory usage problems which is itself a result of lazy or poor human coders.

xocnad · 3 months ago

Nice assertion. Perhaps you meant that AI could be directed towards less memory intensive implementations. That would still have to be directed by those same lazy/poor coders because the code the AI is learning from is their bad code (for the most part).

to11mtm · 3 months ago

IDK, given the prevalence of Electron and other technically-correct-but-inefficient code out there, at bare minimum it would require decent prompting to help.

antisthenes · 3 months ago

> There's been a vicious cycle of heavier software -> more RAM -> heavier software but if this RAM shortage is permanent, the cycle can't continue.

What do you mean it can't continue? You'll just have to deal with worse performance is all.

Revolutionary consumer-side performance gains like multi-core CPUs and switching to SSDs will be a thing of distant past. Enjoy your 2 second animations, peasant.

One of the things I’ve been hoping for every time a new EC2 instance comes out is for them to unpin the memory:core ratio a bit. I don’t expect they have enough r# and c# users to completely balance things out so what they’re really doing is selling people more CPUs to get the memory they need.

It would be nice if it were creeping up generation to generation. But if this keeps up I fear the opposite.

zozbot234 · 3 months ago

The best way of "unpinning" those ratios for many ephemeral workloads is to use Lambda/FaaS, not EC2.

Kodiack · 3 months ago

For many ephemeral workloads, sure, but that comes at the expense of generally worse and less consistent CPU performance.

There are plenty of workloads where I’d love to double the memory and halve the cores compared to what the memory-optimised R instances offer, or where I could further double the cores and halve the RAM from what the compute-optimised C instances can do.

“Serverless” options can provide that to an extent, but it’s no free lunch, especially in situations where performance is a large consideration. I’ve found some use cases where it was better to avoid AWS entirely and opt for dedicated options elsewhere. AWS is remarkably uncompetitive in some use cases.

hinkley · 3 months ago

kv stores also exist because for many generations of tooling it was faster to manage read-mostly data off-heap instead of on, and that becomes more true the more processes you run doing jobs that touch the same data.

electroly · 3 months ago

Note: the scale does go further than "r" on the high-memory end with some specialty "x" families.

    c*: 2GB per vCPU
    m*: 4GB per vCPU
    r*: 8GB per vCPU
    x2idn/x8g: 16GB per vCPU (!)
    x2iedn/x2iezn/x8aedz: 32GB per vCPU (!)

hinkley · 3 months ago

Yeah those are pretty spendy. I know one comes with extra guaranteed bandwidth which is kind of handy if you’re sharing a small number of cache nodes among a lot of servers. But we were doing okay running r6 for cache, though my coworker who knew the ritual for migrating them did eventually get a little boost out of switching us to r7’s. The latency wasn’t great and I don’t think faster network cards would have helped that. There was already plenty of incentive for us to do per-request promise caching to avoid pulling the same keys multiple times in a request but that was necessary because the business model forced the architecture to tolerate nondeterminism. The cost per request was what eventually killed them (the economy dipped and customers ran to cheaper vendors), but I’ve never seen a company survive being stupid for as long as this place did.

Well, except IBM. Maybe Yahoo.

jeffbee · 3 months ago

You should just get used to it because the memory per core is going down inexorably forever until someone makes a physics breakthrough. We know how to print cores and the core count is going to keep going up.

hinkley · 3 months ago

We knew how to print memory long before we knew how to print cores.

Youden · 3 months ago

IAmGraydon · 3 months ago

Article completely misses the true cause of the price increase - Sam Altman/OAI made a deal with Samsung and SK Hynix get 40% of their RAM wafer production for the 2026 period. This was economic warfare against OpenAI's competitors, and the competitors along with the data centers responded by buying up every bit of DDR5 in sight. This price increase was engineered.

The deal was inked on October 1, 2025, and rumors of it started swirling in September. Take a look at the RAM price charts. Anyone who attributes this just to "AI growth" has no idea what they're talking about. AI has been growing rapidly for three years and yet this price increase just happened exactly when Altman signed this deal.

https://pcpartpicker.com/trends/price/memory/

It's also worth noting that IDC, who published this report, is wholly owned by Blackstone, who is also heavily invested in OpenAI. It would be prudent to be cautious about who you believe.

anon373839 · 3 months ago

Right! Raw wafers, not even memory. I have seen no evidence that this move was anything but a means of taking product out of the global supply chain.

That's correct and a good point I almost forgot about - they can't even utilize what they bought!

I hadn't heard of this, so I searched the web (without any help from an LLM, btw) and found this:

https://www.mooreslawisdead.com/post/sam-altman-s-dirty-dram...

aunty_helen · 3 months ago

After the dns entry, the stockpile of ram may be the most valuable asset that company has.

Doesn't Apple routinely do the same thing? Reserve chip production for the leading-edge nodes, and sometimes enter into similar deals for other tech such as displays? I'm not seeing any evidence that this was intentional "warfare" on OpenAI's part: they're just making a high-stakes bet that they can ultimately find a better and higher-margin use for that raw DRAM than HNers' gaming battlestations, or whatever the next-best use was when they made that deal.

No, apple does not buy production capacity to prevent others from using it. They buy it to use it themselves.

The wafers are not DRAM. This is more likely burning oil wells so your enemy can't use them. Wafers are to chips what steel blanks are to engines. You basically need clean rooms just to accept delivery and entire fabs to do anything. Someone who doesn't own a fab buying the wafers is essentially buying them to destroy them.

cedws · 3 months ago

Not sure that OpenAI's move was a very good one, they've just created a lot of enemies for themselves. I see comments all over the internet about AI slop making RAM expensive. It's going to eat into the profits of a lot of companies. People will be willing for this insanity to end.

We can only hope you are right

nebula8804 · 3 months ago

So what? Intel has backstabbed many yet they cruised for a very long time. Mismanagement eventually stopped that train but it took 15+ years. There are no nice guys. This is business.

I don't really get the panic. This is the same as the pandemic, just for different reasons. A change in demand is causing supply shortages and price hikes. But the demand will eventually swing back as the current demand is completely unsustainable. AI demand will crumble, prices will bottom out, and companies who bet big on AI & RAM will end up going into big layoffs triggering another recession and a huge market crash. We literally just went through all this, it can't be a surprise.

GoatInGrey · 3 months ago

On the AI correction point: given the K-shaped economy, we're headed for a mean reversion either way. Either the bottom corrects upward or the top corrects down. Real life being what it is, I'm bracing for the latter. Though I'd love to be proven wrong

refulgentis · 3 months ago

We are in mutual bafflement: this is just like COVID because the AI bubble will pop causing a recession and market crash?

From what I see in other comments, if you can confidently assert “AI bubble; no one will want GPUs soon” it makes sense, but the COVID stuff is a head scratcher.

RyanShook · 3 months ago

I see this as a competitive opportunity for Apple. If Apple smartphone specs improve while Androids stagnate it could create more iOS users.

The promised AI metaverse is still a long way off and in the meantime people still want the best smartphone.

kllrnohj · 3 months ago

How is it an opportunity for Apple? They are a customer of Samsung and Micron RAM modules just like everyone else is. They aren't in any unique position other than their user base is already used to paying extreme markup for RAM. Now whether Apple just eats the cost in their profit margin or charges even more for RAM remains to be seen.

krackers · 3 months ago

Their supply chain prowess likely means they have already secured contracts for 2026 (and maybe even 2027), so they will not be affected by the price hike. But maybe they'll still use it as an opportunity to bump prices and rake in free profit, who knows.

memoriuaysj · 3 months ago

I have no idea how much RAM my phone has.

And if you think that somebody buys an iPhone because they compare the specs with Android :)))))

Maxion · 3 months ago

Basically if I have to start comparing iPhone specs to Android phone specs I might aswell just buy an Android. The point of iOS is that you don't have to.

oezi · 3 months ago

The competitive advantage comes from Apple having the supply chain contracts in place to not be affected by the 2026 price hike as much. The Android phones will be more expensive and thus will capture less market share.

arcbyte · 3 months ago

I agree with you and have agreed with you for a long time. However, I definitely see the writing on the wall. More than one person in my circle have traditionally been Android users and the lack of innovation from both Apple and Android have them comparing devices on specs MUCH more. I include myself in this list on my next upgrade. I'll be looking largely at specs on the next upgrad because honestly there's not much day to day difference in usage between apple and android anymore

root_axis · 3 months ago

> if Apple smartphone specs improve while Androids stagnate it could create more iOS user

Nah. The marginal utility of more smartphone ram is near zero at this point. The vast majority of people wouldn't even notice if the memory in their phone tripled overnight.

layer8 · 3 months ago

Most regular people don’t care about RAM specs. And lately it’s Apple that has been rather stagnating in terms of features.

rockskon · 3 months ago

What, because they aren't shoving an LLM in every orifice of their product?

neutronicus · 3 months ago

I would love stagnation, the keyboard has always been a dumpster fire but these days it is an actively regressing dumpster fire

amluto · 3 months ago

> The voracious demand for HBM by hyperscalers, such as Microsoft, Google, Meta and Amazon, has forced the three biggest memory manufacturers (Samsung Electronics, SK Hynix, and Micron Technology) to pivot their limited cleanroom space and capital expenditure towards higher margin enterprise-grade components. This is a zero-sum game: every wafer allocated to an HBM stack for an Nvidia GPU is a wafer denied to the LPDDR5X module of a mid-range smartphone or the SSD of a consumer laptop.

> As a result, IDC expects 2026 DRAM and NAND supply growth be below historical norms at 16% year-on-year and 17% year-on-year, respectively.

This is an odd claim. It’s like saying that car companies historically produced more coupes than sedans, but suddenly there are new enormous orders for millions of sedans. All cars get massively more expensive as a result — car makers charge 50-200% more than before. Sure, they need to retool a little bit and buy more doors, but somehow the article claims that “limited … capital expenditure” means that overall production will grow more slowly than historical rates?

This only makes sense either on extremely short timescales (as retooling distracts form expansion) or if the car makers decide not to try to compete with each other. Otherwise some of those immediately available profits would turn into increased capital expenditure and more RAM would be produced. (Heck, if RAM makers think the new demand is sustainable, they should be happy to increase production to sell more units at current prices.)

christophilus · 3 months ago

We’ll see. The current situation is an anomaly caused in part by the AI boom generally, in part by the OpenAI shenanigans. This time will pass, and what comes after is hard to know. If the shortages are sustained long enough, innovation will happen. The economics of it will bring new players into the market.

DRAM is a notoriously cyclical market, though, and wise investors are leery of jumping into a frothy top. So, it’ll take a while before anyone decides the price is right to stand up a new competitor.

kristianp · 3 months ago

> potential contraction in the global smartphone market alongside an increase in average selling prices (ASP). In 2026, in our moderate downside scenario, we could see the market contract by 2.9%. In our pessimistic downside scenario, it could be as bad as 5.2%.

> PC market contract by 4.9% compared with a 2.4% year-on-year decline in the November forecast. Under a more pessimistic scenario, the decline could deepen to 8.9%.