Asking “what day is today” vs “create this API endpoint to adjust the inventory” will cost vastly different amounts. And honestly I have no clue where to even start estimating the cost unless I run the query.
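To make the gap concrete, here's a back-of-the-envelope sketch. The per-token prices are made-up placeholders (real provider pricing differs), but the arithmetic shows why a one-liner question and a code-generation task with a big context differ by orders of magnitude:

```python
# Hypothetical per-token prices, purely illustrative (not any provider's real rates).
PRICE_PER_1M_INPUT = 3.00    # USD per 1M input tokens (assumed)
PRICE_PER_1M_OUTPUT = 15.00  # USD per 1M output tokens (assumed)

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Rough cost: (tokens / 1M) * price, input and output billed separately."""
    return (input_tokens / 1_000_000 * PRICE_PER_1M_INPUT
            + output_tokens / 1_000_000 * PRICE_PER_1M_OUTPUT)

# "what day is today": tiny prompt, tiny answer
small = estimate_cost(input_tokens=10, output_tokens=20)
# "create this API endpoint": large context (files, schema) + long code output
large = estimate_cost(input_tokens=50_000, output_tokens=4_000)

print(f"small query: ${small:.5f}")  # fractions of a cent
print(f"large query: ${large:.2f}")  # hundreds of times more
```

Same model, same user, yet one request costs roughly 600x the other, and you only know the output token count after the fact.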
NVDA and cloud providers that are heavily invested in AI would likely take a hit if a 70B model could do what Sonnet 4 does.
But overall, AI is here to stay even if the market crashes, so it’s not really an AI bubble pop, more like a GPU pop.
And even then, GPUs will still be in demand, since training will still need large clusters.
Presumably Java would also be pretty tiny if we wrote it in bytecode instead of higher-level Java.
In a way this is saying that there are some GPUs just sitting around, so the owners would rather get 50% rather than nothing for their use.
Since LLM context is limited, at some point the LLM will forget what was defined at the beginning, so you will need to reset/remind the LLM what’s in memory.
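A minimal sketch of that forgetting effect, with an assumed tiny context window and a crude word-count tokenizer (real tokenizers and window sizes differ): keep a pinned reminder, then fit as many recent turns as the budget allows, and older turns silently fall off.

```python
# Assumed tiny window for demonstration; real models have far larger ones.
MAX_CONTEXT_TOKENS = 200

def rough_tokens(text: str) -> int:
    """Crude token estimate: ~1 token per word (real tokenizers differ)."""
    return len(text.split())

def build_context(system_reminder: str, history: list[str]) -> list[str]:
    """Pin the reminder, then keep as many recent turns as fit the budget."""
    budget = MAX_CONTEXT_TOKENS - rough_tokens(system_reminder)
    kept = []
    for turn in reversed(history):  # walk newest-first
        cost = rough_tokens(turn)
        if cost > budget:
            break  # everything older than this point is "forgotten"
        kept.append(turn)
        budget -= cost
    return [system_reminder] + list(reversed(kept))
```

With a long enough history the earliest turns never make it into the context, which is exactly why you end up re-stating definitions partway through a session.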
That’s where we are in the AI journey in 2025. The year 2000.