emehex · a year ago
May I interest you in the Jevons paradox:

> In economics, the Jevons paradox occurs when technological progress increases the efficiency with which a resource is used (reducing the amount necessary for any one use), but the falling cost of use induces increases in demand enough that resource use is increased, rather than reduced.

Source: https://en.wikipedia.org/wiki/Jevons_paradox
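A toy constant-elasticity demand model (all numbers hypothetical, and the functional form is an illustrative assumption, not from the Wikipedia article) makes the mechanism concrete:

```python
def resource_use(efficiency_gain, elasticity):
    """Relative total resource use after an efficiency improvement.

    efficiency_gain: factor by which resource needed per unit of service falls
                     (2 = twice as efficient)
    elasticity: price elasticity of demand (negative; |elasticity| > 1 means
                the rebound in demand dominates the savings)
    Assumes constant-elasticity demand Q = k * P**elasticity.
    """
    price_ratio = 1 / efficiency_gain          # cost per unit of service falls
    demand_ratio = price_ratio ** elasticity   # demand responds to the lower price
    return demand_ratio / efficiency_gain      # total resource used, relative to before

print(resource_use(2, -0.5))  # ~0.71: inelastic demand, efficiency saves resources
print(resource_use(2, -1.5))  # ~1.41: elastic demand, total use *rises* -- the paradox
```

Whether the paradox bites thus hinges entirely on how elastic demand is, which is the crux of most of the disagreement below.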

rachofsunshine · a year ago
As a concrete application: software development has gotten infinitely easier in recent decades. Better IDEs, fewer performance constraints, better virtualization, fewer worries about on-prem deployments, autocomplete, easy version control, you name it. Any given engineer should be orders of magnitude more productive - yet demand has (recent slump aside) only grown.
visarga · a year ago
And computers have become 1 million times better in the last 2-3 decades. Higher frequency, more RAM, more bandwidth, more network peers. And yet we make even more of them.

> "I think there is a world market for maybe five computers." Thomas Watson, president of IBM, 1943

rightbyte · a year ago
> software development has gotten infinitely easier in recent decades.

I don't agree here. It was way simpler in the 90s. The programmer experience probably peaked around the transition from TUIs to win32, when you could do either. Differing screen resolutions are probably what made programming GUIs suck. And all the churn of Microsoft and Oracle frameworks didn't help.

Nowadays the overhead of making an app that passes procurement is insurmountable. And consumers seem to not buy apps at full price anymore.

dmitrygr · a year ago
> software development has gotten infinitely easier

Writing good software is as hard as it has ever been. IDEs don’t help you with anything that makes proper software difficult. The only thing that has changed is that users have been conditioned to accept shit.

incrudible · a year ago
> Better IDEs

We are now getting to a point where IDEs are as good as the ones we had in the 90s.

> performance constraints

Evened out by higher fidelity and less efficient programming languages and paradigms.

> deployment

More robust, perhaps, but also much more complex.

> version control

An improvement in some respects, a regression in others.

> more productive

Hard constraints make people productive. Being productive is about what not to do; impossibilities make for easy decisions.

rvense · a year ago
Typing on a typewriter is five-six times faster than handwriting. Imagine if we still spent all our time writing! How silly that would be.
amy-petrik-214 · a year ago
As another concrete application, roads:

https://bangaloremirror.indiatimes.com/opinion/others/easyno...

The idea is that more lanes on the highway means more traffic, which means slower travel (even though you added a lane!). What follows from this is what's called "Palin's Corollary": to make traffic faster, it's best to have fewer lanes. Politicians apply various techniques for this, such as perpetual construction or allocating vast swaths of asphalt for bicycles, to make the traffic flow faster.

So it does make sense that in fact slower chips will make AI faster, and punch cards will make software development faster, as the inverse of these proposed trends.

meiraleal · a year ago
Productivity which is cancelled by the procrastination and mental issues social media brought to us all.
codesnik · a year ago
well, it's time for another abstraction layer.
binalpatel · a year ago
My personal take (an overly biased view after reading Chip War recently) is pretty much the same; it seems like a lot of the early dynamics of semiconductors are playing out here.

Very large R&D expenditures for the next iterations of the models at the leading edge (the "fabs" of the world), everything downstream getting much cheaper and better with demand increasing as a result.

Like a world where Claude Opus 3.5 is incredibly expensive to train and run, but also results in a Claude Haiku that's on net better than the Opus of the prior generation, occurring every cycle.

skzv · a year ago
One of my favourite economic paradoxes. It changes the way you think about efficiency and consumption.

My colleague introduced me to this idea. He had been studying ways to increase computing efficiency out of concern for the environment. Making programs more efficient would reduce energy consumption, right?

His advisor introduced him to Jevons paradox and he realized such efforts could have the exact opposite effect. So he dropped that research entirely. If you're worried about energy consumption, you need to make energy production more green, not machines more efficient.

Making data centers more efficient will probably cause us to build more data centers and use more power overall, not less.

01HNNWZ0MV43FF · a year ago
Unless it's something with a relatively fixed demand
SoftTalker · a year ago
Sounds like a variant on the "induced demand" theory that people who are opposed to road building always trot out.
J_Shelby_J · a year ago
Well, we’re not going to roll it out.

But it’s not really a theory so much as an established fact that the only way to reduce traffic is to have viable alternatives to driving.

obelos · a year ago
If you have ever not gone somewhere because “there's too much traffic” or chosen to go to a store because it has easier parking than an equivalent alternative store, you've experienced the rudiments of induced demand.
jessriedel · a year ago
It is, but in both cases it's not a good reason by itself to reduce investment in supply (roads or GPUs).
rcxdude · a year ago
One is indeed an example of the other

hangsi · a year ago
Interesting to consider GPUs as the coal of the AI revolution.

This is worth it for the mental image of heaping them into a boiler fire by the shovel load alone.

fragmede · a year ago
Mental image? This is the AI revolution.

https://imgur.com/a/ePGD89d

EasyMark · a year ago
This definitely made me laugh after reading endless complaint (well, scrolling) threads on the price and unavailability of cheap graphics cards thanks to $coin mining and AI usage. I would love to shovel a few into the fire to produce energy for the next generation of overpowered cards while I play old school games on my $250 laptop.
EasyMark · a year ago
I bring this up all the time with coworkers. When a new generation of processors comes out with amazing speed/#-of-cores/power improvements, developers get lazier. I’m all for meaningful improvements, and grudgingly on the side of stuff like Electron that allows easy cross-platform dev, but please, for the love of God, please quit stacking on garbage features and useless GUI mods, pointless graphics, endless pulling in of huge libraries to do one little thing, etc. I try my best to keep my C++ and Rust dev as small as possible with as few dependencies as possible. If something might take me more than a week to write myself, I’ll give a dependency strong consideration; otherwise I write it myself.

habitue · a year ago
I disagree. I think the fluke was the era in which we didn't have enough work for CPUs/GPUs to be at 100% 24/7. Think of this like leaving money on the table: compute should always be useful, so why aren't those cores pegged to max all the time?

I think it was literally lack of imagination. We were like "well, I automated most of the paper pushing we used to do in the office, guess my job is done!" and this occupied 0.001% of a computer's time. We invented all sorts of ways for people to only pay for that tiny slice of active time (serverless, async web frameworks, etc).

Now we're in an era where we can actually use the computers we've built. I don't think we're going back.

dgacmu · a year ago
Even once you have a core sitting in a datacenter, the value you derive from the computations must exceed the power cost (and the SWE cost to run that computation). Otherwise you're better off turning it off.

I have a stack of 1080ti and Titan V GPUs that are testament to this. :-) (which, admittedly, I should sell)

habitue · a year ago
This is where the lack of imagination comes in (not you in particular, this is everyone right now). I'm postulating something non-obvious and pretty contentious, but I think compute should always be more valuable than the cost of the power it consumes.
reaperducer · a year ago
I have a stack of 1080ti and Titan V GPUs that are testament to this. :-)

https://en.wikipedia.org/wiki/SETI@home

lostmsu · a year ago
Don't! I'm working on a startup where running them would be welcome (and paid): https://borg.games/setup
eli_gottlieb · a year ago
Well if you're looking to sell, get in touch!
crubier · a year ago
Yeah, I was going to say just that. The statement above is like saying: "If you have an oven, it should be burning at full power all the time. All the time with your oven not burning is wasted time". No. I need an oven to burn at high power when I bake, and it's fine for it to sit idle the rest of the time.
swatcoder · a year ago
How is that different than suggesting that all the engines we've ever made should constantly be driving load?

It may not be obvious in the same ways as with engines, but computation consumes resources like power and attention and outputs waste like heat and fried circuits. These resources and wastes interact with systems other than just computers and data centers, so efficiency and necessity need to be considered in a bigger picture than just "let's use all the transistors all the time and see what happens!"

dleeftink · a year ago
But 100% compute != 100% efficiency? 'Pegging' cores to their maximum for the sake of utilisation seems wasteful where (for example) a simple bash script would have sufficed.

Once the compute-heavy pathways have been established, I'd wager the next round of automation can utilise these established paths to lock in on an answer rather than throwing more cycles at the problem.

boredtofears · a year ago
Don't worry -- I can write bash scripts that eat all CPU too!
yjftsjthsd-h · a year ago
> why aren't those cores pegged to max all the time?

Power consumption and heat dissipation, mostly. Amusingly, heat even turns into a performance thing itself, since you can get better performance in bursts than sustained.

pipo234 · a year ago
There definitely was an era of compute surplus — compute being a solution in search of a problem. I think the recent crypto hype (bitcoin, NFTs) can (at least partially) be attributed to the same issue.

For sure, once the LLM hype diminishes to a more practical scale we'll figure out something else to burn cycles on. And I'm not predicting demand for GPUs to die out all of a sudden, but my money isn't on the Nvidias of this world.

bubaumba · a year ago
> For sure, once the LLM hype diminishes

It hasn't even started. There are so many places it can be used; expect sci-fi in the next few years. There is no way back. The only thing that may change is the AI technology under the hood. I mean, LLMs aren't the goal, they're only a tool. Something else may replace or extend them. Many people are actively working on this. GPUs aren't the goal either; there must be a more efficient way. But still, they are very good at number crunching, and that is needed for video processing.

bluGill · a year ago
Efficiency is NOT a goal. I want my computers ready when I want to do something. If you want efficiency, we should put computers in universities only and make sure there is always a line of people waiting for their turn.

I want inefficiency, so that if I feel like running a large calculation I have a machine ready at hand.

Ekaros · a year ago
Why are computers different from your car or your house? Shouldn't you hire someone to drive other people around in your car 24/7? And why not keep your house filled to whatever occupancy limit the fire department sets, 24/7?

There are so many resources we keep available and do not fully utilize. I don't see why computing devices should be different...

throwaway4aday · a year ago
I'm going to tap the sign again:

  - [X] Text
  - [X] Images
  - [X] Audio
  - [ ] Videos (in progress)
  - [ ] 3D Meshes and Textures (in progress)
  - [ ] Genetics (in progress)
  - [ ] Physics Simulation (in progress)
  - [ ] Mathematics
  - [ ] Logic and Algorithms aka Planning and Optimization
  - [ ] Reasoning
  - [ ] Emotion
  - [ ] Consciousness
We still have a lot of data to crunch, but it's not nearly enough, so we're also going to have to collect and generate a lot more of it. Some of these items require data that we don't even know how to collect yet. Barring some kind of disastrous event, draconian regulation, or politically/culturally motivated demonization of ML, I don't see GPU demand dropping any time soon.

pipo234 · a year ago
Extrapolation is dangerous. People tend to overestimate what's possible in a year while underestimating the possibilities of the next decade. So far we're mostly seeing huge investments hoping for short-term goals, while we're not sure whether long-term goals (like Logic and Algorithms aka Planning and Optimization) would even benefit from more compute. Maybe, yes.

Shedding some hindsight on earlier extrapolations: the billions poured into the metaverse or self-driving didn't yield the results we expected in the period we expected.

throwaway4aday · a year ago
While I agree that we don't want to extrapolate too much, I disagree that this type of exploration may not benefit from more compute. We won't know until we try, and since we have what seems to be a very generalizable architecture, it makes sense to take the brute-force approach of creating models of that data by scaling the amount of data and the amount of compute we dedicate to it. If it turns out not to work, then we've learned something. As it turns out, Logic and Algorithms has already seen some early success using Transformers (Searchformer): https://arxiv.org/abs/2402.14083
user432678 · a year ago
“Completely autonomous self-driving cars next year”, — every self-driving startup CEO in 2015. As someone said in the comments above, it’s a miserable way of life, but I’m still very pessimistic about this extrapolation.
fragmede · a year ago
We can quibble where Waymo falls on the "completely autonomous" scale of things, but self driving cars are here, 9 years after 2015.
throwaway4aday · a year ago
I didn't say anything about a timeline for _solving_ these, just that the short timeline for a drop-off in demand for compute is unfounded, since there is still so much ground to cover. The article takes the shortsighted view that the current state of text generation feels like it's in a lull (I strongly disagree with this for a variety of reasons, chief among them being that 1. the supposed stall in progress hasn't gone on long enough to call it, and 2. the big players are all focusing on productization and on making the current SOTA as cheap as possible to improve their bottom line and expand its applicability), but there are a large number of other domains and sub-domains where these techniques can be applied and will likely see similarly rapid advances as the amount of available data increases.
jononor · a year ago
The virtual-world parts, such as video, 3D, and plausible physics, I think we are going to do as well as image/audio/text within the next 10 years. Maybe even 5. These things primarily need to be believable and mostly-not-directly-wrong to serve a lot of use cases. And the way to get there seems to be "just train it on Internet-scale amounts of data".

Whether we will really have cracked the physical-world connections (physics, genetics, etc.) well enough to use them for physical products and changes, I am less sure. Many use cases like medicine require not just correctness, but also a degree of verifiability. It is being worked on a lot, with many promising results. But the just-scale-the-training-data strategy seems less viable here, both because relevant data is less prevalent and because it may not give the needed level of correctness.

mschuster91 · a year ago
I'd also add protein folding and interactions to the list of "pretty much solved" after AlphaFold 2 [1].

[1] https://deepmind.google/technologies/alphafold/

Yizahi · a year ago
Some of these are not like the others
danielmarkbruce · a year ago
Plus we will re-crunch it a bajillion times and run inference a bajillion times.
CharlieDigital · a year ago
We're really at the cusp of gen AI and we've barely scratched the surface.

Two Reddit threads really highlight this.

- ~10 years ago: https://www.reddit.com/r/StableDiffusion/comments/y9zxj1/you...

- Today: https://www.reddit.com/r/StableDiffusion/comments/1f0b45f/fl...

The upgrade in throughput from GPT-4 to GPT-4o and GPT-4o Mini actually unlocked use cases for the startup I'm at.

People that think demand for GPU compute capacity is going to decrease are probably wrong in the same way that people who thought the demand for faster processors and more RAM would wane were wrong. We are just barely at the start of finding the use cases and how to eat those GPU cycles.

    > The need for specialist hardware, he observed, is a sign of the "brute force" phase of AI, in which programming techniques are yet to be refined and powerful hardware is needed. "If you cannot find the elegant way of programming … it [the AI application] dies," he added.
The thing is that even if there is an elegant and efficient programmatic/algorithmic solution, having more and faster hardware only makes it better and pushes the limits even more.

dartos · a year ago
> We're really at the cusp of gen AI and we've barely scratched the surface.

What makes you say that? I don’t really see a trend of AI generated content getting better, just more players in the space.

I think we’re at the peak of AI gen. I doubt we’ll see much improvement in quality (it’s already pretty good, and it seems like all the low-hanging fruit is gone), just more specialized models. Maybe some better tooling to give artists more control.

Having seen it grow more and more since 2016, when GANs started making fairly realistic human faces, this seems like the end goal already.

HarHarVeryFunny · a year ago
From a machine learning perspective, "generative AI" embraces any generative method, so includes diffusion-based image generation as well as text-generating LLMs, but in the world of C-suite execs "GenAI" really refers to LLMs which they dream will replace developers, customer service agents, etc, etc.
marcosdumay · a year ago
Both yours and the GP's are weird things to believe about a problem.

You seem to be talking about a technical solution, but naming it after the problem.

CharlieDigital · a year ago

    > I don’t really see a trend of AI generated content getting better
You see that second link as the endpoint? That there's nowhere to go from there? What about having a holodeck-type experience with Apple Vision Pro? Literally generating any scenario you want? Downloading generated scenarios and customizing them however you want in real time?

Entire animation workflows changed from animating models to using voice and text to describe scenes and actions.

Lowering the barrier of digital film making to the same level and ease of use as photo editing apps today -- even easier.

You really think that the second link is the peak of gen AI? You really think that nothing else and no more major industry shifts are going to happen when gen AI gets cheaper, faster, algorithms get better, and hardware gets more powerful?

fwip · a year ago
By "gen AI" do you mean "generative" or "general?" They're very different statements.
CharlieDigital · a year ago
"General" typically refers to AGI[0].

"Generative" is what "gen AI" typically refers to.

This is pretty standard nomenclature.

[0] https://en.wikipedia.org/wiki/Artificial_general_intelligenc...

Vox_Leone · a year ago
I agree. While GPUs have been indispensable for training and deploying large language models, their role in the field of computer vision is becoming increasingly critical. The expanding applications, increasing complexity of models, real-time processing needs, growing dataset sizes, and other factors, seem to indicate that the demand for GPUs is likely to rise in the mid term.
CharlieDigital · a year ago
Take a device like the Apple Vision Pro or Meta Quest: imagine that you can procedurally generate any experience, holodeck-style, from just a few instructions, then fine-tune the generated world. We're not even close.

Imagine YouTube except you can fully generate shorts. Realistic, 2D animated, 3D, whatever your imagination desires. Imagine how that changes storytelling and content creation.

More GPUs, please.

ActorNightly · a year ago
The interesting thing is that at some point, it's very possible that LLMs (or rather, just transformer-based large models) are going to be used to create ASICs, or at least to program much cheaper FPGAs for specific applications, so in fact GPU use will still decline.
janwas · a year ago
I acknowledge Jevons, and that demand may increase. But the original point was actually that specialized HW will be outcompeted by newer algorithms on more flexible HW. I have seen this play out several times, for example with brute-force Sum of Absolute Differences instructions that no video codec uses anymore, because a sparse search is faster than brute-force HW. Personal opinion: this will happen again. Why should we focus on how to fill GPU cycles, rather than invent the future of how to do better?
daveguy · a year ago
The premise of this whole article is that once general purpose computing can do what the GPUs can then demand for GPUs will drop. That is a fundamentally flawed assumption. The ability to parallelize operations using a GPU will always be available and GPU development will continue. Hardware tech (process nodes, etc) that improves CPUs will also improve GPUs. Maybe we will reach a peak demand, but not until individual GPUs and CPUs are in millisecond token inference range. And that won't happen for a long time. The author is erroneously conflating GPU and ASIC development.

To be clear, I agree that LLMs are not anywhere close to AGI and I don't think they ever will be (just a component). But that doesn't mean they aren't useful enough to chew up a lot of compute for the foreseeable future.

danjl · a year ago
The reason GPUs are useful is that they allow us to stay closer to Moore's curve. Power is the problem, not AI algorithms. AI algorithms are a tensor-processing best case right now. It took us decades to develop widely parallel algorithms for graphics rendering. AI will develop new algorithms, and those algorithms will be optimized to be more sparse, which will make them less of an ideal case for widely parallel hardware. But we will likely need both serial and widely parallel cores to handle any future algorithm.
HarHarVeryFunny · a year ago
> To be clear, I agree that LLMs are not anywhere close to AGI and I don't think they ever will be (just a component). But that doesn't mean they aren't useful enough to chew up a lot of compute for the foreseeable future.

Sure, but the question is how much compute? What if we're reaching the asymptotic limit of scaling (but still with some post-training and dataset curation gains to be had), and existing datacenters go from being used for training to inference instead. How long before another data center needs to be built or updated with latest NVIDIA GPUs? Is this (LLM-based AI) just a GPU-upgrade market, or still one growing explosively ?

janwas · a year ago
Brute force hardware locks us into the local minimum of matmul. Stella Nera demonstrated better power efficiency than GPUs, using approx matmul. Maybe the next generation of special purpose HW can help, too. But they come after 2-4 years. Who knows what we will want then? I'd rather have flexible HW.
mensetmanusman · a year ago
You will need 10-100x the number of GPUs to get video working. If GPUs crash in price, video will take off and then GPUs will be scarce again.
devinprater · a year ago
I think of it like speech synthesizers. First they were their own machines, then cards you plugged into a computer; then, once people figured out how to mash human speech together, they were in some cases a good 1.5 GB. Now the Siri voices used with the VoiceOver screen reader, which are tons better than the concatenative models, are a good 70 MB; Google TTS offline voices, even though it's awful and laggy with TalkBack, are a good 30 MB for a language pack; and in iOS 18 we can use our own voices as VoiceOver voices. So I think eventually we'll figure out how to run amazing AI stuff, even better than today's, on our devices. And I think tons more people are working on LLMs than were ever working on TTS systems.
bluGill · a year ago
GPUs have been in most computers for decades now. Vector operations have long been known to be useful for a lot of different tasks. Many of those tasks are long-running enough that shunting them off to a different core has long made sense. Thus GPUs have been in everything, and computer manufacturers have long been trying to figure out how to use those GPUs for workloads that don't need their full graphics power. For some tasks GPUs are better; for others, CPUs with vector operations are better. There is enough room for both on modern computers, and this doesn't look to change.