It's crazy that Google doesn't spin out their TPU work as a separate company.
TPUs are the second most widely used environment for training after Nvidia. It's the only environment that people build optimized kernels for outside CUDA.
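For context on what "outside CUDA" means in practice: TPU kernels are typically reached through JAX and the XLA compiler rather than hand-written CUDA. A minimal sketch below (function name and shapes are my own illustration) — `jax.jit` lowers the function through XLA, the same compiler path that targets TPUs, and falls back to CPU/GPU when no TPU is attached:

```python
import jax
import jax.numpy as jnp

# jax.jit traces this function and compiles it with XLA -- the compiler
# stack that targets TPUs. On a machine without a TPU it runs on CPU/GPU.
@jax.jit
def attention_scores(q, k):
    # Scaled dot-product scores: the kind of op TPU backends optimize.
    return jnp.einsum("...qd,...kd->...qk", q, k) / jnp.sqrt(q.shape[-1])

q = jnp.ones((2, 4, 8))  # (batch, seq, head_dim), toy sizes
k = jnp.ones((2, 4, 8))
scores = attention_scores(q, k)
print(scores.shape)  # (2, 4, 4)
```

The same source runs unchanged across backends; which accelerator it targets is decided by the XLA runtime, not the model code.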
If it were separate from Google, there are a bunch of companies who would happily spend real money on a working Nvidia alternative.
It might be profitable from day one, and it surely would gain substantial market capitalization - Alphabet shareholders should be agitating for this!
People bring this point up every two weeks here. The cost competitiveness of TPUs for Google comes exactly from the fact that they make them in house and don't sell them: they don't need sales channels, support, leads, any of that. They can design for exactly one software stack, one hardware stack, and one set of staff. You cannot magically spin up a billion-dollar hardware company overnight, complete with software, customers, sales channels, and support.
Nvidia has spent 20 years on this which is why they're good at it.
> If it were separate from Google, there are a bunch of companies who would happily spend real money on a working Nvidia alternative.
Unfortunately, most people really don't care about Nvidia alternatives -- they care about price, above all else. People will say they want Nvidia alternatives and will support them, then go back to buying Nvidia the moment the price goes down. Which is fine, to be clear, but it is not the outcome people often allude to.
You can, or at least historically could, buy access to TPUs, and request it for non-profit projects through the TPU Research Cloud program. You have certainly been able to pay for a Colab Pro membership to get TPU access, which is how much of the AI generation before ChatGPT learned to run models. TPUs, however, were always geared toward training, never inference.
That would be the point of spinning it out. They could have an IPO, raise as much capital as there is in the observable Universe, and build enough fabs to satisfy all the demand.
> It's crazy that Google doesn't spin-out their TPU work as a separate company.
Not really. Google TPUs require Google's specific infrastructure and cannot be deployed outside a Google datacenter. The software is Google-specific; the monetization model is Google-specific.
We also have no idea how profitable TPUs would actually be as a separate company. The only customers for TPUs are Google and Google Cloud.
Impressive: “Overall, more than 60% of funded generative AI startups and nearly 90% of gen AI unicorns use Google Cloud’s AI infrastructure, including Cloud TPUs.”
That's not surprising, given JG and Ruoming's Google stints.
Google is going to dominate the LLM-ushered AI era. Google has been AI-first since 2016; they just didn't have an opening. Sam, inept at engineering, has no idea how to navigate the delicate business and engineering competition.
The real winner here is the marketing department, who managed to make this article a "celebration of successes" when in fact we know the TPU is yet another of Google's biggest failures: having the lead by a mile and then squandering it. And no, "it's on our cloud and Pixel phones" doesn't cut it at this level.
I have a strong suspicion that previous generations of TPU were not cost effective for decent AI, which would explain Google's reluctance to release complex models. They have had superior translation for years, for example. But scaling it up to the world's population? Not possible with TPUs.
It was OpenAI that showed you can actually deploy a large model, like GPT-4, to a large audience. Maybe Google, with only internal use, didn't reach the cost efficiency that NVIDIA does.
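The cost-efficiency question can be made concrete with back-of-envelope arithmetic. Every number below is a hypothetical placeholder, not Google's or OpenAI's actual figures; the point is only that serving cost scales with accelerator price divided by decode throughput:

```python
# Back-of-envelope serving economics; every number here is an assumption.
chip_cost_per_hour = 3.00        # $/accelerator-hour (hypothetical)
tokens_per_sec_per_chip = 500    # decode throughput (hypothetical)
users = 100_000_000              # audience size (hypothetical)
tokens_per_user_per_day = 1_000  # per-user usage (hypothetical)

tokens_per_chip_per_day = tokens_per_sec_per_chip * 86_400
chips_needed = users * tokens_per_user_per_day / tokens_per_chip_per_day
daily_cost = chips_needed * chip_cost_per_hour * 24

print(f"chips: {chips_needed:,.0f}, cost/day: ${daily_cost:,.0f}")
```

Whether a TPU or a GPU wins at this game comes down to which side of that ratio — hardware cost or achievable throughput — its stack improves.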
Google used to have superior translation, but that hasn't been the case for years now. Based on my experience, DeepL (https://www.deepl.com/) is vastly superior, especially for even slightly more niche languages. I'm a native Finnish speaker and I regularly use DeepL to translate Finnish into English in cases where I don't want to do it by hand, and the quality is just way beyond anything Google can do. I've had similar experiences with languages I'm less proficient in but still understand to an extent, such as French or German.
I suspect it had much more to do with lacking product-market fit. They spent 10 years faking demos and dreaming about what they thought AI could eventually do, but since it never worked, the products never shipped, and so they never expanded. A well-optimized TPU will always beat a well-optimized GPU on efficiency.
I'm not saying it is easy or can be done magically.
Just noting that Groq (founded by the TPU creator) did exactly this.
Between that and the fact Google already sells "Coral Edge TPUs" [1] I'd think they could manage to untangle things.
Whether the employees would want to be spun off or not is a different matter, of course...
[1] https://coral.ai/products/
AWS and Azure (to a lesser extent) can also make this argument.
See https://cloud.google.com/blog/products/compute/introducing-t... and https://cloud.google.com/blog/topics/systems/the-evolution-o...
https://www.youtube.com/watch?v=nR74lBO5M3s
(note the lede on the TPU is buried pretty deep here)