deepnotderp · 2 years ago
I’m not a fan of Extropic, but I’m seeing a lot of misconceptions here.

They’re not building “a better rng” - they’re building a way to bake probabilistic models into hardware and then run inference on them using random physical fluctuations. Theoretically this means much faster inference for things like PGMs (probabilistic graphical models).

See here for similar things: https://arxiv.org/abs/2108.09836

There’s a company called Normal Computing that did something similar: https://blog.normalcomputing.ai/posts/2023-11-09-thermodynam...
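
To make the “inference via random fluctuations” idea concrete, here’s a toy software sketch of my own (nothing from Extropic’s litepaper) of sampling-based inference on a tiny Ising-style model. The hardware pitch, as I understand it, is that physical noise would perform the equivalent of these random updates natively, instead of a CPU looping over pseudo-random draws:

    import numpy as np

    rng = np.random.default_rng(0)

    # Symmetric couplings J and biases h define the energy of a spin state s.
    J = np.array([[0.0, 1.0, -0.5],
                  [1.0, 0.0, 0.8],
                  [-0.5, 0.8, 0.0]])
    h = np.array([0.1, -0.2, 0.05])

    s = rng.choice([-1, 1], size=3)  # random initial spin configuration

    def gibbs_step(s):
        for i in range(len(s)):
            field = h[i] + J[i] @ s  # local field felt by spin i (J[i, i] is 0)
            p_up = 1.0 / (1.0 + np.exp(-2.0 * field))  # P(s_i = +1 | rest)
            s[i] = 1 if rng.random() < p_up else -1
        return s

    # Draw samples; their statistics approximate the model's marginals.
    samples = np.array([gibbs_step(s).copy() for _ in range(5000)])
    print("estimated P(s_i = +1):", (samples == 1).mean(axis=0))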

winwang · 2 years ago
Skimmed the litepaper. Has the flavor of: you can do "simulated" annealing by literally annealing. I like the idea of using raw physics as a "hardware" accelerator, i.e. analog computing. fwiw, quantum computing can be seen as a form of analog computing.

I do think that a "better rng" can be interesting and useful in and of itself.

Thanks for the Normal Computing post, it felt more substantial.

pclmulqdq · 2 years ago
I make a better RNG right now (https://arbitrand.com).

We experimented with doing ML training with it, but it's not clear that it trains any better than a non-broken PRNG. It might be fun to feed the output into stable diffusion and see how cool the pictures are, though.

apognwsi · 2 years ago
with error correction, qc is entirely distinct from analog computing. that is what makes it even remotely viable, theoretically.
lumost · 2 years ago
It did make me curious, however: if we dropped the requirement that operations return correct values in favor of probably-correct values, would we see any material computing gains in hardware? Large neural models are intrinsically error-correcting and stochastic.

I’m unfortunately not familiar enough with hardware to weigh in.

IshKebab · 2 years ago
The trouble is if you use actual randomness then you lose repeatability which is an incredibly useful property of computers. Have fun debugging that!

What you want is low precision with stochastic rounding. Graphcore's IPUs have that and it's a really great feature. It lets you use really low precision number formats but effectively "dithers" the error. Same thing as dithering images or noise shaping audio.
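
For anyone who hasn't seen it, stochastic rounding is easy to sketch in software (a generic illustration of the idea, not Graphcore's actual hardware implementation): round up or down with probability proportional to proximity, so the rounding error is zero in expectation and "dithers" away over many accumulations.

    import numpy as np

    rng = np.random.default_rng(0)

    def stochastic_round(x, step=0.25):
        # Round x to a multiple of `step`, choosing up or down with probability
        # proportional to proximity, so the error is zero in expectation.
        scaled = np.asarray(x) / step
        lower = np.floor(scaled)
        frac = scaled - lower
        return (lower + (rng.random(scaled.shape) < frac)) * step

    x = np.full(10000, 0.1)
    print("round-to-nearest mean:", (np.round(x / 0.25) * 0.25).mean())  # 0.0, biased low
    print("stochastic-round mean:", stochastic_round(x).mean())          # ~0.1, unbiased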

throwawaymaths · 2 years ago
So it sounds like this startup is explicitly not using foundation models?

Is there any evidence that such a probabilistic model can run better than a state of the art model?

Or alternatively what would it take to convert an existing model (let's say, an easy one like llama2-7b) into an extropic model?

p1esk · 2 years ago
> Is there any evidence that such a probabilistic model can run better than a state of the art model?

No, but they got $15M in seed funding anyway.

autonomousErwin · 2 years ago
I wouldn't want to write this off, because you get the feeling these guys are onto something that could be hugely important (ignoring the quantum-this, thermodynamic-that), but it feels like they need to get to the point a lot faster, e.g.

"We're taking a new approach to building chips for AI because transistors can't get any smaller."

I really don't know what they gain by convoluting the point, and it's pretty hard to follow what the CEO is talking about half the time.

pclmulqdq · 2 years ago
Quantum computing people have been selling this exact spiel (including the convoluted talking points) for decades and it keeps working at getting funded. It has not produced any results for the rest of us, though.
duped · 2 years ago
One difference is that baking mathematical models into electronic analogs is older than integrated circuits. The reason we deviated from that model is that the reprogrammability and cost of general-purpose digital computers were way more economical than expensive and temperamental single-purpose analog computers built as bespoke hardware. The unit economics basically killed analog computing. What Extropic (and others) have identified is that in the case of machine learning, the pendulum might have to swing back, because we do have a large-scale need for bespoke hardware. We'll see if they're right.

Quantum computing has been exploring an entirely new model of computation, for which it's hard even to articulate the problems it can solve. Whereas using analog computers in place of digital ones is already well defined.

pillusmany · 2 years ago
Neither has fusion research produced anything for us yet. Should we stop funding it?
_sword · 2 years ago
The tech could be really cool if, e.g., classifiers could be represented within the probability space modeled on their hardware. However, their shaman-speak isn't confidence-inducing.
zoogeny · 2 years ago
Your summary seems to miss a later quote from the article:

> Extropic is also building semiconductor devices that operate at room temperature to extend our reach to a larger market. These devices trade the Josephson junction for the transistor. Doing so sacrifices some energy efficiency compared to superconducting devices. In exchange, it allows one to build them using standard manufacturing processes and supply chains, unlocking massive scale.

So, their mass-market device is going to be based on transistors.

The actual article read like a weird mix of techno-babble and startup evangelism to me. I can't judge whether what they are suggesting is vaporware or hyperbole. This is one of those cases where they are either way ahead of my own thinking or they are trying to bamboozle me with jargon.

I personally find it hard to categorize a lot of AI hype into "worth actually looking into" vs. "total waste of time". The best I can do in this case is suspend my judgement, and if they come up again with something more substantive than a rambling post, I can always readjust.

schiffern · 2 years ago
> trying to bamboozle me with jargon

Am I the only one who thought the article was clear, lucid, and reasonably concise?

The company's success or failure will depend on execution, but the value proposition is quite sound. Maybe I've just spent too much time in the intersection between information theory, thermodynamics, and signal processing...

"Don't splurge on high SNR ('digital') hardware just to re-introduce noise later." == "Don't dig a hole and fill it in again. You waste energy twice!"

semi-extrinsic · 2 years ago
> Doing so sacrifices some energy efficiency compared to superconducting devices.

In most applications superconductivity does not actually yield better energy efficiency at the system level, since it turns out cooling stuff to several hundred degrees below zero is quite energy-demanding.

riwsky · 2 years ago
Convolutional neural networks were a huge advancement in their time
autonomousErwin · 2 years ago
I don't disagree. I just come away from the article feeling more confused than enlightened and excited about what they're building.

It even makes me think that they don't understand what they're talking about, which is why they're using complicated terminology to mask it. But I'm hopeful I'm wrong and this is an engineering innovation that benefits everyone.

vipshek · 2 years ago
I have no idea about the merits of this approach, but I found this interview with the founders made a lot more sense than the linked article:

https://twitter.com/Extropic_AI/status/1767203839818781085

blueblimp · 2 years ago
This was definitely easier to follow.

Since they're building a special-purpose accelerator for a certain class of models, what I'd like to see is some evidence that those models can achieve competitive performance (once the hardware is mature). Namely, simulate these models on conventional hardware to determine how effective they are, then estimate what the cost would be to run the same model on Extropic's future hardware.

Eliezer · 2 years ago
Ah, but running an experiment like that risks it returning an answer you don't like.
huevosabio · 2 years ago
Much, much better. The first minute or so explains what they are trying to do and why, in a way that I can understand.

This interview makes me much more excited and less skeptical than Verdon's usual mumbo-jumbo jargon. He should try using simpler, more humble language more often.

blovescoffee · 2 years ago
This interview makes their product seem like BS. First, they literally cannot explain the problem or the solution simply. Regardless, their pitch is that they're building a more power-efficient probability distribution sampler. No one in AI research thinks that's a bottleneck.

edit: btw, the bottlenecks in AI algos are matrix multiplies and memory bandwidth.

HarHarVeryFunny · 2 years ago
My take on the Garry Tan interview (which seems pretty clear, regardless of whether this is snake oil or not) is that Extropic are building low-power analog chips because we're hitting up against the limits of Moore's Law (the limits of physics in shrinking transistors), and at the same time the power consumption of LLM/AI training and inference is starting to get out of hand.

So, their solution is to embrace the stochastic operation of smaller chip geometries where transistors become unreliable, and double down on it by running the chips at low power where the stochasticity is even worse. They are using an analog chip design/architecture of some sort (presumably some sort of matmul equivalent?) with a "full-stack" design whereby they have custom software to run neural nets on their chips, taking advantage of the fact that neural nets can tolerate, and utilize, randomness.

ninjin · 2 years ago
Computationally, yes, those are the bottlenecks. But I would also add supervised training data, as we can never get enough of it, and it is one of the few things that increases in compute are not able to solve (to my mind; you could argue that by scaling unsupervised training further we could do away with it, but I am not yet convinced).
duped · 2 years ago
My understanding is that the goal of these approaches is to avoid those bottlenecks.
throwawaymaths · 2 years ago
Vaguely, though, what they are talking about sounds like it might be better for training? (I'm really stretching it here.)
jason-phillips · 2 years ago
And Lex's podcast/interview with Guillaume Verdon, one of said founders.

https://m.youtube.com/watch?v=8fEEbKJoNbU&pp=ygUVbGV4IGZyaWR...

throwawaymaths · 2 years ago
Anyone else get super creepy vibes from the way he talks in this video? I'm calling it: this is a fraud.

If it is a fraud, how do people like this get funded?? (And how can I be creepier so that my real ideas get funded?)

ein0p · 2 years ago
People need to read Hamming’s old papers in which he very clearly explains why analog circuits are not viable at scale. This is also why the brain uses spikes rather than continuous signals. The issue is noise, interference, and attenuation. There’s no way to get around this. If they have invented a way, I’d like to see it. But until it’s demonstrated, I’d take such things with a large grain of salt.
Animats · 2 years ago
You can re-quantize analog signals into a finite number of levels to prevent noise accumulation. That's how TLC (8 levels) and QLC (16 levels) flash memory cells work. The cells store an analog value, but it's forced to a value close to one of N discrete values. The same approach is used in modems.

Deep learning doesn't seem to need that much numerical precision. People started with 32-bit floats, then 16-bit floats, now sometimes 8-bit floats, and recently there are people talking up 2-bit trinary. The number of levels needed may not be too much for analog. If you have a regenerator once in a while to slot values back to the allowed discrete levels, you can clean up the noise. That's an analog to digital to analog conversion, of course.
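
A toy numerical sketch of that regeneration idea (my own illustration, not any particular flash or analog-ML design): as long as the noise picked up between regenerators stays below half a level spacing, snapping back to the nearest allowed level keeps errors from accumulating.

    import numpy as np

    rng = np.random.default_rng(0)

    levels = np.linspace(0.0, 1.0, 8)   # 8 discrete levels, like a TLC cell
    value = levels[5]                   # the stored "analog" value

    for stage in range(100):
        value += rng.normal(0.0, 0.02)  # each analog stage adds a little noise
        if stage % 4 == 3:              # periodic regenerator: snap to nearest level
            value = levels[np.argmin(np.abs(levels - value))]

    print("after 100 noisy stages:", value, "| original level:", levels[5])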

That's not what these guys are talking about, as far as I can tell.

twobitshifter · 2 years ago
Analog circuits are making a comeback because they are great for simulating the equations of the physical world more efficiently than a digital approach. https://spectrum.ieee.org/not-your-fathers-analog-computer
sfnrm · 2 years ago
Sounds interesting. Do you have a link? (or at least a title?)
ein0p · 2 years ago
Not at the moment, but I do recall he has a chapter on this in his book “The Art of Doing Science and Engineering”, which I also recommend. He uses very long transmission lines to explain this, but the same thing applies at the nano scale, and perhaps to an even greater extent due to the much noisier environment and higher frequencies.
binoct · 2 years ago
I really hope this was an experiment in using gen AI:

“Create a website for a new company that is building the next generation of computing hardware to power AI software. Make sure it sounds science-y but don’t be too specific.”

swalsh · 2 years ago
Why make such a low-effort, pessimistic comment? What happened to HN?
adw · 2 years ago
HN has always been a tense standoff between a few cliques, the first two being the ostensibly intended audience:

* competent and curious engineers

* entrepreneurs, who live on a continuum where one end is...

* ...hucksters and snake-oil purveyors, of which there are plenty, and

* (because this is the Internet) conspiracy theorists and other such loons

and recently

* political provocateurs

You can make a thread work (for that group of people) if it self-selects who reads it. Unfortunately, AI is catnip to all five of these groups, so the average thread quality is exceptionally low – it serves all five groups badly.

Whether some of these people _should_ be served well is a separate question.

catchnear4321 · 2 years ago
same thing that is happening everywhere. cognitive effort is getting unbalanced: it used to be necessary to put effort in on write, and on read.

now, it is hard to tell who put effort in at all. read or write.

would you consider your own response to be optimistic or high effort?

g8oz · 2 years ago
Snark has always been part of this website.
ks2048 · 2 years ago
I think pointing out BS is an important part of a useful forum.
amusedcyclist · 2 years ago
^ How to spot a sucker

adfm · 2 years ago
The use of “full-stack” was the first thing I noticed. Everyone, please stop using that term. I’m pretty sure, with a high degree of certainty, you don’t know what it means. If you do, there’s a merit badge waiting for you. And can we please stop using “hallucinations” to describe output. Yes, it may look like your tool dropped acid, but that’s not what it is.
dekhn · 2 years ago
Rob Pike once said that he was "full-stack": when he worked on Voyager, he understood the system from quantum mechanics to flight software (https://hachyderm.io/@robpike/109763603394772405)
catchnear4321 · 2 years ago
full-stack means the IC can take any ticket. do the details beyond that matter?
pclmulqdq · 2 years ago
I now think of the "stack" of a modern business as starting with physics and ending with making someone happy (unless you are Oracle). Full-stack engineers should then know how to connect physics to peoples' happiness.
Nevermark · 2 years ago
> can we please stop using “hallucinations” to describe output.

Right. A better word is confabulation.

I.e. pseudomemories, a replacement of a gap in information with false information that is not recognized as such.

thatguysaguy · 2 years ago
Unimportant, but if you're citing Moore's paper, I feel like you're just trying to pad out the references to make it look like you're serious.
gitfan86 · 2 years ago
At a high level, it is the right answer to the data center electricity demand problem, which is that we need to make AI hardware more efficient.

Pragmatically, it doesn't make much sense given that it would take years for this approach to have any real-world use cases in a best-case scenario. It seems way more likely that efficiency gains in digital chips will happen first, making these chips less economically valuable.

kneel · 2 years ago
This guy spends an extraordinary amount of time posting memes and e/acc silliness.

So much so I wonder what the hell they're doing with this company. Is he a prolific poster and an engineering genius? Or is he just another poster?

Bjorkbat · 2 years ago
For the longest time I thought the person behind the account was just some random guy who was probably very into crypto and decided to dabble in AI because of the parallels between e/acc and the whole "to the moon" messaging you find in crypto communities.

Never would have guessed the guy was an actual physicist

trzy · 2 years ago
I have a hard time believing this is legit given how much time the CEO spends goofing around on social media. If it were possible to short startups, this would be a top candidate.
jp42 · 2 years ago
Honestly, it's too early to say this. Considering the people who invested in this startup, it's better to assume the CEO is capable. If he's not able to deliver in a reasonable timeline, then we're all free to blame him for posting things on SM. Actually, many people know his company because he's goofing around on SM, especially the e/acc stuff.
danielmarkbruce · 2 years ago
It's more interesting to see who passed on it. There isn't a single top-tier VC here.

This whole pitch sounds like the usual quantum computing babble.