Comparing the 1970's Cray-1 supercomputer against the Raspberry Pi

Am I the only one who is shocked that a 1978 computer, even a supercomputer (but still built with the technology of the time), was 1/4 the speed of a Raspberry Pi? The Pi, if you look at the big picture of computing, is a very fast computer. For comparison: you can run a 1-billion-parameter LLM on a Raspberry Pi at decent speed. This means that the Cray could run it, even if slowly. That's incredible.

throwbadubadu · 2 years ago

Find it similarly amazing, yes.

> This means that the Cray could run it

Gladly, we had better things to do than that :D

But seriously, while it could maybe have run it speed-wise, it definitely lacked the memory, no? And if anyone had tried to train it back then, they still wouldn't be finished today. But it would make a fun backwards sci-fi story: imagine a time traveller who brought an LLM from today back to the '80s. What would the world say and do with that slow oracle?

dekhn · 2 years ago

I can't find it now, but I wrote a bit of fanfic where Ken Thompson sent an LLM back in time (referencing his "Love, Ken" UNIX tapes that he would send out) to save humanity. He was always a bit ahead of his time.

samstave · 2 years ago

Could you please do me (and others) a favor:

What value does an LLM hold intrinsically?

Let's say "brought an LLM from today".

Does that mean just a multi gig file? What is INSIDE the LLM that would be of value? How does one speak to an LLM WRT 80's tech, and what could one glean from it....

ELI5 an LLM;

BARD: https://i.imgur.com/ahRVECz.png

OpenAI: https://i.imgur.com/Rbk5BD6.png

Bing: https://i.imgur.com/zVJ1tu6.png

So, how would one explain to 80s folks what an LLM even is when we can't even ELI5 it in 2024?

mrb · 2 years ago

"1/4 the speed of a Pi" applies to the original (slow) 2012 Pi, which cannot run an LLM as fast as you think. However, the 2020 Pi 400 (equivalent to a Pi 4), which can run the LLM workload, is about 100 times faster than the Cray 1:

"Raspberry Pi ARM CPUs - The comment above was for the 2012 Pi 1. In 2020, the Pi 400 average Livermore Loops, Linpack and Whetstone MFLOPS reached 78.8, 49.5 and 95.5 times faster than the Cray 1." http://www.roylongbottom.org.uk/Cray%201%20Supercomputer%20P...

A Pi 4 can infer ~0.8 tokens/sec with some of the more optimized configs (as per https://www.dfrobot.com/blog-13498.html). So the Cray would have needed ~2 minutes per token, so ~2.5 hours to generate one sentence... if hypothetically it had enough RAM (it didn't).
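
As a sanity check on the estimate above, here is a minimal sketch of the arithmetic. It assumes inference speed scales linearly with the benchmark ratio (a big simplification) and that a sentence is roughly 75 tokens; that token count is an assumption chosen to reproduce the ~2.5 hour figure, not a number from the linked sources.

```python
# Figures quoted in this thread.
PI4_TOKENS_PER_SEC = 0.8   # Pi 4 inference rate from the linked blog post
CRAY_SLOWDOWN = 100        # "about 100 times faster than the Cray 1"

# Assumption (not from the thread): tokens in one generated sentence.
TOKENS_PER_SENTENCE = 75


def seconds_per_token(pi_tokens_per_sec: float, slowdown: float) -> float:
    """Time per token on the slower machine, assuming linear scaling."""
    return slowdown / pi_tokens_per_sec


cray_seconds_per_token = seconds_per_token(PI4_TOKENS_PER_SEC, CRAY_SLOWDOWN)
cray_hours_per_sentence = cray_seconds_per_token * TOKENS_PER_SENTENCE / 3600

print(f"{cray_seconds_per_token:.0f} s per token")
print(f"{cray_hours_per_sentence:.1f} h per sentence")
```

The per-token time comes out at roughly two minutes; the per-sentence time depends entirely on the assumed sentence length.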

In 1978 RAM cost about $25k per megabyte (https://jcmit.net/memoryprice.htm). Assuming you needed 4GB for inference, RAM would have cost $100M in 1978 dollars, or $470M in today's dollars.

For comparison, the Cray cost $7M in 1978 which is $32M in today's dollars. So once you buy a Cray you would have had to spend 14 times that amount on building a custom RAM device extension of 4GB, somehow hooked to the Cray, to finally be able to generate one sentence every 2.5 hours...
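
The cost comparison can be sketched the same way. The prices are the ones quoted above; the inflation factor is simply the one implied by the $100M to $470M conversion, and 4 GB is treated as 4096 MB, so the result lands slightly above the rounded figures in the comment.

```python
# 1978 figures quoted in this thread.
RAM_DOLLARS_PER_MB_1978 = 25_000
CRAY_PRICE_1978 = 7_000_000

# Assumptions: 4 GB needed for inference, counted as 4096 MB;
# inflation factor implied by the thread's $100M -> $470M conversion.
MODEL_RAM_MB = 4 * 1024
INFLATION_FACTOR = 4.7

ram_cost_1978 = RAM_DOLLARS_PER_MB_1978 * MODEL_RAM_MB
ram_cost_today = ram_cost_1978 * INFLATION_FACTOR
ram_to_cray_ratio = ram_cost_1978 / CRAY_PRICE_1978

print(f"RAM in 1978 dollars: ${ram_cost_1978 / 1e6:.1f}M")
print(f"RAM in today's dollars: ${ram_cost_today / 1e6:.0f}M")
print(f"RAM cost / Cray cost: {ram_to_cray_ratio:.1f}x")
```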

But in 1978, even if RAM was available to do LLM inference, it would have been impossible to train the model, as vastly more compute power is needed than for inference.

thoughtsimple · 2 years ago

That was on a 700 MHz Raspberry Pi 1. On an 1800 MHz Raspberry Pi 400 NEON SIMD the difference was another order of magnitude.

[QUOTE] Comparison - The three 700 MHz Pi 1 main measurements (Loops, Linpack and Whetstone) were 55, 42 and 94 MFLOPS, with the four gains over Cray 1 being 8.8 times for MHz and 4.6, 1.6, 15.7 times for MFLOPS.

The 2020 1800 MHz Pi 400 provided 819, 1147 and 498 MFLOPS, with MHz speed gains of 23 times and 69, 42 and 83 times for MFLOPS. With more advanced SIMD options, the 64 bit compilation produced Cray 1 MFLOPS gains of 78.8, 49.5 and 95.5 times.[/QUOTE]
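
One way to cross-check the quoted numbers is to divide each Pi score by its stated gain and see what Cray 1 baseline that implies. This is only a sketch using the figures in the quote; the dictionary layout and function name are made up for illustration.

```python
# (MFLOPS, stated gain over Cray 1) per benchmark, copied from the quote above.
PI_1 = {"Livermore Loops": (55, 4.6), "Linpack": (42, 1.6), "Whetstone": (94, 15.7)}
PI_400 = {"Livermore Loops": (819, 69), "Linpack": (1147, 42), "Whetstone": (498, 83)}


def implied_cray_mflops(results: dict) -> dict:
    """Cray 1 MFLOPS implied by each (score, gain) pair."""
    return {name: mflops / gain for name, (mflops, gain) in results.items()}


from_pi_1 = implied_cray_mflops(PI_1)
from_pi_400 = implied_cray_mflops(PI_400)

for name in PI_1:
    print(f"{name}: {from_pi_1[name]:.1f} vs {from_pi_400[name]:.1f} MFLOPS")
```

Both sets of figures point to roughly the same Cray 1 baseline for each benchmark, which suggests the quoted gains are internally consistent.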

d_sem · 2 years ago

Remarkable, yes. Shocking, no. Exponential growth was something the computer industry experienced for decades, and people were quite normalized to it.

klelatti · 2 years ago

Would we have been able to train the LLM in the first place though? Guessing that that would have been completely infeasible?

samstave · 2 years ago

This guy Time Travels. (check his hands, he likely has extra fingers)

But... let's look at the availability of DATA in the 80s...

Frankly, this is how hacking/phreaking was invented.

Dumpster-diving for discarded line-printer output to understand what their systems did.

(This is an actual story: people were going through (AT&T?) dumpsters and finding exploits (social or electronic) in the discarded line-printer output.)

Can someone validate that comment?

bombcar · 2 years ago

The supercomputers of the time were very heavily designed to run floating-point operations (IIRC) and so while the FPU performance might be comparable, I'm not sure a Cray could be used as a "1/4 speed Pi" for general computing things like running Linux.

criddell · 2 years ago

I used a Cray around 1998 (from the Pittsburgh Supercomputing Center IIRC) and it was super fast on very particular tasks. Specifically, there was some type of processing pipeline that once you had it set up, it would produce a stream of calculations very quickly.

I wonder if the Raspberry Pi is faster on all tasks, or is there some type of computation at which the old Cray is still competitive?

kevin_thibedeau · 2 years ago

The shocking thing is that every contemporary PC and handheld device would place on the TOP500 list in the 90's yet they're still burdened with slow software when doing basic operations.

rightbyte · 2 years ago

Yeah, the fastest computer I will ever own was a 200 MHz Pentium with 32 MB of RAM running Windows 95.

JeffSnazz · 2 years ago

Given that it also functioned as sort of an uncomfortable couch—not really.

Besides, it's way further behind in basically every respect but compute.

GartzenDeHaes · 2 years ago

The other factor is RAM, which is more problematic. The Cray-1 had up to 4 Meg WORDS RAM, or 2 Meg as we would measure today (I think).

otabdeveloper4 · 2 years ago

Cray was 64 bit, IIRC, so 4 megawords would be 32 megabytes.
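
The conversion is just words times bytes per word. A tiny sketch, taking the 4-megaword figure from the parent comment at face value and using binary megabytes:

```python
WORD_BITS = 64  # Cray-1 word size


def words_to_megabytes(words: int, word_bits: int = WORD_BITS) -> float:
    """Convert a word count to binary megabytes (1 MB = 2**20 bytes)."""
    return words * (word_bits // 8) / 2**20


four_megawords_mb = words_to_megabytes(4 * 2**20)
print(f"{four_megawords_mb:.0f} MB")
```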

It is actually mildly surprising to me that the Raspberry Pi is only 4.5x faster. I would have bet 10-20x faster, just because of how much time has passed and all the talk along the lines of "your cellphone is 1000x faster than the Apollo computers" that I've been hearing since the time of the T-Mobile Sidekick.

I've always taken them with a grain of salt, but even if they were only an order of magnitude off, a Pi is loads faster than a sidekick. And sure the Cray is loads faster than the Apollo computers, but I wouldn't have thought it was THAT much faster.

I am amazed.

snek_case · 2 years ago

The Cray 1 was way ahead of its time. It implemented SIMD vector instructions to speed up numerical computation, was liquid cooled, and ran at 80 MHz (in 1976). The chief architect, Seymour Cray, was kind of a genius, and is responsible for designing many other pioneering machines.

Also, fun fact, it didn't have a CPU. It used all discrete logic chips, and was wired by hand, with lots and lots of wire. IIRC Seymour Cray liked to hire women to do the wiring job, because they had an easier time fitting inside the computer core to wire it, and doing detailed work because they had smaller hands.

msla · 2 years ago

> Also, fun fact, it didn't have a CPU. It used all discrete logic chips, and was wired by hand,

That's how CPUs were built back then. What it didn't have was a single-chip CPU, or a microprocessor.

Next up: "The Model T Ford didn't have an engine, it had this gasoline-burning device to provide motive power."

drfuchs · 2 years ago

Lots and lots of wire, yes, with each one cut to a specific length so that all of each gate's input signals arrive at the same time. That's why the backplane looks like a mess of wires hanging down due to seemingly being way too long.

peterfirefly · 2 years ago

> SIMD vector instructions

Err...

That term is used to describe modern vector instructions that are NOT vector instructions in the classical Cray sense. Modern CPUs use wide registers that can be regarded as vectors of 2/4/8/... values. The Cray-1 used vector registers holding up to 64 elements each, with a vector-length register setting how many elements an instruction actually processed.

I'm not saying that "SIMD vector instructions" is an incorrect description of what the Cray machines did, I'm saying that the term usually means something else today.

heelix · 2 years ago

The Fluorinert coolant was really neat stuff too. It is inert, so you could submerge electronics in it to cool them. I was able to find recycled coolant and used it for the liquid-cooled build of my dual-Celeron system, where coolant flowed over the board and CPU. Knowing what I do now... I should have been far less cavalier with the stuff.

whizzter · 2 years ago

And that kinda makes it unfair to the Raspberry Pi. Maybe not the RPi 1, but I'm pretty sure that the RPi 4 (rated at 78x the Cray) running code in compute mode would thrash the Cray 1 even harder. (Yes, it's not entirely comparable, but since you probably didn't get full throughput on the Cray with regular C code, using the GPU on the RPi should be fair game.)

tass · 2 years ago

You’re not wrong, since the 4.5x number comes from a comparison with the rPi 1.

The rPi 4 is over 10x faster than the 1.

The original article points out that the raspberry pi 400, with the benchmark targeting 64 bits, was about 80x Cray for the same benchmark which was 4.5x for the rPi 1.

Raspberry Pi 5 advertises a 2-3x performance gain over earlier versions, though this wasn’t benchmarked in these tests.

postmodest · 2 years ago

Aren't these also in FLOPS? The RPi isn't exactly a floating-point monster. Hell, until the Pentium, Intel CPUs weren't guaranteed to even have an FPU.

As to ARM in general, here's a post about how ARM-1 compared to the 387: https://retrocomputing.stackexchange.com/questions/24826/did...

robin_reala · 2 years ago

The Cray-1 was ~160,000,000 FLOPS. The Apollo Guidance Computer was about 14,000 FLOPS, so about 11,500× slower.
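
For anyone who wants to check the ratio, a quick sketch using the two figures above as given (the FLOPS numbers themselves are the comment's, not independently verified here):

```python
CRAY_1_FLOPS = 160_000_000  # figure quoted above
AGC_FLOPS = 14_000          # figure quoted above

speed_ratio = CRAY_1_FLOPS / AGC_FLOPS
print(f"Cray-1 / AGC: about {speed_ratio:,.0f}x")
```

The exact quotient is a little under 11,500, so the rounded figure holds up.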

jermaustin1 · 2 years ago

I guess it was THAT much faster. That is an insane leap in performance in a decade. I know they are apples and oranges. One weighed about 150x more than the other, but still such a surprise to me.

bee_rider · 2 years ago

It is against a Pi 1.

It looks like the Cray gets a bit of a boost from the Linpack scores (the Pi is only 1.6x faster!), which is a good test for the Cray (understatement!).

flavius29663 · 2 years ago

The Apollo computer was not the fastest computer in the world. It was a device small enough to shuttle rockets around the moon and back.