klelatti · 2 years ago
Nothing against AdaFruit, but this really should link to Roy Longbottom's detailed comparisons [1] that this very short post links to.

[1] http://www.roylongbottom.org.uk/Cray%201%20Supercomputer%20P...

pge · 2 years ago
And the HN discussion of that link - https://news.ycombinator.com/item?id=38758355
klelatti · 2 years ago
Oh wow! Thanks, that's a great discussion.
dang · 2 years ago
Thanks! Macroexpanded:

Cray-1 vs Raspberry Pi - https://news.ycombinator.com/item?id=38758355 - Dec 2023 (180 comments)

Fnoord · 2 years ago
The article mentions "See Longbottom’s extensive tests and comparisons article here." and [1]. This was already mentioned in a snapshot of 18 Jan 2024 [2] so it wasn't added after your criticism.

[1] http://www.roylongbottom.org.uk/Cray%201%20Supercomputer%20P...

[2] https://archive.is/a99i3

klelatti · 2 years ago
No criticism of AdaFruit intended. I meant the HN link should be to the original article - the post that I say the AdaFruit post links to.

jacquesm · 2 years ago
They meant the link at the top here.
milliams · 2 years ago
Specifically, the Raspberry Pi section at http://www.roylongbottom.org.uk/Cray%201%20Supercomputer%20P...
jacquesm · 2 years ago
Indeed, it's a reblog of a reblog of an absolutely excellent article.
dahart · 2 years ago
Another really fun Cray comparison: Turner Whitted (“father of ray tracing”) is rumored to have speculated some time back when he first published on ray tracing that in order to do real time ray tracing, you could put one Cray supercomputer per pixel out in the desert, each one with a single colored light, and view it from an airplane, and that would be roughly enough compute to achieve real time.

A 4090 today is roughly 500,000 times faster, which means we now have achieved one Cray per pixel (!) for an 800x600 image (smaller than images today, but maybe a bit larger than the average image size in the late 70s).
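
A quick check of that arithmetic, as a sketch only: the 500,000x figure is the rough estimate from the comment, not a measurement, and the variable names are made up for illustration.

```python
# One Cray-equivalent per pixel, using the figures quoted in the comment.
speedup_vs_cray = 500_000   # assumed: rough 4090-vs-Cray factor from the comment
width, height = 800, 600

pixels = width * height
crays_per_pixel = speedup_vs_cray / pixels

print(f"{pixels:,} pixels")
print(f"{crays_per_pixel:.2f} Cray-equivalents per pixel")
```

At 800x600 the ratio comes out just over one, which is where the "one Cray per pixel" claim comes from.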

RicoElectrico · 2 years ago
To be honest our real-time RT works only because we're "cheating" with denoisers working in spatial and temporal domain.
dahart · 2 years ago
In a way, yes, partly because we’re typically rendering larger images now (1080p is about 4x larger than 800x600), but also because image quality standards are very high, because scenes are enormous compared to a pair of spheres and a checkerboard plane with a single light, because games don’t get to spend the entire frame budget on rays, and because path tracing uses a lot more stochastic sampling than what Turner Whitted did. We do get the 500,000x relative to Cray, and denoising effectively adds another x-factor on top of that, and that’s what it takes to get today’s games up to passable real-time ray tracing. On the other hand, genuine real-time ray tracing without denoising has been possible for a long time for toy scenes outside of games. It all depends on which goalposts we’re talking about exactly, right?
corysama · 2 years ago
Sure, to get Cyberpunk 2077 at 4K 60FPS, we cheat like hell. But, a scene from 70s ray tracing research at 800x600x24 fps on a 4090 doesn’t need to cheat much.

dekhn · 2 years ago
I have not... heard that idea before. Instead, I heard "reality is just 100 million polygons per second" (Jim Clark?), implying that if you can do ray tracing at a high resolution and frame rate, you can fool somebody's optical nerves into thinking they are looking at reality (ignoring the difference between pixel screens and the physics of how human vision actually works).

Does anybody have a reliable link to the Whitted quote?

dahart · 2 years ago
My source for the story is indirect, via Steve Parker, who’s worked with Turner.

https://www.youtube.com/live/LUFp6sjKbkE?si=8vcxo-Vp8oeRUnob

Scrub to 3:52:19 for the Turner Whitted story.

BTW, mostly unrelated, but scrub that video to 5:47:10 for an amazing talk by Ivan Sutherland (“father of computer graphics”) that is not about graphics (he politely refuses to talk about graphics anymore :P), but about the active research he’s been doing (at 85 years old) into Single Flux Quantum circuits (an alternative to CMOS).

Also- Jim Clark at 4:48:40

Jim’s idea you mention is still correct & compatible with Turner’s idea. Jim’s point is that we only need to render finite pixels. You might need an x-factor more triangles than pixels because of sampling and depth complexity and secondary lighting, so 100M polys is probably in the ball park, as long as we can quickly pick the right 100M polys in real time…

In a way, Turner was talking about a lower bound while Jim is talking about an upper bound. They're slightly different things, but both relate to how much compute is needed for real-time rendering.

antirez · 2 years ago
Am I the only one who is shocked by the fact that a 1978 computer, even if a supercomputer (but still using the technology of the time), was 1/4 the speed of a Raspberry Pi? The Pi, if you look at the big picture of computing, is a very fast computer. For comparison: you can run a 1-billion-parameter LLM on a Raspberry Pi at decent speed. This means that the Cray could run it, even if slowly. That's incredible.
throwbadubadu · 2 years ago
Find it similarly amazing, yes.

> This means that the Cray could run it

Gladly, we had better things to do than that :D

But seriously, while we could maybe have run it speed-wise, it definitely lacked the memory, no? And anyone who tried to train it back then wouldn't be finished today. But it would make a fun backwards sci-fi story: imagine a time traveller who brought the '80s an LLM from today. What would the world say and do with that slow oracle?

dekhn · 2 years ago
I can't find it now but I wrote a bit of fanfic where Ken Thompson sent an LLM back in time (referencing his "Love, Ken" UNIX tapes that he would send out) to save humanity. He was always a bit ahead of his time.
samstave · 2 years ago
May you please do me (and others) a favor:

What value does an LLM hold intrinsically?

Let's say "brought an LLM from today".

Does that mean just a multi-gig file? What is INSIDE the LLM that would be of value? How does one speak to an LLM WRT '80s tech, and what could one glean from it?

ELI5 an LLM;

BARD: https://i.imgur.com/ahRVECz.png

OpenAI: https://i.imgur.com/Rbk5BD6.png

Bing: https://i.imgur.com/zVJ1tu6.png

--

So, how would one explain to '80s folks what an LLM even is when we can't even ELI5 it in 2024?

mrb · 2 years ago
"1/4 the speed of a Pi" applies to the original (slow) 2012 Pi which is unable to run LLM as fast as you think. However the 2020 Pi 400 (equivalent to Pi 4), which can run the LLM workload, is about 100 times faster than the Cray 1:

"Raspberry Pi ARM CPUs - The comment above was for the 2012 Pi 1. In 2020, the Pi 400 average Livermore Loops, Linpack and Whetstone MFLOPS reached 78.8, 49.5 and 95.5 times faster than the Cray 1." http://www.roylongbottom.org.uk/Cray%201%20Supercomputer%20P...

A Pi 4 can infer ~0.8 tokens/sec with some of the more optimized configs (as per https://www.dfrobot.com/blog-13498.html). So the Cray would have needed ~2 minutes per token, so ~2.5 hours to generate one sentence... if hypothetically it had enough RAM (it didn't).

In 1978 RAM cost about $25k per megabyte (https://jcmit.net/memoryprice.htm). Assuming you needed 4GB for inference, RAM would have cost $100M in 1978 dollars, or $470M in today's dollars.

For comparison, the Cray cost $7M in 1978, which is $32M in today's dollars. So once you had bought a Cray you would have had to spend 14 times that amount on building a custom 4GB RAM extension, somehow hooked to the Cray, to finally be able to generate one sentence every 2.5 hours...

But in 1978, even if RAM was available to do LLM inference, it would have been impossible to train the model, as vastly more compute power is needed than for inference.
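
The inference estimate above can be reproduced as a back-of-envelope sketch. Every input is a figure quoted in the comment (nothing is measured), and 4 GB is taken as 4 x 1024 MB, which is an assumption about the rounding.

```python
# Back-of-envelope version of the estimate; all inputs are quoted figures.
pi4_tokens_per_sec = 0.8        # Pi 4 inference rate quoted in the comment
pi4_vs_cray = 100               # "about 100 times faster than the Cray 1"
ram_cost_per_mb_1978 = 25_000   # USD per megabyte in 1978
ram_needed_mb = 4 * 1024        # assumed: 4 GB needed for inference
cray_price_1978 = 7_000_000     # USD in 1978

seconds_per_token = pi4_vs_cray / pi4_tokens_per_sec
ram_cost_1978 = ram_cost_per_mb_1978 * ram_needed_mb
ram_to_cray_ratio = ram_cost_1978 / cray_price_1978
tokens_in_2_5_hours = 2.5 * 3600 / seconds_per_token

print(f"{seconds_per_token:.0f} s per token (~{seconds_per_token / 60:.1f} min)")
print(f"RAM: ${ram_cost_1978 / 1e6:.1f}M in 1978 dollars")
print(f"RAM costs {ram_to_cray_ratio:.1f}x the Cray itself")
print(f"2.5 hours buys about {tokens_in_2_5_hours:.0f} tokens")
```

With these inputs the RAM comes to slightly over $100M and the ratio lands between 14x and 15x, so the rounded numbers in the comment hold up.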

thoughtsimple · 2 years ago
That was on a 700 MHz Raspberry Pi 1. On an 1800 MHz Raspberry Pi 400 NEON SIMD the difference was another order of magnitude.

[QUOTE] Comparison - The three 700 MHz Pi 1 main measurements (Loops, Linpack and Whetstone) were 55, 42 and 94 MFLOPS, with the four gains over Cray 1 being 8.8 times for MHz and 4.6, 1.6, 15.7 times for MFLOPS.

The 2020 1800 MHz Pi 400 provided 819, 1147 and 498 MFLOPS, with MHz speed gains of 23 times and 69, 42 and 83 times for MFLOPS. With more advanced SIMD options, the 64 bit compilation produced Cray 1 MFLOPS gains of 78.8, 49.5 and 95.5 times.[/QUOTE]
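
As a sanity check on the quoted numbers, dividing each MFLOPS score by its stated gain should give back roughly the same Cray 1 baseline for both boards. This is only a sketch: the dictionary layout and the function name are illustrative, and the values are copied from the quote.

```python
# (MFLOPS, quoted gain over Cray 1) for Livermore Loops, Linpack, Whetstone.
pi1 = {"Loops": (55, 4.6), "Linpack": (42, 1.6), "Whetstone": (94, 15.7)}
pi400 = {"Loops": (819, 69), "Linpack": (1147, 42), "Whetstone": (498, 83)}

def implied_cray_mflops(results):
    """Divide each score by its quoted gain to recover the Cray 1 baseline."""
    return {name: mflops / gain for name, (mflops, gain) in results.items()}

from_pi1 = implied_cray_mflops(pi1)
from_pi400 = implied_cray_mflops(pi400)

for name in pi1:
    print(f"{name}: {from_pi1[name]:.1f} vs {from_pi400[name]:.1f} MFLOPS")
```

Both sets of figures imply a Cray 1 baseline of roughly 12, 26 to 27, and 6 MFLOPS for the three benchmarks, so the quoted gains are internally consistent.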

d_sem · 2 years ago
Remarkable, yes. Shocking, no. Exponential growth was something the computer industry experienced for decades, and people were quite normalized to it.
klelatti · 2 years ago
Would we have been able to train the LLM in the first place though? Guessing that that would have been completely infeasible?
samstave · 2 years ago
This guy Time Travels. (check his hands, he likely has extra fingers)

But... let's look at the availability of DATA in the '80s.

Frankly, this is how hacking/phreaking was invented.

Dumpster-diving for line-printer discards to understand what their systems did.

(This is an actual story: people were bin-dipping in (AT&T?) dumpsters and finding exploits (social or electronic) in the discarded line-printer output.)

Can someone validate that comment?

--

bombcar · 2 years ago
The supercomputers of the time were very heavily designed to run floating-point operations (IIRC) and so while the FPU performance might be comparable, I'm not sure a Cray could be used as a "1/4 speed Pi" for general computing things like running Linux.
criddell · 2 years ago
I used a Cray around 1998 (from the Pittsburgh Supercomputing Center IIRC) and it was super fast on very particular tasks. Specifically, there was some type of processing pipeline that once you had it set up, it would produce a stream of calculations very quickly.

I wonder if the Raspberry Pi is faster on all tasks, or is there some type of computation where the old Cray is still competitive?

kevin_thibedeau · 2 years ago
The shocking thing is that every contemporary PC and handheld device would have placed on the TOP500 list in the '90s, yet they're still burdened with slow software when doing basic operations.
rightbyte · 2 years ago
Yeah, the fastest computer I will ever own was a 200 MHz Pentium with 32 MB of RAM running Windows 95.
JeffSnazz · 2 years ago
Given that it also functioned as sort of an uncomfortable couch—not really.

Besides, it's way further behind in basically every respect but compute.

GartzenDeHaes · 2 years ago
The other factor is RAM, which is more problematic. The Cray-1 had up to 4 Meg WORDS RAM, or 2 Meg as we would measure today (I think).
otabdeveloper4 · 2 years ago
Cray was 64 bit, IIRC, so 4 megawords would be 32 megabytes.
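
The word-to-byte conversion is simple enough to spell out. A minimal sketch, assuming "4 megawords" means 4 x 2^20 words of 64 bits each:

```python
# Memory size in bytes = number of words x bytes per word.
WORD_BITS = 64            # word size stated in the comment
words = 4 * 2**20         # assumed: "4 megawords" read as 4 x 2^20 words

total_bytes = words * WORD_BITS // 8
print(total_bytes // 2**20, "MiB")
```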
mytailorisrich · 2 years ago
> “The Raspberry Pi ... is more than 4.5 times faster than the Cray 1.”

The applications have grown as well.

The Cray 1 was used for mundane tasks like "large-scale scientific applications, such as simulating complex physical phenomena, and was sold to government and university laboratories." [1] But the power of the Raspberry Pi allows for cutting edge computing tasks like "watering plants, monitoring the birds in your yard, or for a smart doorbell!" [2]

[1] https://www.britannica.com/topic/Cray-1

[2] https://picockpit.com/raspberry-pi/the-7-most-common-uses-fo...

flenserboy · 2 years ago
Imagine the pain of programming a Cray-1 to do the work of a Raspberry Pi in those use instances.
bombcar · 2 years ago
Or using a Raspberry Pi to simulate nuclear explosions and the weather.
d_sem · 2 years ago
The absurdity of having a replica Cray-1 for the sole purpose of watering my plants is too amazing not to entertain.
reaperducer · 2 years ago
I know it's only January, but this strikes me as Comment of the Year.
taf2 · 2 years ago
I’m pretty sure it was here that I read how Cray would drive on a family trip and insist the kids stay silent while he drove and designed much of the Cray in his head on the drive… if anyone has a link, I would love to re-read that story.
jtlienwis · 2 years ago
By the time the Cray-1 was designed, Seymour Cray's children were in college. This might have applied to when he was designing the CDC 6600, when the children were younger. I know this because the Cray kids were a couple of years ahead of me in high school. There are a lot of stories like this put out by John Rollwagen to build up Seymour's design creds.
jermaustin1 · 2 years ago
It's actually mildly surprising to me that the Raspberry Pi is only 4.5x faster. I would have bet 10-20x faster just because of how much time has passed and all the talk about "your cellphone is 1000x faster than the Apollo computers" that I've been hearing since the time of the T-Mobile Sidekick.

I've always taken those claims with a grain of salt, but even if they were only an order of magnitude off, a Pi is loads faster than a Sidekick. And sure, the Cray is loads faster than the Apollo computers, but I wouldn't have thought it was THAT much faster.

I am amazed.

snek_case · 2 years ago
The Cray 1 was way ahead of its time. It implemented SIMD vector instructions to speed up numerical computation, was liquid cooled, and ran at 80 MHz (in 1976). The chief architect, Seymour Cray, was kind of a genius, and is responsible for designing many other pioneering machines.

Also, fun fact, it didn't have a CPU. It used all discrete logic chips, and was wired by hand, with lots and lots of wire. IIRC Seymour Cray liked to hire women to do the wiring job, because they had an easier time fitting inside the computer core to wire it, and doing detailed work because they had smaller hands.

msla · 2 years ago
> Also, fun fact, it didn't have a CPU. It used all discrete logic chips, and was wired by hand,

That's how CPUs were built back then. What it didn't have was a single-chip CPU, or a microprocessor.

Next up: "The Model T Ford didn't have an engine, it had this gasoline-burning device to provide motive power."

drfuchs · 2 years ago
Lots and lots of wire, yes, with each one cut to a specific length so that all of each gate's input signals arrive at the same time. That's why the backplane looks like a mess of wires hanging down due to seemingly being way too long.
peterfirefly · 2 years ago
> SIMD vector instructions

Err...

That term is used to describe modern vector instructions that are NOT vector instructions in the classical Cray sense. Modern CPUs use wide registers that can be regarded as vectors of 2/4/8/... values. The Cray-1 used variable-length vectors held in dedicated vector registers of up to 64 elements each, with the length set per operation.

I'm not saying that "SIMD vector instructions" is an incorrect description of what the Cray machines did, I'm saying that the term usually means something else today.

heelix · 2 years ago
The fluorinert coolant was really neat stuff too. Inert, so you could submerge electronics in it to cool them. I was able to find recycled coolant and used that to do the liquid cooled build for my dual celeron system where coolant flowed over the board/cpu. Knowing what I do now... I should have been far less cavalier with the stuff.
whizzter · 2 years ago
And that kinda makes it unfair to the Raspberry, maybe not the RPI1 but I'm pretty sure that the RPi4 (rated at 78x the Cray) running code in compute-mode would thrash the Cray 1 even harder (yes, not entirely comparable but since you probably didn't get full throughput on the Cray with regular C code then using the GPU on the RPi should be fair game).
tass · 2 years ago
You’re not wrong, since the 4.5x number comes from a comparison with the rPi 1.

The rPi 4 is over 10x faster than the 1.

The original article points out that the raspberry pi 400, with the benchmark targeting 64 bits, was about 80x Cray for the same benchmark which was 4.5x for the rPi 1.

Raspberry Pi 5 advertises a 2-3x performance gain over earlier versions, though this wasn’t benchmarked in these tests.

postmodest · 2 years ago
Aren't these also in FLOPS? The RPi isn't exactly a floating-point monster. Hell, until the Pentium, Intel CPUs weren't guaranteed to even have an FPU.

As to ARM in general, here's a post about how ARM-1 compared to the 387: https://retrocomputing.stackexchange.com/questions/24826/did...

robin_reala · 2 years ago
The Cray-1 was ~160,000,000 FLOPS. The Apollo Guidance Computer was about 14,000 FLOPS, so about 11,500× slower.
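
For reference, the ratio works out as below. This is a sketch that just divides the two figures quoted in the comment; neither is independently verified here.

```python
# Ratio of the two FLOPS figures quoted in the comment.
cray1_flops = 160_000_000   # Cray-1 figure from the comment
agc_flops = 14_000          # Apollo Guidance Computer figure from the comment

ratio = cray1_flops / agc_flops
print(f"Cray-1 is about {ratio:,.0f}x faster")
```

The exact quotient is a little under 11,500, so the rounded figure in the comment is in the right range.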
jermaustin1 · 2 years ago
I guess it was THAT much faster. That is an insane leap in performance in a decade. I know they are apples and oranges. One weighed about 150x more than the other, but still such a surprise to me.
bee_rider · 2 years ago
It is against a Pi 1.

It looks like the Cray gets a bit of a boost from the linpack scores (the pi is only 1.6x faster!), which is a good test for the Cray (understatement!).

flavius29663 · 2 years ago
Apollo was not the fastest computer in the world. It was a device small enough to shuttle rockets around the moon and back.
spintin · 2 years ago
"We now have a C64 as powerful as the Cray-1" is what you could say if you traveled back in time.
aw4y · 2 years ago
the main difference is that I cannot hold 10 crays in my desk drawer waiting to use them :P
Frenchgeek · 2 years ago
https://www.chrisfenton.com/homebrew-cray-1a/ (You'll still need a pretty big drawer)