b4rtazz commented on Deepseek R1 Distill 8B Q40 on 4 x Raspberry Pi 5   github.com/b4rtaz/distrib... · Posted by u/b4rtazz
NitpickLawyer · 6 months ago
As always, take those t/s stats with a huge boulder of salt. The demo shows a question "solved" in < 500 tokens. Still amazing that it's possible, but you'll get nowhere near those speeds when dealing with real-world problems at real-world useful context lengths for "thinking" models (8-16k tokens). Even EPYCs with lots of memory channels go down to 2-4 t/s after ~4096 tokens of context.
b4rtazz · 6 months ago
I checked how it performs over a long run (prediction) on 4 x Raspberry Pi 5:

* pos=0 => P 138 ms S 864 kB R 1191 kB Connect

* pos=2000 => P 215 ms S 864 kB R 1191 kB .

* pos=4000 => P 256 ms S 864 kB R 1191 kB manager

* pos=6000 => P 335 ms S 864 kB R 1191 kB the
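
To compare these numbers with the t/s figures discussed above, here is a minimal sketch that converts each line into tokens per second. It assumes that `P` is the prediction time for a single token in milliseconds, which the log itself does not state; the names `samples` and `tokens_per_second` are made up for this illustration.

```python
# (position, P in ms) copied from the four log lines above.
# Assumption: P is the time to predict one token, in milliseconds.
samples = [(0, 138), (2000, 215), (4000, 256), (6000, 335)]


def tokens_per_second(ms_per_token: float) -> float:
    """Convert a per-token latency in milliseconds to tokens per second."""
    return 1000.0 / ms_per_token


rates = [(pos, tokens_per_second(ms)) for pos, ms in samples]

for pos, tps in rates:
    print(f"pos={pos}: {tps:.2f} t/s")
```

Under that assumption, throughput drops from roughly 7.2 t/s at pos=0 to roughly 3.0 t/s at pos=6000.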
