Readit News
jhoho commented on The Danish Ministry of Digitalization Is Switching to Linux and LibreOffice   politiken.dk/viden/tech/a... · Posted by u/nogajun
ktallett · 3 months ago
I found OnlyOffice a good replacement. It is certainly better looking.
jhoho · 3 months ago
It's really beautiful but I stopped using it because of its opaque ties with Russia. https://en.wikipedia.org/wiki/OnlyOffice#Organization
jhoho commented on Show HN: Real-time AI Voice Chat at ~500ms Latency   github.com/KoljaB/Realtim... · Posted by u/koljab
sabellito · 4 months ago
Every time I see these things, they look cool as hell, I get excited, then I try to get them working on my gaming PC (that has the GPU), I spend 1-2h fighting with python and give up.

Today's issue is that my python version is 3.12 instead of <3.12,>=3.9. Installing python 3.11 from the official website does nothing, I give up. It's a shame that the amazing work done by people like the OP gets underused because of this mess outside of their control.

"Just use docker". Have you tried using docker on windows? There's a reason I never do dev work on windows.

I spent most of my career in the JVM and Node, and despite the issues, never had to deal with this level of incompatibility.

jhoho · 4 months ago
Let me introduce you to the beautiful world of virtual environments. They save you the headache of getting a full installation to run, especially when using Windows.

I prefer miniconda, but venv also does the job.
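The venv route sketched above amounts to creating the environment with the interpreter version the project pins (e.g. `python3.11 -m venv .venv`) and activating it. A minimal programmatic sketch using the stdlib `venv` module (paths here are illustrative, not from the original thread):

```python
import os
import sys
import tempfile
import venv

# Create an isolated environment so a project pinned to <3.12,>=3.9
# doesn't clash with whatever the system interpreter happens to be.
# On the command line this is just: python3.11 -m venv .venv
target = os.path.join(tempfile.mkdtemp(), "env")
venv.create(target, with_pip=False)   # pip skipped to keep it quick

print(sys.version_info[:2])   # the version this env inherits
print(os.path.isdir(target))  # env skeleton now exists
```

The key point is that the env inherits the version of the interpreter that created it, so you run `venv` with the exact Python the project requires.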

jhoho commented on AMD's game-changing Strix Halo, formerly Ryzen AI Max, poses for new die shots   tomshardware.com/pc-compo... · Posted by u/rbanffy
chipsa · 7 months ago
More RAM, so less movement of the weights around to generate a token. Most of the speed limit on a LLM is bandwidth of getting the weights around. To a great extent, your token speed is approximately your (model size)/(effective bandwidth). If you need to shuffle the weights into VRAM from main RAM, you halve your speed (bandwidth used both to move into VRAM and out). If you need to pull the weights from disk, even worse.
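A back-of-envelope version of that (model size)/(effective bandwidth) estimate, with illustrative numbers (fp16 weights, roughly 4090-class VRAM bandwidth vs. a quad-channel LPDDR5X system; none of these figures are measurements from the article):

```python
# tokens/s ≈ bandwidth / model size, since every weight must stream
# past the compute units once per generated token.
model_bytes = 70e9 * 2      # 70B params at fp16, 2 bytes each
vram_bw     = 1008e9        # ~1 TB/s, assumed VRAM-class bandwidth
sysram_bw   = 256e9         # ~256 GB/s, assumed LPDDR5X-class bandwidth

tok_vram   = vram_bw / model_bytes    # weights resident in fast memory
tok_sysram = sysram_bw / model_bytes  # weights streamed from system RAM
print(f"{tok_vram:.1f} tok/s vs {tok_sysram:.1f} tok/s")
```

The ratio between the two bandwidths is the ratio between the two token rates, which is why shuffling weights through a slower tier dominates everything else.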
jhoho · 7 months ago
While true, the benchmarks are run not on the Ryzen's NPU but on the much stronger integrated GPU.
jhoho commented on AMD's game-changing Strix Halo, formerly Ryzen AI Max, poses for new die shots   tomshardware.com/pc-compo... · Posted by u/rbanffy
Havoc · 7 months ago
>AMD also claims its Strix Halo APUs can deliver 2.2x more tokens per second than the RTX 4090 when running the Llama 70B LLM (Large Language Model) at 1/6th the TDP (75W).

That, if true, is wild.

jhoho · 7 months ago
It's because of the larger memory capacity - a 70B-parameter model doesn't fit into the 4090's 24GB of VRAM.
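Rough weight-only memory math (ignoring KV cache and activations, which would only add to these numbers) shows why 24GB isn't enough at any common precision:

```python
# Bytes needed for the weights alone at a few common precisions.
params = 70e9
for bits, label in [(16, "fp16"), (8, "int8"), (4, "int4")]:
    gib = params * bits / 8 / 2**30
    print(f"{label}: {gib:.0f} GiB")
# Even 4-bit weights (~33 GiB) exceed a 24 GiB card, so layers spill
# to system RAM and the token rate drops with the slower bandwidth.
```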

u/jhoho · karma 112 · joined October 22, 2019