[4] against, 1.2GHz quad-core Intel Core i7-based MacBook Air
The goals don’t matter.
The people don’t matter.
The only thing that matters is how much regulatory red tape is involved.
My guess is that the paperwork will kill this. Read the announcement: there is too much discussion of the regulatory framework. In the US or China, all you need is some money and smart people. That's a very low barrier to getting started.
No way
Complete hardware + software setup for running DeepSeek-R1 locally. The actual model, no distillations, and Q8 quantization for full quality. Total cost: $6,000. All download and part links below:
Motherboard: Gigabyte MZ73-LM0 or MZ73-LM1. We want 2 EPYC sockets to get a massive 24 channels of DDR5 RAM to max out that memory size and bandwidth. https://t.co/GCYsoYaKvZ
CPU: 2x any AMD EPYC 9004 or 9005 CPU. LLM generation is bottlenecked by memory bandwidth, so you don't need a top-end one. Get the 9115 or even the 9015 if you really want to cut costs https://t.co/TkbfSFBioq
RAM: This is the big one. We are going to need 768GB (to fit the model) across 24 RAM channels (to get the bandwidth to run it fast enough). That means 24 x 32GB DDR5-RDIMM modules. Example kits: https://t.co/pJDnjxnfjg https://t.co/ULXQen6TEc
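The arithmetic behind those numbers can be sketched roughly like this. The DDR5-4800 transfer rate and the 8-byte (64-bit) channel width are assumptions for illustration, since the thread does not state a memory speed, and the bandwidth figure is a theoretical peak, not a measurement.

```python
def total_capacity_gb(modules: int, gb_per_module: int) -> int:
    """Total installed RAM in GB."""
    return modules * gb_per_module


def peak_bandwidth_gb_s(channels: int, mega_transfers_per_s: int,
                        bytes_per_transfer: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s across all channels.

    bytes_per_transfer=8 assumes a 64-bit data path per channel.
    """
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000


# 24 modules of 32 GB, one per channel (figures from the thread)
print(total_capacity_gb(24, 32))        # 768
# DDR5-4800 is an assumed speed, not one the thread specifies
print(peak_bandwidth_gb_s(24, 4800))    # 921.6
```

Real-world throughput will land below the peak figure, so treat it as an upper bound when comparing configurations.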
Case: You can fit this in a standard tower case, but make sure it has screw mounts for a full server motherboard, which most consumer cases won't. The Enthoo Pro 2 Server will take this motherboard: https://t.co/m1KoTor49h
PSU: The power use of this system is surprisingly low! (<400W) However, you will need lots of CPU power cables for 2 EPYC CPUs. The Corsair HX1000i has enough, but you might be able to find a cheaper option: https://t.co/y6ug3LKd2k
Heatsink: This is a tricky bit. AMD EPYC is socket SP5, and most heatsinks for SP5 assume you have a 2U/4U server blade, which we don't for this build. You probably have to go to eBay/AliExpress for this. I can vouch for this one: https://t.co/51cUykOuWG
And if you find the fans that come with that heatsink noisy, replacing with 1 or 2 of these per heatsink instead will be efficient and whisper-quiet: https://t.co/CaEwtoxRZj
And finally, the SSD: Any 1TB or larger SSD that can fit R1 is fine. I recommend NVMe, just because you'll have to copy 700GB into RAM when you start the model, lol. No link here, if you got this far I assume you can find one yourself!
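To see why the drive type matters at startup, here is a back-of-the-envelope load-time estimate. The read speeds below (3.5 GB/s for NVMe, 0.55 GB/s for SATA) are assumed round numbers for illustration, not figures from the thread, and sustained reads on a real drive may be slower.

```python
def load_time_seconds(model_gb: float, read_gb_per_s: float) -> float:
    """Time to stream the model from disk at a sustained read speed."""
    if read_gb_per_s <= 0:
        raise ValueError("read speed must be positive")
    return model_gb / read_gb_per_s


MODEL_GB = 700  # approximate size of the weights, from the thread

# Assumed sustained sequential read speeds (hypothetical drives)
print(load_time_seconds(MODEL_GB, 3.5))   # NVMe: 200.0 seconds
print(load_time_seconds(MODEL_GB, 0.55))  # SATA: over 20 minutes
```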
And that's your system! Put it all together and throw Linux on it. Also, an important tip: Go into the BIOS and set the number of NUMA groups to 0. This will ensure that every layer of the model is interleaved across all RAM chips, doubling our throughput. Don't forget!
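One way to sanity-check the BIOS change after booting Linux is to count the NUMA nodes the kernel exposes. This is a minimal sketch that assumes the setting above results in a single reported node; that expectation is my assumption, not something the thread states. The sysfs path is the standard Linux location for NUMA node directories.

```python
import os
import re


def count_numa_nodes(sysfs_node_dir: str = "/sys/devices/system/node") -> int:
    """Count entries named node0, node1, ... in the given sysfs directory."""
    pattern = re.compile(r"^node\d+$")
    return sum(1 for name in os.listdir(sysfs_node_dir) if pattern.match(name))


if __name__ == "__main__":
    nodes = count_numa_nodes()
    print(f"NUMA nodes visible to Linux: {nodes}")
    if nodes > 1:
        print("More than one node reported; the BIOS setting may not have applied.")
```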
Now, software. Follow the instructions here to install llama.cpp https://t.co/jIkQksXZzu
Next, the model. Time to download 700 gigabytes of weights from @huggingface! Grab every file in the Q8_0 folder here: https://t.co/9ni1Miw73O
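With a download this large, it can be worth checking that every part arrived before trying to load it. The sketch below assumes the files follow the split naming pattern DeepSeek-R1.Q8_0-00001-of-00015.gguf with 15 parts in total; the function names and the ./ directory are hypothetical, so adjust the prefix and count to match what you actually downloaded.

```python
from pathlib import Path


def expected_shards(prefix: str = "DeepSeek-R1.Q8_0", total: int = 15) -> list[str]:
    """File names for a split GGUF model: <prefix>-00001-of-00015.gguf, ..."""
    return [f"{prefix}-{i:05d}-of-{total:05d}.gguf" for i in range(1, total + 1)]


def missing_shards(directory: str, prefix: str = "DeepSeek-R1.Q8_0",
                   total: int = 15) -> list[str]:
    """Return the expected shard names that are not present in directory."""
    folder = Path(directory)
    return [name for name in expected_shards(prefix, total)
            if not (folder / name).is_file()]


if __name__ == "__main__":
    missing = missing_shards(".")
    print("All shards present" if not missing else f"Missing: {missing}")
```

This only checks that the files exist, not that they downloaded completely, so a size or checksum comparison would be a sensible next step.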
Believe it or not, you're almost done. There are more elegant ways to set it up, but for a quick demo, just do this:
llama-cli -m ./DeepSeek-R1.Q8_0-00001-of-00015.gguf --temp 0.6 -no-cnv -c 16384 -p "<|User|>How many Rs are there in strawberry?<|Assistant|>"
If all goes well, you should witness a short load period followed by the stream of consciousness as a state-of-the-art local LLM begins to ponder your question:
And once it passes that test, just use llama-server to host the model and pass requests in from your other software. You now have frontier-level intelligence hosted entirely on your local machine, all open-source and free to use!
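As a rough sketch of what "passing requests in" could look like, the snippet below posts to the OpenAI-compatible chat endpoint that llama-server exposes. The address http://127.0.0.1:8080 assumes llama-server was started with its default host and port, and the helper names build_chat_request and ask are made up for this example.

```python
import json
import urllib.request


def build_chat_request(prompt: str, base_url: str = "http://127.0.0.1:8080",
                       temperature: float = 0.6) -> urllib.request.Request:
    """Build a POST request for the /v1/chat/completions endpoint."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,  # same value as the llama-cli demo
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, **kwargs) -> str:
    """Send the prompt to a running llama-server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    # Requires llama-server to be running with the model loaded.
    print(ask("How many Rs are there in strawberry?"))
```

Because the endpoint follows the OpenAI chat format, existing client libraries that accept a custom base URL should also work.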
And if you got this far: Yes, there's no GPU in this build! If you want to host on GPU for faster generation speed, you can! You'll just lose a lot of quality from quantization, or if you want Q8 you'll need >700GB of GPU memory, which will probably cost $100k+
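For a sense of scale on the GPU route, here is a quick count of how many cards it would take just to hold the weights. The 80 GB per card is an assumed figure for illustration, and the estimate ignores any extra memory for context and runtime overhead, so the real requirement would be higher.

```python
import math


def gpus_needed(model_gb: float, gb_per_gpu: float) -> int:
    """Minimum number of GPUs whose combined memory holds the weights."""
    if gb_per_gpu <= 0:
        raise ValueError("GPU memory must be positive")
    return math.ceil(model_gb / gb_per_gpu)


# ~700 GB of Q8 weights (from the thread) on assumed 80 GB cards
print(gpus_needed(700, 80))  # 9
```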
> Product icon is required
If you inspect the IndexedDB when logged in, you can see that everything is stored locally already. Offline mode was planned from the very beginning. It will and can already work offline if I spare a couple of days on it. But I didn't see it as a priority right now.
Building this for myself mainly, but hoping others might find it useful. Still very early and building out the bare essentials, but then the hope is to keep reading marketing books and use that to improve the platform.