Got the impression that a bazillion dollars' worth of GPUs are required to run a cutting-edge chatbot? Think again. Matthew Carrigan, an engineer at AI tools outfit HuggingFace, claims that you can run the hot new DeepSeek R1 LLM on just $6,000 of PC hardware. The kicker? You don't even need a high-end GPU.
Carrigan's suggested build involves a dual-socket AMD EPYC motherboard and a couple of compatible AMD chips to go with it. Apparently, the spec of the CPUs isn't actually that critical. Instead, it's all about the memory.
«Complete hardware + software setup for running Deepseek-R1 locally. The actual model, no distillations, and Q8 quantization for full quality. Total cost, $6,000. All download and part links below,» Carrigan posted on January 28, 2025.
«We are going to need 768GB (to fit the model) across 24 RAM channels (to get the bandwidth to run it fast enough). That means 24 x 32 GB DDR5-RDIMM modules,» Carrigan explains.
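For a rough sense of why bandwidth, not CPU grunt, is the bottleneck, here's a back-of-the-envelope sketch in Python. The per-channel figure assumes DDR5-4800 RDIMMs, and the per-token read assumes DeepSeek R1's mixture-of-experts design activates roughly 37B parameters per token at Q8 (about one byte per weight); neither number comes from Carrigan's post.

```python
# Back-of-the-envelope estimate of CPU inference speed from memory bandwidth.
# All figures below are illustrative assumptions, not from Carrigan's post.

channels = 24
gbps_per_channel = 38.4    # DDR5-4800: 4800 MT/s x 8 bytes per transfer
aggregate_bw = channels * gbps_per_channel    # ~921.6 GB/s across both sockets

active_params_billion = 37    # DeepSeek R1 is MoE: ~37B params active per token
bytes_per_weight = 1          # Q8 quantization is roughly one byte per weight
gb_read_per_token = active_params_billion * bytes_per_weight    # ~37 GB/token

ceiling = aggregate_bw / gb_read_per_token
print(f"Theoretical ceiling: ~{ceiling:.0f} tokens/s")    # ~25 tokens/s
# Real-world speeds land well below this ceiling (Carrigan reports 6-8 tokens/s),
# but the point stands: the active weights stream from RAM for every token.
```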
Links are helpfully provided and the RAM alone comes to about $3,400. Then you'll need a case, a PSU, a mere 1 TB SSD, and some heatsinks and fans.
Indeed, Carrigan says this setup gets you the full DeepSeek R1 experience with no compromises. «The actual model, no distillations, and Q8 quantization for full quality,» he explains.
From there, simply «throw» on Linux, install llama.cpp, download 700 GB of weights, input a command line string Carrigan helpfully provides and Bob's your large language model running locally, as they say.
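Carrigan's actual command string isn't reproduced here, but to give a flavor of the workflow, this is a minimal sketch using llama-cpp-python, the Python bindings for llama.cpp. The model filename, thread count, and context size are placeholder assumptions, not Carrigan's settings:

```python
# Minimal sketch: loading a GGUF build of the model with llama-cpp-python
# (pip install llama-cpp-python). Paths and parameters are hypothetical.
from llama_cpp import Llama

llm = Llama(
    model_path="./DeepSeek-R1-Q8_0.gguf",  # placeholder name for the ~700 GB weights
    n_ctx=4096,       # context window; remember, long chat histories cost speed
    n_threads=64,     # spread the work across the EPYC cores
)

output = llm("Why does LLM inference need so much memory bandwidth?",
             max_tokens=128)
print(output["choices"][0]["text"])
```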
Notable in all this is the complete absence of any mention of expensive Nvidia GPUs. So what gives? Well, Carrigan provides a video of the LLM running locally on this setup, plus a rough performance metric.
«The generation speed on this build is 6 to 8 tokens per second, depending on the specific CPU and RAM speed you get, or slightly less if you have a long chat history. The clip above is near-realtime, sped up slightly to fit video length limits,» he says.