I work with ML professionally, almost all of it in the cloud. I just wanted something “off grid” and unmetered, and I needed a computer anyway, so I decided to pay a bit more and get the one I want. It’s “personal” in that it’s exclusively for me, but I have a business and bought it for that.
Still figuring out the best software. So far it looks like llama.cpp with Vulkan, though I have a lot of experimenting to do and don’t currently find it optimal for what I want.
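For anyone curious what that setup looks like in practice, here's a minimal sketch assuming the llama-cpp-python bindings built against the Vulkan backend (the parent may well be driving the llama-cli/llama-server binaries directly; model path and parameters are placeholders):

    # Assumes llama-cpp-python was installed with the Vulkan backend enabled, e.g.:
    #   CMAKE_ARGS="-DGGML_VULKAN=on" pip install llama-cpp-python
    # The model file and settings below are illustrative, not the parent's actual setup.
    from llama_cpp import Llama

    llm = Llama(
        model_path="models/llama-3.1-70b-instruct-q4_k_m.gguf",  # hypothetical GGUF file
        n_gpu_layers=-1,   # offload all layers to the GPU
        n_ctx=8192,        # context window
    )

    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Summarize this in one sentence: ..."}],
        max_tokens=128,
    )
    print(out["choices"][0]["message"]["content"])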
What is your target use case? Curious what feels suboptimal about llama.cpp + Vulkan so far.
What's your stack?
And none of that hardware can run larger models. Smaller ones, tiny ones, or highly quantized versions of larger ones, sure. Or do you have something important to say?
Our stack changes per project, adapting to client needs and infra: Llama 70B on a Mac Studio M1 with Ollama in 2024, vLLM on 4xH100 private cloud for larger deployments. Most recently, we've been working on a custom workstation with 2x RTX PRO 6000 Blackwell Max-Q + 1.1TB DDR5 to run larger models locally using SGLang and KTransformers.
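For the 4xH100 tier, a rough sketch of what tensor-parallel serving with vLLM looks like (the model ID and sampling settings are illustrative, not necessarily what we run for clients):

    # Illustrative only: shards a 70B-class model across 4 GPUs with tensor parallelism.
    from vllm import LLM, SamplingParams

    llm = LLM(
        model="meta-llama/Llama-3.1-70B-Instruct",  # placeholder model id
        tensor_parallel_size=4,                     # one shard per H100
    )
    params = SamplingParams(temperature=0.7, max_tokens=256)
    outputs = llm.generate(["Explain tensor parallelism in two sentences."], params)
    print(outputs[0].outputs[0].text)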
The question isn't rhetorical; I'm trying to understand whether the demand we see in regulated sectors is the whole market or if there's broader adoption I'm missing.
Hire learners, or hire people who teach people (evaluate new tools, write guides, conduct training, mentor, etc.).
I’m curious how you assess developers’ ability to leverage these tools efficiently during the recruiting process. Any tips to share? Any lessons learned from experience?
Also hard to justify the $50k capex compared to just hitting the Anthropic API. You'd need massive volume to break even on that hardware, especially once you factor in electricity. Seems like overoptimization unless you have strict data privacy needs.
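A rough back-of-envelope to put numbers on that; everything except the $50k capex is an assumption, not a quote:

    # Break-even sketch; every figure except the $50k capex is assumed for illustration.
    hardware_cost = 50_000            # workstation capex (from the comment above)
    api_price_per_mtok = 15.0         # assumed blended $/1M tokens via the API
    power_draw_kw = 1.2               # assumed average draw for a dual-GPU box
    electricity_per_kwh = 0.15        # assumed rate

    monthly_power_cost = power_draw_kw * 24 * 30 * electricity_per_kwh   # ~$130/month
    breakeven_mtok = hardware_cost / api_price_per_mtok                  # ~3,300M tokens
    print(f"Need ~{breakeven_mtok:,.0f}M tokens just to cover capex "
          f"(plus ~${monthly_power_cost:,.0f}/month in electricity).")

On those assumptions you'd have to push a few billion tokens through the box before the hardware pays for itself, which is why volume (or a privacy/compliance requirement) is the deciding factor.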