What would the energy use be for an average query when using large models at this speed?
I’ve asked the Cerebras team that question on LinkedIn a couple of times and have never received a response. There are system max TDP values posted online, but I’m not sure you can assume the system runs at max TDP for these queries. If it does, the numbers are quite high (I couldn’t find the figure just now, but I had it in my notes as 23 kW).
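For a rough sense of scale, here is a back-of-envelope sketch of energy per query under the max-TDP assumption. Every number here is an assumption for illustration: the 23 kW figure is the max TDP mentioned above, and the token count and generation speed are placeholder guesses, not measured Cerebras values.

```python
# Back-of-envelope energy-per-query estimate.
# ALL inputs are assumptions, not measurements:
#   system_power_w   - the ~23 kW max-TDP figure from my notes
#   tokens_per_query - illustrative output length
#   tokens_per_s     - illustrative generation speed

system_power_w = 23_000    # assumed system draw at max TDP, watts
tokens_per_query = 1_000   # assumed tokens generated per query
tokens_per_s = 2_000       # assumed tokens generated per second

seconds = tokens_per_query / tokens_per_s  # time spent on the query
energy_j = system_power_w * seconds        # energy = power x time, joules
energy_wh = energy_j / 3_600               # convert joules to watt-hours

print(f"{energy_j:.0f} J per query ({energy_wh:.2f} Wh)")
```

Under these assumptions that works out to about 11.5 kJ (roughly 3 Wh) per query; the real number could be far lower if the system doesn’t sit at max TDP, which is exactly the open question.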
If someone from Cerebras is reading this, feel free to DM me, as optimizing this power is what we do.
Energy is the #1 constraint in new datacenter buildouts. Neuralwatt is reshaping AI compute around energy efficiency to maximize revenue per kilowatt. We’re a VC-backed, early-stage startup building optimization tools for AI, HPC, and datacenter workloads.
We're hiring 2 founding engineers to help architect our core systems and work directly with customers.
What you'll do:
Technically:
- Architect critical datacenter infrastructure
- Write Rust and Python
- Measure real-world energy impact
- Design state-of-the-art AI-led optimizations
Non-technically:
- Help build the business and win customers
- Present at conferences
- Develop marketing and company materials
Requirements:
- 5–10+ years of software development experience
- Thrive in ambiguous, outcome-driven environments
- Experience working closely with customers
- Clear communication and strong leadership
- Familiarity with LLM/AI infrastructure
Location: Remote-first, but we meet regularly in Seattle/Denver metro areas.
To apply: Email scott@neuralwatt.com with the subject "HN Hiring" and include:
- Resume
- GitHub profile
- A short note on why you're interested