I don't think that this is a dupe or anything and 3000 t/s is really cool, the other post just has more discussion of Cerebras and people's experiences with using GLM 4.6 for software development.
It’s an absolute beast. I run it via OpenRouter, where I have Groq and Cerebras as the providers. Cheap enough as to be almost free, strong performance, and lightning fast.
Cheap enough for now, but of all the companies selling inference at a loss, Cerebras and Groq are probably losing the most per token. Their hardware is ungodly expensive and its reliance on huge amounts of SRAM bottlenecks how much cheaper it can get, since SRAM density is improving at a snails pace at this point.
You're pointing out a bunch of high capex costs (hardware, SRAM), but then concluding that their opEx is greater than their revenue on a per unit basis. Are they really losing money on every token? It seems that using hardware acceleration would decrease inference costs and they could make it up on unit economics over time.
But I'm just reasoning from first principles. I don't have any specific data about them.
It's a decent general model too - I have it plugged up in llm and raycast since August at great speeds. I wish Cerebras would do MiniMax M2 which should be an upgrade and replacement if it was just faster. It would never be as fast as gpt-oss-120 though
This is really impressive. At these speeds, it’s possible to run agents with multi-tool turns within seconds. Consider it a feature rich, “non-deterministic API” for your platform or business.
I absolutely hate it, when a website says "try this" and after you went through the trouble of weiting something comes up with a sign up link first. Makes me leave instantly to never come back.
Headline at the top of the Cerebras page linked to by the OP "Cerebras Raises $1.1B Series G at $8.1B Valuation".
If you're going after the AI money gravy train then you need to wave the "we have $n registered users" carrot on your PPT slides for the investors because registered user == monetization opportunity.
I'm not defending it. I hate being forced to register for shit when I just want to try it or use the free tier.
Right, being proud of your money making is not something I consider a consumer focused product unless that customer is other moneyseeking orga, which like cancer, often ends up in a bubble.
This is like declaring that a Ferrari dealership offering you a free test drive in a million dollar art exhibit on wheels is evil for asking for your phone number before handing you the keys.
If this was some beat-to-hell, high-mileage used economy car, sure, that would be a pain in the ass, and not worth it. But it's a mistake to place Cerebras into that mental bucket.
You don't even need to use real information to create an account. Just grab a temp-mail disposable address and sign up as fred flintstone or mickey mouse.
If you're a heavy LLM inference user (i.e. if you've ever paid for a $200/mo sub from any of the big AI labs), I can damn near guarantee you will not regret trying out Cerebras.
A week ago I went to a launch party for a product that's supposed to "revolutionize design" (a web app w/ an OAI prompt).
No demo, only like two pictures of the actual product. Founder spent like half an hour giving a speech about the future, etc...
"All of you here will get access to it in a couple weeks."
Couple weeks go by ... I "get access". It's a .dmg, 1) What, I open it, it's not even an app, it's an installer ..., I install it, the app opens up and it's a giant red button that takes you to a website to create an account ...
I don't think that this is a dupe or anything and 3000 t/s is really cool, the other post just has more discussion of Cerebras and people's experiences with using GLM 4.6 for software development.
But I'm just reasoning from first principles. I don't have any specific data about them.
Deleted Comment
If you're going after the AI money gravy train then you need to wave the "we have $n registered users" carrot on your PPT slides for the investors because registered user == monetization opportunity.
I'm not defending it. I hate being forced to register for shit when I just want to try it or use the free tier.
But it is what it is.
If this was some beat-to-hell, high-mileage used economy car, sure, that would be a pain in the ass, and not worth it. But it's a mistake to place Cerebras into that mental bucket.
You don't even need to use real information to create an account. Just grab a temp-mail disposable address and sign up as fred flintstone or mickey mouse.
If you're a heavy LLM inference user (i.e. if you've ever paid for a $200/mo sub from any of the big AI labs), I can damn near guarantee you will not regret trying out Cerebras.
A week ago I went to a launch party for a product that's supposed to "revolutionize design" (a web app w/ an OAI prompt).
No demo, only like two pictures of the actual product. Founder spent like half an hour giving a speech about the future, etc...
"All of you here will get access to it in a couple weeks."
Couple weeks go by ... I "get access". It's a .dmg, 1) What, I open it, it's not even an app, it's an installer ..., I install it, the app opens up and it's a giant red button that takes you to a website to create an account ...
These guys are completely lost.