The parameter count is more useful and concrete information than anything OpenAI or their competitors have put into the names of their models.
The parameter count gives you a heuristic for estimating if you can run this model on your own hardware, and how capable you might expect it to be compared to the broader spectrum of smaller models.
It also lets you easily distinguish between different sizes of a model that were trained the same way and differ only in parameter count. It’s likely there is a higher parameter count model in the works and this makes it easy to distinguish between the two.
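To make the "can I run this on my own hardware" heuristic above concrete, here is a rough back-of-the-envelope sketch. It only counts the memory needed to hold the weights (parameters times bytes per parameter). The function names and the 1.2x overhead factor for activations and cache are my own assumptions for illustration, not anything from a vendor, and real usage depends on the runtime, context length, and quantization format.

```python
def estimate_weight_memory_gib(params_billions: float, bits_per_param: int = 16) -> float:
    """Approximate memory (GiB) needed just to store the model weights."""
    total_bytes = params_billions * 1e9 * bits_per_param / 8
    return total_bytes / 2**30


def probably_fits(params_billions: float, bits_per_param: int,
                  available_gib: float, overhead: float = 1.2) -> bool:
    """Crude check: weights plus an assumed overhead factor vs. available memory.

    The 1.2 overhead is a guess standing in for activations, KV cache, etc.
    """
    needed = estimate_weight_memory_gib(params_billions, bits_per_param) * overhead
    return needed <= available_gib


if __name__ == "__main__":
    # Compare a 2b and a 14b model at 16-bit and 4-bit on a hypothetical 8 GiB device.
    for size in (2, 14):
        for bits in (16, 4):
            gib = estimate_weight_memory_gib(size, bits)
            print(f"{size}b @ {bits}-bit: ~{gib:.1f} GiB weights, "
                  f"fits in 8 GiB: {probably_fits(size, bits, 8)}")
```

The point is just that the number in the name lets you do this arithmetic at all; a name like "4o" gives you nothing to plug in.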
In this case it looks like this is the higher parameter count version; the 2b was released previously. (Not that it precludes them from making an even larger one in the future, although that seems atypical of video/image/audio models.)
re: GP: I sincerely wish 'Open'AI were this forthcoming with things like param count. If they have a 'b' in their naming, it's only to distinguish it from the previous 'a' version, and don't ask me what an 'o' is supposed to mean.
Also, no human is anywhere close to being as knowledgeable and skilled as LLMs across so many things at once, so it hardly even compares.
You probably want to replace Llama with Qwen in there. And Gemma is not even close.
> Mistral has been consistently last place, or at least last place among ChatGPT, Claude, Llama, and Gemini/Gemma.
Mistral long held the position of "workhorse open-weights base model" and nothing precludes them from taking it again with some smart positioning.
They might not currently be leading a category, but as an outside observer I could see them (like Cohere) actively trying to find innovative business models to survive, reach PMF and keep the dream going, and I find that very laudable. I expect them to experiment a lot during this phase, and that probably means not doubling down on any particular niche until they find a strong signal.
Have you tried the latest, gemma3? I've been pretty impressed with it. Although I do agree that qwen3 quickly overshadowed it, it seems too soon to dismiss it altogether. E.g., the 3~4b and smaller versions of gemma seem to freak out way less frequently than qwen versions of similar param size, though I haven't been able to rule out quant and other factors in this just yet.
It's very difficult to fault anyone for not keeping up with the latest SOTA in this space. The fact we have several options that anyone can serviceably run, even on mobile, is just incredible.
Anyway, I agree that Mistral is worth keeping an eye on. They played a huge part in pushing the other players toward open weights and proving smaller models can have a place at the table. While I personally can't get that excited about a closed model, it's definitely nice to see they haven't tapped out.
People still don’t know how LLMs work and think they can be trained by interacting with them at the API level.
https://www.wikiwand.com/en/articles/Scavengers_Reign
This limited series blew my mind. Total masterpiece.
In favor of integrating fungus with robotics (I think).