The model looks good for an open source model. I want to see how these models are trained. Maybe they start with a base model trained on academic datasets and then quickly fine-tune it on outputs from models like nano banana pro or something? That could be the playbook for such models. Either way, it's great to see an open source model competing with the big players.
This is a really neat project: "an automated, daily evaluation suite to track model performance over time, monitor for regression during peak load periods, and detect quality changes across flagship LLM APIs."