It feels a bit like the original Segway’s over-engineered solution versus cheap Chinese hoverboards, then the scooters and e-bikes that took over afterwards.
Why would I be paying all this money for this realistic telepresence when my shitbox HP laptop from Walmart has a perfectly serviceable webcam?
> The model will respond with a JSON object that strictly follows your schema
Gemini is listed as a model supporting structured output, and yet its fail rate is 0.39% (Gemini 2.0 Flash)!! I get that structured output has a high performance cost but advertising it as supported when in reality it's not is a massive red flag.
Worse yet, response healing only fixes JSON syntax errors, not schema adherence. This is only mentioned at the end of the article, which people are clearly not going to read.
WTF
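To make the syntax-vs-schema distinction concrete, here's a minimal sketch using only Python's stdlib. The schema, field names, and helper functions are all made up for illustration; this is not how any provider's healing or validation actually works, just the two checks side by side.

```python
import json

# Hypothetical schema for illustration: required keys and their exact types.
SCHEMA = {"name": str, "age": int}


def parses_as_json(text):
    """True if the text is syntactically valid JSON (all that syntax repair can promise)."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def matches_schema(text, schema=SCHEMA):
    """True only if the text is valid JSON AND an object with exactly the expected keys and types."""
    try:
        obj = json.loads(text)
    except json.JSONDecodeError:
        return False
    if not isinstance(obj, dict) or set(obj) != set(schema):
        return False
    # Use an exact type check so that a bool is not accepted where an int is expected.
    return all(type(obj[key]) is expected for key, expected in schema.items())


truncated = '{"name": "Ada", "age": 36'                 # syntax error: missing closing brace
wrong_type = '{"name": "Ada", "age": "thirty-six"}'     # valid JSON, violates the schema
good = '{"name": "Ada", "age": 36}'                     # valid JSON, follows the schema

for sample in (truncated, wrong_type, good):
    print(parses_as_json(sample), matches_schema(sample))
```

The second sample is the case that matters: repairing syntax would never touch it because it already parses, yet it still breaks the schema the caller asked for.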
llguidance is used in vLLM, SGLang, internally at OpenAI and elsewhere. At the same time, I also see a non-trivial JSON error rate from Gemini models in large-scale synthetic generations, so perhaps Google hasn't seen the "llight" yet and is using something less principled.
1: https://guidance-ai.github.io/llguidance/llg-go-brrr
2: https://github.com/guidance-ai/llguidance