1. track metrics and have our own dashboards so we can proactively understand and act whenever something like that happens
2. also use these metrics in our routing to know when to scale up. we have tested a lot of variations of all the metrics we gather and things are looking good
anyway, the more workload types we host with this system, the more we learn and the better and more performant it will get. we've been running this for a while now, and it's showing great results.
there's no magic, just data coming from a complex system, fed into a fairly complex system!
hope that answers the question, and thanks for trusting us
Disclaimer: CTO of Vercel here
Disclosure: CEO of Scorecard (AI eval platform), current Vercel customer. Intrigued, since most of our serverless time is spent waiting for model responses, but cautious about 'magic' solutions.
The biggest diffs from Claude Code (the current champion): 1. Generous free tier (60 RPM!) 2. Open source, Apache-licensed (standard after OAI Codex did the same)