Typically speaking an LLM is the code driving the control flow and the MCP servers are kind of dumb API endpoints (find_flights, search_hotels, etc) say for a travel MCP.
With your product, how is the LLM made aware of the underlying data store in a more useful way than “func search(query)”?
It seems to be that if you could expose some precomputed API structure into the MCP for a given data store then the LLM could reason more effectively about the data rather than throwing search queries into the void and hoping for the best?
As a corollary, once you add in self-play with random variation, the synthetic data problem is solved for coding, math, and some classes of scientific reasoning. No more modal collapse, no more massive teams of PhDs needed for human labeling, as long as you have a reliable metric for answer quality.
This isn't just neat, it's important - as we run out of useful human-generated data, RL scaling is the best candidate to take over where pretraining left off.