I kind of am missing the bridge between that, and the fundamental knowledge that everything is token based in and out.
Is it fair to say that the tool abstraction the library provides is essentially some niceties around a prompt, something like "Defined below are certain 'tools' you can use to gather data or perform actions. If you want to use one, please return the tool call you want and its arguments, delimited before and after with '###', and stop. I will invoke the tool call and then reply with the output delimited by '==='".
Basically, telling the model how to use tools, earlier in the context window. I already don't totally understand how a model knows when to stop generating tokens, but presumably those instructions will get it to output the request for a tool call in a certain way and stop. Then the agent harness knows to look for those delimiters, extract the tool call to execute, and append the tool's response to the context so the LLM keeps going.
Is that basically it? Or is there more magic there? Are the tool call instructions in some sort of permanent context, or could the interaction be demonstrated in a fine-tuning step, inferred by the model, and end up just in its weights?
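For what it's worth, the loop you're describing can be sketched in a few lines. Everything here is a made-up illustration (the '###'/'===' delimiters come from your hypothetical prompt, and the JSON call format and function names are my own assumptions, not any particular library's API):

```python
import json

def extract_tool_call(model_output: str):
    """Look for a tool call between '###' delimiters, per the
    hypothetical prompt above. Returns (name, args) or None."""
    start = model_output.find("###")
    if start == -1:
        return None
    end = model_output.find("###", start + 3)
    if end == -1:
        return None
    call = json.loads(model_output[start + 3:end])
    return call["name"], call["args"]

def agent_step(model_output: str, tools: dict):
    """If the model requested a tool, run it and return the text to
    append back into the context; None means the model is done."""
    parsed = extract_tool_call(model_output)
    if parsed is None:
        return None
    name, args = parsed
    result = tools[name](**args)
    return f"==={result}==="

# Toy tool registry and a fake model reply
tools = {"get_weather": lambda city: f"Sunny in {city}"}
reply = 'I will check. ###{"name": "get_weather", "args": {"city": "Oslo"}}###'
print(agent_step(reply, tools))  # → ===Sunny in Oslo===
```

In practice the harness also has to handle the model stopping (via a stop-sequence on the delimiter) and malformed calls, but the shape is the same: scan for delimiters, execute, append, resume generation.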
Structured Output APIs (including the Tool API) take the schema and build a context-free grammar, which is then used during generation to mask which tokens can be output.
I found https://openai.com/index/introducing-structured-outputs-in-t... (have to scroll down a bit to the "under the hood" section) and https://www.leewayhertz.com/structured-outputs-in-llms/#cons... to be pretty good resources
Well they've definitely recovered in NW Wisconsin. They're everywhere, and the males won't even move out of the way of cars.
It's a particularly hard problem to solve - the hobby is usually spread through traditional means (you do it if your parents did it), and going all the way back in certain communities this was the main way to get meat, even before it became regulated. It's difficult to stop something that not only puts food on the table for your family, but has been done that way for generations.
This was one of the main contributors to the decline of the turkey population in the lower 48. In the early 1900s, a lot of folks thought turkeys were extinct because of overhunting and poaching, and the National Wild Turkey Federation led efforts to restore the population for hunting.
If you have plumbing done in different metals (copper, steel, lead, etc.) and any of your pipes touch, you have to perform regular maintenance and apply a dielectric grease (another one of those single-use materials that you have to buy and store away) or galvanic corrosion could eat your pipes and cause a ton of damage.
Thanks for open sourcing and sharing, excited to try this out!!
We think we stand out from our competitors in the space because we built first for the enterprise case, with consideration for things like data governance, acceptable use, data privacy, and information security, in a product that can be deployed easily and reliably in customer-managed environments.
A lot of the products today have similar evaluations and metrics, but they either offer only a SaaS solution or require onerous integration into your application stack.
Because we started with the enterprise first, our goal was to get to value as quickly and as easily as possible (to avoid shoulder-surfing over Zoom calls because we don't have access to the service), and we think this plays out well in our product.
We based our hallucination detection on "groundedness" on a claim-by-claim basis, which evaluates whether each claim in the LLM response is supported by the provided context (e.g. message history, tool calls, retrieved context from a vector DB, etc.).
We split the response into multiple claims, determine whether each claim needs to be evaluated (e.g. isn't just boilerplate), and then check whether the claim is supported by the context.
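The pipeline described above might be sketched like this. To be clear, this is a deliberately dumbed-down stand-in: the sentence splitter and the word-overlap score are my own toy assumptions, whereas a production groundedness check would use a claim-extraction model and an NLI/LLM judge:

```python
import re

def split_claims(response: str):
    """Naively split a response into sentence-level claims; a real
    pipeline would use a claim-extraction model instead."""
    return [s.strip() for s in re.split(r'(?<=[.!?])\s+', response) if s.strip()]

def is_grounded(claim: str, context: str, threshold: float = 0.5) -> bool:
    """Toy groundedness check: fraction of claim words found in the
    context. Real systems use an NLI or LLM judge, not word overlap."""
    words = {w.lower().strip('.,') for w in claim.split()}
    if not words:
        return True
    ctx = context.lower()
    hits = sum(1 for w in words if w in ctx)
    return hits / len(words) >= threshold

# Context = what the LLM actually saw (messages, tool output, retrieval)
context = "The invoice total was $42, due on March 1."
response = "The invoice total is $42. Payment is accepted in Bitcoin."
for claim in split_claims(response):
    label = "grounded" if is_grounded(claim, context) else "unsupported"
    print(f"{claim} -> {label}")
```

Here the first claim overlaps heavily with the context and passes, while the Bitcoin claim has no support and gets flagged, which is the shape of the claim-by-claim evaluation described above.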