Readit News
bionhoward · 2 years ago
Here's the only reason you need to avoid Anthropic entirely, as well as OpenAI, Microsoft, and Google who all have similar customer noncompetes:

> You may not access or use the Services in the following ways:

> ● To develop any products or services that supplant or compete with our Services, including to develop or train any artificial intelligence or machine learning algorithms or models

There is only one viable option in the whole AI industry right now:

Mistral

depr · 2 years ago
Funny how they all used millions (?) of texts without permission to build their models, yet if you want to train your own model based on theirs (which only work because of the texts they used for free), that is prohibited.
swyx · 2 years ago
hotel california rules
hmry · 2 years ago
I think this is a great idea. May I suggest this for the new VSCode ToS: "You aren't allowed to use our products to write competing text editors". Maybe ban researching competing browser development using Chrome. The future sure is exciting.
imranq · 2 years ago
I think 99% of users aren't trying to train their own LLM with their data
nmcfarl · 2 years ago
However, anyone using Claude to generate code is 'supplanting' OpenAI's Code Interpreter mode (at the very least if it's Python). So once Code Interpreter-like functionality gets into Claude, that whole use case violates the ToS.
kristjansson · 2 years ago
Reminder that OpenAI's terms are much more reasonable:

> (e) use Output (as defined below) to develop any artificial intelligence models that compete with our products and services. However, you can use Output to (i) develop artificial intelligence models primarily intended to categorize, classify, or organize data (e.g., embeddings or classifiers), as long as such models are not distributed or made commercially available to third parties and (ii) fine tune models provided as part of our Services;

bionhoward · 2 years ago
Where do you see that? I only see “e” and no “however”:

> For example, you may not:

> Use Output to develop models that compete with OpenAI.

That’s even less reasonable than Anthropic because “develop models that compete” is vague

Y_Y · 2 years ago
What about Meta or H20?
dartos · 2 years ago
Never heard of H2O, but Llama has a restrictive license. Granted, it's like "as long as you have fewer than 700M monthly active users" or something crazy like that.

It's a "you can use this as long as you're not a threat and/or an acquisition target" type of license.

ametrau · 2 years ago
Is that legally enforceable?
hubraumhugo · 2 years ago
> All models can handle correctly choosing a tool from 250+ tools, provided the user query contains all necessary parameters for the intended tool, with >90% accuracy.

This is pretty exciting news for everybody working with agentic systems. OpenAI has way lower recall.

I'm now migrating from GPT function calls to Claude tools and will report back on the evaluation results.

mmoustafa · 2 years ago
Claude's [new] tool usage is pretty good. Unlike GPT-4, where I had to really minimize the context and descriptions for each tool, Claude Opus does better when provided more detail and context for each tool; it's much more nuanced.

I'm now using it with 9 different tools for https://olly.bot and it hits the nail on the head about 8/10 times. Anthropic says it can handle 250+ tools with 90% accuracy [1], but anecdotally from my production usage in the last 24 hours that seems a little too optimistic.

Annnd, it also comes with a few idiosyncrasies, like sometimes spitting out <thinking> or <answer> blocks, and it has more constraints on the messages field, so don't expect a drop-in replacement for OpenAI.

[1] https://docs.anthropic.com/claude/docs/tool-use
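For anyone sizing up that migration, a rough sketch of how the same tool is declared for each API (field names per the two vendors' public docs; request plumbing and response handling omitted):

```typescript
// JSON Schema for the tool's parameters -- shared between both vendors
const paramsSchema = {
  type: "object",
  properties: { city: { type: "string", description: "City name" } },
  required: ["city"],
};

// OpenAI: tools are wrapped in a {type: "function", function: {...}} envelope,
// and the schema field is called "parameters"
const openaiTool = {
  type: "function",
  function: {
    name: "get_weather",
    description: "Look up current weather",
    parameters: paramsSchema,
  },
};

// Anthropic: tools are flat objects and the schema field is "input_schema"
const anthropicTool = {
  name: "get_weather",
  description: "Look up current weather",
  input_schema: paramsSchema,
};

// A small adapter makes migrating tool definitions mechanical
function openaiToAnthropic(t: typeof openaiTool) {
  return {
    name: t.function.name,
    description: t.function.description,
    input_schema: t.function.parameters,
  };
}
```

The tool definitions translate cleanly; it's the message-shape constraints and the occasional stray <thinking> block that need bespoke handling.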

cpursley · 2 years ago
Olly is really neat, I just set up a chat with it. How did you architect the web search (tools?) if you don't mind sharing?
iAkashPaul · 2 years ago
You should try the new HF TGI server; it has both grammar and tool support now. Works fabulously with Mistral Instruct & Mixtral Instruct.
Takennickname · 2 years ago
What's grammar support?
vorticalbox · 2 years ago
I thought this too: it will usually pick something listed first rather than a more suitable tool further down the list.

Sometimes it will outright state it can't do something, and then, after I say "use the browse_website tool", it will magically remember it has the tool.

danenania · 2 years ago
I'm looking forward to trying this out with Plandex[1] (a terminal-based AI coding tool I recently launched that can build large features).

Plandex does rely on OpenAI's streaming function calls for its build progress indicators, so the lack of streaming is a bit unfortunate. But great to hear that it will be included in GA.

I've been getting a lot of requests to support Claude, as well as open source models. A humble suggestion for folks working on models: focus on full compatibility with the OpenAI API as soon as you can, including function calls and streaming function calls. Full support for function calls is crucial for building advanced functionality.

1 - https://github.com/plandex-ai/plandex
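For anyone building that compatibility layer, the streaming side of function calls mostly comes down to folding argument fragments back together. A minimal sketch, assuming a simplified version of OpenAI's streamed tool-call delta shape (the write_file tool and the field subset here are illustrative):

```typescript
// Simplified shape of streamed tool-call fragments (real chunks carry more fields)
interface ToolCallDelta {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
}

interface ToolCall { id: string; name: string; arguments: string }

// Fold a stream of deltas into complete tool calls. The arguments arrive as
// string fragments of JSON that only parse once the stream finishes.
function accumulate(deltas: ToolCallDelta[]): ToolCall[] {
  const calls: ToolCall[] = [];
  for (const d of deltas) {
    const c = (calls[d.index] ??= { id: "", name: "", arguments: "" });
    if (d.id) c.id = d.id;
    if (d.function?.name) c.name += d.function.name;
    if (d.function?.arguments) c.arguments += d.function.arguments;
  }
  return calls;
}

// Example: three chunks for a single call to a hypothetical "write_file" tool
const chunks: ToolCallDelta[] = [
  { index: 0, id: "call_1", function: { name: "write_file", arguments: "" } },
  { index: 0, function: { arguments: '{"path": "a.ts",' } },
  { index: 0, function: { arguments: ' "content": "hi"}' } },
];
const [call] = accumulate(chunks);
// call.name === "write_file"; JSON.parse(call.arguments).path === "a.ts"
```

The partial-JSON fragments are also what makes mid-stream progress indicators possible: you can count keys as they appear without waiting for the full arguments to parse.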

oezi · 2 years ago
I hope they put a bit more effort into this compared to OpenAI.

The most crucial things missing in OpenAI's implementation for me were:

- Authentication for the API by the user rather than the developer.

- Caching/retries/timeout control

- Way to run the API non-blocking in the background and incorporate results later.

- Dynamic API tools (use an API to provide the tools for a conversation) and API revisions (for instance by hosting the API spec under a URL/git).

paulgb · 2 years ago
For authentication, since the tool call itself actually runs on your own server, can’t you just look at who the authed user is that made the request?
oezi · 2 years ago
OpenAI doesn't give you a way to identify the user.

And even if they did, it would be poor UX to require the user to visit our site first to connect their API accounts.

I also imagine many tools wouldn't run under the developers' control (of course you could relay over your server).
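In the relay case, where your own backend forwards the model's tool calls, the lookup can indeed be a plain session check. A minimal sketch with hypothetical names (userTokens and runToolCall are illustrative, not part of any SDK):

```typescript
// Hypothetical per-user credential store -- in practice a database keyed by
// your app's own session, not by anything the LLM provider sends you
const userTokens = new Map<string, string>([["alice", "jira-token-abc"]]);

interface ToolCall { name: string; args: Record<string, unknown> }

// Because the tool call executes on your server, you can resolve the authed
// user from your own session before touching any external API.
function runToolCall(sessionUserId: string, call: ToolCall): string {
  const token = userTokens.get(sessionUserId);
  if (!token) throw new Error(`no credentials connected for ${sessionUserId}`);
  // ...would call the external API with `token` here; stubbed out
  return `${call.name} executed as ${sessionUserId}`;
}
```

This only works when you own the front end, which is exactly the gap being described for hosted GPTs.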

TZubiri · 2 years ago
Bro, you're given state-of-the-art, multi-million-dollar compute for a couple of cents, and you complain about it not being spoonfed to you.

You have an HTTP API; implement all of this yourself. The devs can't read your mind.

You should be able to issue a request and do stuff before reading the response: boom, non-blocking. If you can't handle the low level, just use threads plus your favourite abstraction.

User API auth: I've never seen this from an API provider. You are in charge of user auth. What do you even expect here?

Do your job. OpenAI isn't supposed to magically solve this for you; you are not a consumer of magical solutions, you are now a provider of them.
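On the "issue a request and do stuff before reading the response" point, in the request/response case that really is just deferring the await. A tiny sketch (slowToolCall is a stand-in for the real API call):

```typescript
// Stand-in for a slow tool/LLM call -- hypothetical, resolves after a short delay
async function slowToolCall(name: string): Promise<string> {
  await new Promise((resolve) => setTimeout(resolve, 10));
  return `${name}: done`;
}

async function main(): Promise<string[]> {
  // Kick off the call but don't await it yet...
  const pending = slowToolCall("browse_website");
  // ...do other work while it runs (e.g. keep the conversation going)...
  const otherWork = "handled user message in the meantime";
  // ...then incorporate the result once it's ready
  const result = await pending;
  return [otherWork, result];
}
```

(As the reply below notes, this covers the client side only; it doesn't help when the provider's own hosted loop is what blocks the conversation.)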

oezi · 2 years ago
OpenAI isn't offering a viable product as it currently stands. This is why we only saw toy usage with the Plugins API and now with tools as part of GPTs. Since OpenAI wants to own the front end of the GPTs there isn't any way to implement the parts which aren't there.

About non-blocking: I am asking for their tools API to not block the user from continuing the conversation while my tool works. You seem to be thinking about something else.

skywhopper · 2 years ago
I agree so much but the last line struck me as hilarious given that 90% of the hype around LLM-based AI is explicitly that people do believe it’s magical. People already believe this tech is on the verge of replacing doctors, programmers, writers, actors, accountants, and lawyers. Why shouldn’t they expect the boring stuff like auth pass-thru to be pre-solved? Surely the AI companies can just have their LLM generate the required code, right?
rcarmo · 2 years ago
I do hope we converge on a standardized API and schema for this. Testing and integrating multiple LLMs is tiresome with all the silly little variations in API and prompt formatting.
habosa · 2 years ago
OpenRouter is a great step in that direction: https://openrouter.ai/

ilaksh · 2 years ago
It looks very similar, if not identical, to OpenAI's?
sdeep27 · 2 years ago
Check out LiteLLM... I've been using it in (lite) production and it makes it easy to switch between models with a standardized API.
TZubiri · 2 years ago
Langchain.

But it's too bleeding edge; you are asking a lot.

Just do the work and don't be spoiled senseless.

rcarmo · 2 years ago
Langchain, for all its popularity, is some of the worst, most brittle Python code I’ve ever seen or tried to use, so I’d prefer to have things sorted out for me at the API level.
campers · 2 years ago
I'm not sure if I'll migrate my existing function-calling code to this... I've been using a hand-rolled, cross-platform way of calling functions for hard-coded workflows and autonomous agents across GPT, Claude, and Gemini. It works with any sufficiently capable LLM, and with a much more pleasant, ergonomic programming model that doesn't require defining the function definition separately from the implementation.

Before Devin was released I started building an AI software engineer after reading Google's "Self-Discover Reasoning Structures" paper. I was always put off by the LangChain API, so I decided to quickly build a simple API that fit my design style. Once a repo is checked out and it's decided which files to edit, I delegate the code-editing step to Aider. The runAgent loop updates the system prompt with the tool definitions, which are auto-generated. The available tools can be updated at runtime. The system prompt tells the agents to respond in a particular format, which is parsed for the next function call. The code ends up looking like:

  export async function main() {
    initWorkflowContext(workflowLLMs);

    const systemPrompt = readFileSync('ai-system', 'utf-8');
    const userPrompt = readFileSync('ai-in', 'utf-8'); // 'Complete the JIRA issue: ABC-123'

    const tools = new Toolbox();
    tools.addTool('Jira', new Jira());
    tools.addTool('GoogleCloud', new GoogleCloud());
    tools.addTool('UtilFunctions', new UtilFunctions());
    tools.addTool('FileSystem', getFileSystem());
    tools.addTool('GitLabServer', new GitLabServer());
    tools.addTool('CodeEditor', new CodeEditor());
    tools.addTool('TypescriptTools', new TypescriptTools());

    await runAgent(tools, userPrompt, systemPrompt);
  }



  @funcClass(__filename)
  export class Jira {
    // Axios client for the JIRA REST API (initialization/config omitted)
    private instance: AxiosInstance;

    /**
     * Gets the description of a JIRA issue
     * @param {string} issueId the issue id (e.g. XYZ-123)
     * @returns {Promise<string>} the issue description
     */
    @func
    @cacheRetry({ scope: 'global', ttlSeconds: 60 * 10, retryable: isAxiosErrorRetryable })
    async getJiraDescription(issueId: string): Promise<string> {
      const response = await this.instance.get(`/issue/${issueId}`);
      return response.data.fields.description;
    }
  }
New tools/functions can be added simply by adding the @func decorator to a class method. The coding use case is just the beginning of what it could be used for.

I'm busy finishing up a few pieces and then I'll put it out as open source shortly!
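The "respond in a particular format which is parsed" step is also what makes this vendor-independent: it doesn't rely on any native function-calling support. The exact format above isn't specified, so here's a hypothetical sketch of what the parsing half could look like, assuming the system prompt asks for an XML-style block:

```typescript
// Assumed response convention (hypothetical -- the parent's actual format may differ):
//   <function_call>{"tool": "Jira.getJiraDescription", "args": {"issueId": "ABC-123"}}</function_call>
interface ParsedCall { tool: string; args: Record<string, unknown> }

function parseFunctionCall(response: string): ParsedCall | null {
  const m = response.match(/<function_call>([\s\S]*?)<\/function_call>/);
  if (!m) return null; // model answered in prose; no tool call this turn
  return JSON.parse(m[1]) as ParsedCall;
}

const reply =
  'I will fetch the issue.\n' +
  '<function_call>{"tool": "Jira.getJiraDescription", "args": {"issueId": "ABC-123"}}</function_call>';
const call = parseFunctionCall(reply);
```

The runAgent loop would then dispatch `call.tool` against the Toolbox registry and feed the result back into the conversation.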

fluffet · 2 years ago
That's awesome man. I'm also a little bit allergic to Langchain. Any way to help out? How can I find this when it's open source?
campers · 2 years ago
I've added contact details to my profile for the moment, drop me an email
zby · 2 years ago
I have a library with similar api but in python: https://github.com/zby/LLMEasyTools. Even the names match.
campers · 2 years ago
That looks like a nice, concise API too. Naming is always tricky: I like the Toolbox name, but then should I rename the @func decorator to @tool? "Function" seems to be the more common name for it, though it collides with the JavaScript function keyword.
joskanius · 2 years ago
Excellent! Looking forward to playing with it.
bonko · 2 years ago
Love your approach! Can't wait to try this out.
linkedinviewer3 · 2 years ago
This is cool
rpigab · 2 years ago
I've set it up this way: I've told Claude that whenever he doesn't know how to answer, he can ask ChatGPT instead. I've set up ChatGPT the same way: he can ask Claude if needed.

Now they always find an answer. Problem solved.

danenania · 2 years ago
That's fun. How many times will they go back and forth? Do you ever get infinite loops?
mercurialsolo · 2 years ago
By the looks of it, soon we'll need resumes and work profiles for the tools and APIs consumed by LLMs
htrp · 2 years ago
Welcome to virtual employees, complete with virtual HR for hiring