Usage patterns for tool use (function calling)
The tool use feature of the Chat endpoint comes with a set of capabilities that enable developers to implement a variety of tool use scenarios. This section describes the different patterns of tool use implementation supported by these capabilities. Each pattern can be implemented on its own or in combination with the others.
Setup
First, import the Cohere library and create a client.
Cohere platform
Private deployment
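A minimal setup sketch, assuming the Cohere Python SDK's v2 client (the API key and the private-deployment URL are placeholders):

```python
import cohere

# Cohere platform: authenticate with an API key (placeholder shown).
co = cohere.ClientV2(api_key="YOUR_API_KEY")

# Private deployment (assumed setup): point the client at your deployment's
# URL instead, and adjust authentication to your deployment's scheme.
# co = cohere.ClientV2(api_key="", base_url="<YOUR_DEPLOYMENT_URL>")
```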
We’ll use the same `get_weather` tool as in the previous example.
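As a reminder, here is a minimal sketch of that tool: a stub that returns a hardcoded temperature (a real implementation would call a weather API), plus a tool schema in the JSON-schema shape the Chat endpoint expects.

```python
def get_weather(location: str) -> dict:
    # Stub for illustration: a real implementation would query a weather API.
    return {"temperature": "20°C"}

# Maps a tool name to the function that implements it.
functions_map = {"get_weather": get_weather}

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Gets the weather of a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The location to get the weather for, e.g. Toronto",
                    }
                },
                "required": ["location"],
            },
        },
    }
]
```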
Parallel tool calling
The model can determine that more than one tool call is required and generate multiple tool calls in parallel. These can be calls to the same tool multiple times or to different tools, for any number of calls.
In the example below, the user asks for the weather in Toronto and New York. This requires calling the `get_weather` function twice, once for each location. This is reflected in the model’s response, where two parallel tool calls are generated.
Example response:
State management
When tools are called in parallel, we append to the messages list one single `assistant` message containing all the tool calls, followed by one `tool` message for each tool call.
The sequence of messages is represented in the diagram below.
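A sketch of that sequence using plain dicts (the field names follow a typical tool-calling message format, and the tool call IDs and argument values are illustrative):

```python
messages = [
    {"role": "user", "content": "What's the weather in Toronto and New York?"},
    # One single assistant message carrying BOTH tool calls.
    {
        "role": "assistant",
        "tool_calls": [
            {
                "id": "tc_0",
                "type": "function",
                "function": {"name": "get_weather", "arguments": '{"location": "Toronto"}'},
            },
            {
                "id": "tc_1",
                "type": "function",
                "function": {"name": "get_weather", "arguments": '{"location": "New York"}'},
            },
        ],
    },
    # One tool message per tool call, linked back by tool_call_id.
    {"role": "tool", "tool_call_id": "tc_0", "content": '{"temperature": "20°C"}'},
    {"role": "tool", "tool_call_id": "tc_1", "content": '{"temperature": "25°C"}'},
]
```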
Directly answering
A key attribute of tool use systems is the model’s ability to choose the right tools for a task. This includes the model’s ability to decide to not use any tool, and instead, respond to a user message directly.
In the example below, the user asks a simple arithmetic question. The model determines that it does not need to use any of the available tools (only one, `get_weather`, in this case) and instead answers the user directly.
Example response:
State management
When the model opts to respond directly to the user, there will be no items 2 and 3 above (the tool calling and tool response messages). Instead, the final `assistant` message will contain the model’s direct response to the user.
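So for a direct answer, the sequence reduces to a user message followed by a plain `assistant` message (sketched with plain dicts; the content values are illustrative):

```python
messages = [
    {"role": "user", "content": "What's 2 + 2?"},
    # No tool_calls field and no tool messages: the assistant answers directly.
    {"role": "assistant", "content": "2 + 2 equals 4."},
]
```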
Note: you can force the model to directly answer every time using the `tool_choice` parameter, described here.
Multi-step tool use
The Chat endpoint supports multi-step tool use, which enables the model to perform sequential reasoning. This is especially useful in agentic workflows that require multiple steps to complete a task.
As an example, suppose a tool use application has access to a web search tool. Given the question “What was the revenue of the most valuable company in the US in 2023?”, it will need to perform a series of steps in a specific order:
- Identify the most valuable company in the US in 2023
- Then, once the company has been identified, retrieve its revenue figure
To illustrate this, let’s start with the same weather example and add another tool called `get_capital_city`, which returns the capital city of a given country.
Here are the function definitions for the tools:
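A minimal sketch, with both tools returning hardcoded values for illustration (real implementations would call external APIs):

```python
def get_weather(location: str) -> dict:
    # Stub: a real implementation would query a weather API.
    return {"temperature": "28°C"}

def get_capital_city(country: str) -> dict:
    # Stub: a real implementation would query a geography API or lookup table.
    capitals = {"Brazil": "Brasilia", "Canada": "Ottawa"}
    return {"capital_city": capitals.get(country, "unknown")}

# Maps each tool name to the function that implements it.
functions_map = {
    "get_weather": get_weather,
    "get_capital_city": get_capital_city,
}
```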
And here are the corresponding tool schemas:
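Sketched in the same JSON-schema shape as the earlier `get_weather` example (the descriptions are illustrative):

```python
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Gets the weather of a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The location to get the weather for, e.g. Toronto",
                    }
                },
                "required": ["location"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "get_capital_city",
            "description": "Gets the capital city of a given country",
            "parameters": {
                "type": "object",
                "properties": {
                    "country": {
                        "type": "string",
                        "description": "The country to get the capital city of",
                    }
                },
                "required": ["country"],
            },
        },
    },
]
```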
Next, we implement the four-step tool use workflow as described in the previous page.
The key difference here is that the second (tool calling) and third (tool execution) steps are placed in a `while` loop, so this pair of steps can repeat any number of times. The loop stops when the model decides, in the tool calling step, that no more tool calls are needed, which then triggers the fourth step (response generation).
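To illustrate the loop’s control flow without calling the API, here is a sketch in which a scripted `chat` function stands in for the model: it requests the capital city, then the weather, then stops. In a real application the `chat` call would be the SDK’s chat method and the response shapes would come from the SDK; the tool stubs and message dicts below are illustrative.

```python
import json

def get_weather(location: str) -> dict:
    return {"temperature": "28°C"}  # stub

def get_capital_city(country: str) -> dict:
    return {"capital_city": "Brasilia"}  # stub

functions_map = {"get_weather": get_weather, "get_capital_city": get_capital_city}

# Scripted stand-in for the model: first it asks for Brazil's capital,
# then for the weather there, then it answers directly (no more tool calls).
script = [
    [{"name": "get_capital_city", "arguments": {"country": "Brazil"}}],
    [{"name": "get_weather", "arguments": {"location": "Brasilia"}}],
    None,
]

def chat(messages):
    tool_calls = script.pop(0)
    if tool_calls is None:
        return {"role": "assistant", "content": "It is 28°C in Brasilia."}
    return {"role": "assistant", "tool_calls": tool_calls}

messages = [{"role": "user", "content": "What's the temperature in Brazil's capital city?"}]

# Steps 2 (tool calling) and 3 (tool execution) repeat in a while loop
# until the model stops requesting tools.
response = chat(messages)
while response.get("tool_calls"):
    messages.append(response)
    for tc in response["tool_calls"]:
        result = functions_map[tc["name"]](**tc["arguments"])
        messages.append({"role": "tool", "content": json.dumps(result)})
    response = chat(messages)

# Step 4: the final assistant message is the model's answer to the user.
messages.append(response)
```

Note how each iteration appends an `assistant` message (the tool calls) followed by `tool` messages (the results), so the next iteration sees the full history.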
In this example, the user asks for the temperature in Brazil’s capital city.
The model first determines that it needs to find out the capital city of Brazil. Once it has this information, it proceeds with the next step in the sequence, which is to look up the temperature of that city.
This is reflected in the model’s response, where two tool calling-result pairs are generated in a sequence.
Example response:
State management
In a multi-step tool use scenario, instead of just one occurrence of `assistant`-`tool` messages, there will be a sequence of `assistant`-`tool` messages reflecting the multiple steps of tool calling involved.
Forcing tool usage
As shown in the previous examples, during the tool calling step, the model may decide to either:
- make tool call(s)
- or, respond to a user message directly.
You can, however, force the model to choose one of these options. This is done via the `tool_choice` parameter.
- You can force the model to make tool call(s), i.e. to not respond directly, by setting the `tool_choice` parameter to `REQUIRED`.
- Alternatively, you can force the model to respond directly, i.e. to not make tool call(s), by setting the `tool_choice` parameter to `NONE`.
By default, if you don’t specify the `tool_choice` parameter, the model decides whether to make tool calls or respond directly.
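Sketched as the keyword arguments you would pass to the chat call (the model name and messages are placeholders; the two `tool_choice` values are the options described above):

```python
# Arguments for a chat call that forces tool calling
# (pass these to the SDK's chat method, e.g. co.chat(**kwargs_required)).
kwargs_required = {
    "model": "<MODEL_NAME>",  # placeholder
    "messages": [{"role": "user", "content": "What's the weather in Toronto?"}],
    "tools": [],  # your tool schemas go here
    "tool_choice": "REQUIRED",
}

# The same call, but forcing a direct answer instead of tool calls.
kwargs_none = {**kwargs_required, "tool_choice": "NONE"}
```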
State management
Here’s the sequence of messages when `tool_choice` is set to `REQUIRED`.
Here’s the sequence of messages when `tool_choice` is set to `NONE`.
Chatbots (multi-turn)
Building chatbots requires maintaining the memory or state of a conversation over multiple turns. To do this, we can keep appending each turn of a conversation to the `messages` list.
As an example, here’s the `messages` list from the first turn of a conversation.
Then, in the second turn, when provided with a rather vague follow-up user message, the model correctly infers that the context is about the weather.
Example response:
State management
The sequence of messages is represented in the diagram below.
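The pattern above can be sketched with plain dicts: each turn appends the new user message plus everything generated in that turn, so the full history is available to the model (the content values and message shapes are illustrative):

```python
# Turn 1: the user asks about Toronto; the turn produces a tool call,
# a tool result, and a final assistant answer.
messages = [
    {"role": "user", "content": "What's the weather in Toronto?"},
    {
        "role": "assistant",
        "tool_calls": [{"name": "get_weather", "arguments": '{"location": "Toronto"}'}],
    },
    {"role": "tool", "content": '{"temperature": "20°C"}'},
    {"role": "assistant", "content": "It's 20°C in Toronto."},
]

# Turn 2: a vague follow-up. Because the whole history is kept in
# `messages`, the model can infer the context is still the weather.
messages.append({"role": "user", "content": "What about New York?"})
# ...the new turn's assistant/tool messages are appended the same way.
```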