Basic usage of tool use (function calling)

Overview

Tool use is a technique that allows developers to connect Cohere’s Command family of models to external tools such as search engines, APIs, functions, and databases.

This opens up a richer set of behaviors, such as accessing external data sources, taking actions through APIs, interacting with a vector database, and querying a search engine. It is particularly valuable for enterprise developers, since a lot of enterprise data lives in external sources.

The Chat endpoint comes with built-in tool use capabilities such as function calling, multi-step reasoning, and citation generation.

Setup

First, import the Cohere library and create a client.

PYTHON
# ! pip install -U cohere
import cohere

co = cohere.ClientV2(
    "COHERE_API_KEY"
)  # Get your free API key here: https://dashboard.cohere.com/api-keys
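
If you prefer not to hardcode the key, a minimal sketch that reads it from an environment variable instead (the variable name COHERE_API_KEY here is our own choice, not an SDK requirement):

PYTHON
import os

import cohere

# Read the API key from the environment instead of hardcoding it
# (COHERE_API_KEY is an assumed variable name, not mandated by the SDK)
co = cohere.ClientV2(os.environ["COHERE_API_KEY"])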

Tool definition

The prerequisite, or Step 0, before we can run a tool use workflow, is to define the tools. We can break this down into two steps:

  • Creating the tool
  • Defining the tool schema

Creating the tool

A tool can be any function that you create, or any external service that returns an object for a given input. Some examples: a web search engine, an email service, an SQL database, a vector database, a weather data service, a sports data service, or even another LLM.

In this example, we define a get_weather function that returns the temperature for a given location. You can implement any logic here, but to keep the example simple, we hardcode the return value to be the same for all queries.

PYTHON
def get_weather(location):
    # Implement any logic here
    return [{"temperature": "20°C"}]
    # Return a JSON object string, or a list of tool content blocks e.g. [{"url": "abc.com", "text": "..."}, {"url": "xyz.com", "text": "..."}]


functions_map = {"get_weather": get_weather}

The Chat endpoint accepts tool results as either a string or a list of objects, so you should format your tool’s return value accordingly. The following are some examples.

PYTHON
# Example: String
weather_search_results = "20°C"

# Example: List of objects
weather_search_results = [
    {"city": "Toronto", "date": "250207", "temperature": "20°C"},
    {"city": "Toronto", "date": "250208", "temperature": "21°C"},
]
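
If your tools can return either form, one way to normalize both into the document content blocks used in Step 3 below is sketched here; the helper name to_tool_content is our own, not part of the SDK:

PYTHON
import json


def to_tool_content(tool_result):
    # Wrap a plain string result in a single document content block
    if isinstance(tool_result, str):
        return [{"type": "document", "document": {"data": tool_result}}]
    # Otherwise, assume a list of objects and create one block per object
    return [
        {"type": "document", "document": {"data": json.dumps(data)}}
        for data in tool_result
    ]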

Defining the tool schema

We also need to define the tool schemas in a format that can be passed to the Chat endpoint. The schema follows the JSON Schema specification and must contain the following fields:

  • name: the name of the tool.
  • description: a description of what the tool is and what it is used for.
  • parameters: the parameters that the tool accepts, defined with the following fields:
    • type: the type of the parameters object, which is object.
    • properties: one entry per parameter, keyed by the parameter’s name, each containing the following fields:
      • type: the type of the parameter.
      • description: a description of what the parameter is and what it is used for.
    • required: a list of required parameters by name, which appear as keys in the properties object.

This schema informs the LLM about what the tool does, and the LLM decides whether to use a particular tool based on the information that it contains.

Therefore, the more descriptive and clear the schema, the more likely the LLM is to make the right tool call decisions.

In a typical development cycle, some fields, such as name, description, and properties, will likely require a few rounds of iteration to get the best results (a similar approach to prompt engineering).

PYTHON
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "gets the weather of a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "the location to get the weather, example: San Francisco.",
                    }
                },
                "required": ["location"],
            },
        },
    },
]

The endpoint supports a subset of the JSON Schema specification. Refer to the Structured Outputs documentation for the list of supported and unsupported parameters.
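
For illustration, here is what a schema with both a required and an optional parameter might look like; the get_forecast tool and its date field are hypothetical and not part of this example’s workflow:

PYTHON
tools_example = [
    {
        "type": "function",
        "function": {
            "name": "get_forecast",  # hypothetical tool, for illustration only
            "description": "gets the weather forecast for a location on a given date",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "the location to get the forecast for, example: San Francisco.",
                    },
                    "date": {
                        "type": "string",
                        "description": "the date of the forecast in YYYY-MM-DD format; defaults to today if omitted.",
                    },
                },
                "required": ["location"],  # "date" is optional
            },
        },
    },
]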

Tool use workflow

We can think of a tool use system as consisting of four components:

  • The user
  • The application
  • The LLM
  • The tools

At its most basic, these four components interact in a workflow through four steps:

  • Step 1: Get user message: The LLM gets the user message (via the application).
  • Step 2: Generate tool calls: The LLM decides which tools to call (if any) and generates the tool calls.
  • Step 3: Get tool results: The application executes the tools, and the results are sent to the LLM.
  • Step 4: Generate response and citations: The LLM generates the response and citations back to the user.

As an example, a weather search workflow might look like the following:

  • Step 1: Get user message: A user asks, “What’s the weather in Toronto?”
  • Step 2: Generate tool calls: A tool call is made to an external weather service with something like get_weather("toronto").
  • Step 3: Get tool results: The weather service returns the results, e.g. "20°C".
  • Step 4: Generate response and citations: The model provides the answer, “The weather in Toronto is 20 degrees Celsius”.

The following sections go through the implementation of these steps in detail.

Step 1: Get user message

In the first step, we get the user’s message and append it to the messages list with the role set to user.

PYTHON
messages = [
    {"role": "user", "content": "What's the weather in Toronto?"}
]

Optional: If you want to define a system message, you can add it to the messages list with the role set to system.

PYTHON
system_message = """## Task & Context
You help people answer their questions and other requests interactively. You will be asked a very wide array of requests on all kinds of topics. You will be equipped with a wide range of search engines or similar tools to help you, which you use to research your answer. You should focus on serving the user's needs as best you can, which will be wide-ranging.

## Style Guide
Unless the user asks for a different style of answer, you should answer in full sentences, using proper grammar and spelling.
"""

messages = [
    {"role": "system", "content": system_message},
    {"role": "user", "content": "What's the weather in Toronto?"},
]

Step 2: Generate tool calls

Next, we call the Chat endpoint to generate the list of tool calls. This is done by passing the parameters model, messages, and tools to the Chat endpoint.

If the model determines that tools are required, the endpoint sends back a list of tool calls to be made, returning two types of information:

  • tool_plan: its reflection on the next steps it should take, given the user query.
  • tool_calls: a list of tool calls to be made (if any), together with auto-generated tool call IDs. Each generated tool call contains:
    • id: the tool call ID
    • type: the type of the tool call (function)
    • function: the function to be called, which contains the function’s name and arguments to be passed to the function.

We then append these to the messages list with the role set to assistant.

PYTHON
response = co.chat(
    model="command-r-plus-08-2024", messages=messages, tools=tools
)

if response.message.tool_calls:
    messages.append(
        {
            "role": "assistant",
            "tool_plan": response.message.tool_plan,
            "tool_calls": response.message.tool_calls,
        }
    )
    print(response.message.tool_plan, "\n")
    print(response.message.tool_calls)

Example response:

I will search for the weather in Toronto.

[
    ToolCallV2(
        id="get_weather_1byjy32y4hvq",
        type="function",
        function=ToolCallV2Function(
            name="get_weather", arguments='{"location":"Toronto"}'
        ),
    )
]

By default, when using the Python SDK, the endpoint returns the tool calls as objects of type ToolCallV2 and ToolCallV2Function. These give you built-in type safety and validation that help prevent common errors during development.

Alternatively, you can use plain dictionaries to structure the tool call message.

These two options are shown below.

PYTHON
from cohere import ToolCallV2, ToolCallV2Function

messages = [
    {
        "role": "user",
        "content": "What's the weather in Madrid and Brasilia?",
    },
    {
        "role": "assistant",
        "tool_plan": "I will search for the weather in Madrid and Brasilia.",
        "tool_calls": [
            ToolCallV2(
                id="get_weather_dkf0akqdazjb",
                type="function",
                function=ToolCallV2Function(
                    name="get_weather",
                    arguments='{"location":"Madrid"}',
                ),
            ),
            ToolCallV2(
                id="get_weather_gh65bt2tcdy1",
                type="function",
                function=ToolCallV2Function(
                    name="get_weather",
                    arguments='{"location":"Brasilia"}',
                ),
            ),
        ],
    },
]
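
The same message built with plain dictionaries mirrors the object fields exactly, with no SDK imports needed:

PYTHON
messages = [
    {
        "role": "user",
        "content": "What's the weather in Madrid and Brasilia?",
    },
    {
        "role": "assistant",
        "tool_plan": "I will search for the weather in Madrid and Brasilia.",
        "tool_calls": [
            {
                "id": "get_weather_dkf0akqdazjb",
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "arguments": '{"location":"Madrid"}',
                },
            },
            {
                "id": "get_weather_gh65bt2tcdy1",
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "arguments": '{"location":"Brasilia"}',
                },
            },
        ],
    },
]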

The model can decide not to make any tool calls and instead respond to the user message directly. This is described here.

The model can also determine that more than one tool call is required, whether that means calling the same tool multiple times or calling different tools, in any combination. This is described here.

Step 3: Get tool results

In this step, we perform the function calling: we invoke the necessary tools based on the tool call payloads returned by the endpoint.

For each tool call, we append a message to the messages list with:

  • the tool_call_id generated in the previous step.
  • the content of each tool result, with the following fields:
    • type, which is document
    • document, containing:
      • data: the contents of the tool result.
      • id (optional): a unique ID for each document, for use in citations; otherwise auto-generated.

PYTHON
import json

if response.message.tool_calls:
    for tc in response.message.tool_calls:
        tool_result = functions_map[tc.function.name](
            **json.loads(tc.function.arguments)
        )
        tool_content = []
        for data in tool_result:
            # Optional: the "document" object can take an "id" field for use in citations, otherwise auto-generated
            tool_content.append(
                {
                    "type": "document",
                    "document": {"data": json.dumps(data)},
                }
            )
        messages.append(
            {
                "role": "tool",
                "tool_call_id": tc.id,
                "content": tool_content,
            }
        )

Step 4: Generate response and citations

By this time, the tool call has already been executed, and the result has been returned to the LLM.

In this step, we call the Chat endpoint to generate the response to the user, again by passing the parameters model, messages (which has now been updated with information from the tool calling and tool execution steps), and tools.

The model generates a response to the user, grounded on the information provided by the tool.

We then append the response to the messages list with the role set to assistant.

PYTHON
response = co.chat(
    model="command-r-plus-08-2024", messages=messages, tools=tools
)

messages.append(
    {"role": "assistant", "content": response.message.content[0].text}
)

print(response.message.content[0].text)

Example response:

It's 20°C in Toronto.

It also generates fine-grained citations, which come out-of-the-box with the Command family of models. Here, we see the model generating a citation for the specific span in its response where it uses the tool result to answer the question.

PYTHON
print(response.message.citations)

Example response:

[Citation(start=5, end=9, text='20°C', sources=[ToolSource(type='tool', id='get_weather_1byjy32y4hvq:0', tool_output={'temperature': '20C'})], type='TEXT_CONTENT')]
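
Since each citation carries start and end offsets into the response text, you can use them to render inline source markers. A minimal sketch, where the insert_citations helper is our own illustration rather than an SDK utility:

PYTHON
def insert_citations(text, citations):
    # Work backwards through the citations so that earlier
    # start/end offsets stay valid as markers are inserted
    for citation in sorted(citations, key=lambda c: c.start, reverse=True):
        source_ids = ", ".join(source.id for source in citation.sources)
        text = text[: citation.end] + f" [{source_ids}]" + text[citation.end :]
    return text


if response.message.citations:
    print(
        insert_citations(
            response.message.content[0].text, response.message.citations
        )
    )

With the example response above, this would print: It's 20°C [get_weather_1byjy32y4hvq:0] in Toronto.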

Above, we assume the model performs tool calling only once (either a single call or parallel calls) and then generates its response. This is not always the case: the model might decide to make a sequence of tool calls in order to answer the user request, in which case steps 2 and 3 will run multiple times in a loop. This is called multi-step tool use and is described here; a minimal sketch of the loop is shown below.
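
Assuming messages already holds the user (and optional system) messages, and reusing the tool execution logic from Step 3, that loop might look like this:

PYTHON
import json

response = co.chat(
    model="command-r-plus-08-2024", messages=messages, tools=tools
)

# Repeat steps 2 and 3 until the model stops requesting tool calls
while response.message.tool_calls:
    # Step 2: append the tool plan and tool calls generated by the model
    messages.append(
        {
            "role": "assistant",
            "tool_plan": response.message.tool_plan,
            "tool_calls": response.message.tool_calls,
        }
    )
    # Step 3: execute each tool call and append the results
    for tc in response.message.tool_calls:
        tool_result = functions_map[tc.function.name](
            **json.loads(tc.function.arguments)
        )
        tool_content = [
            {"type": "document", "document": {"data": json.dumps(data)}}
            for data in tool_result
        ]
        messages.append(
            {"role": "tool", "tool_call_id": tc.id, "content": tool_content}
        )
    response = co.chat(
        model="command-r-plus-08-2024", messages=messages, tools=tools
    )

# Step 4: the final response, generated once no more tool calls are needed
print(response.message.content[0].text)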

State management

This section provides a more detailed look at how the state is managed via the messages list as described in the tool use workflow above.

At each step of the workflow, the endpoint requires that we append specific types of information to the messages list. This is to ensure that the model has the necessary context to generate its response at a given point.

In summary, each single turn of a conversation that involves tool calling consists of:

  1. A user message, containing the user’s message
    • content
  2. An assistant message, containing the tool calling information
    • tool_plan
    • tool_calls
      • id
      • type
      • function (consisting of name and arguments)
  3. A tool message, containing the tool results
    • tool_call_id
    • content containing a list of documents where each document contains the following fields:
      • type
      • document (consisting of data and optionally id)
  4. A final assistant message, containing the model’s response
    • content

These correspond to the four steps described above. The list of messages is shown below.

PYTHON
for message in messages:
    print(message, "\n")

{
    "role": "user",
    "content": "What's the weather in Toronto?"
}

{
    "role": "assistant",
    "tool_plan": "I will search for the weather in Toronto.",
    "tool_calls": [
        ToolCallV2(
            id="get_weather_1byjy32y4hvq",
            type="function",
            function=ToolCallV2Function(
                name="get_weather", arguments='{"location":"Toronto"}'
            ),
        )
    ],
}

{
    "role": "tool",
    "tool_call_id": "get_weather_1byjy32y4hvq",
    "content": [{"type": "document", "document": {"data": '{"temperature": "20C"}'}}],
}

{
    "role": "assistant",
    "content": "It's 20°C in Toronto."
}

Note that this sequence represents a basic usage pattern in tool use. The next page describes how this is adapted for other scenarios.
