Streaming for tool use (function calling)

Overview

To enable response streaming in tool use, use the chat_stream endpoint instead of chat.

You can stream responses in both the tool calling and the response generation steps. This allows your application to receive token streams as the model plans and executes tool calls and finally generates its response.

Events stream

In tool use, the events streamed by the endpoint follow the structure of a basic chat stream but include additional events for tool calling and response generation, along with their associated contents. This section describes the stream of events and their contents.

Tool calling step

Event types

message-start

Same as in a basic chat stream event.

tool-plan-delta

Emitted when the next token of the tool plan is generated.

tool-call-start

Emitted when the model generates a tool call that needs to be acted on. The event's tool_calls field contains the tool name and the tool call ID.

tool-call-delta

Emitted when the next token of the tool call is generated.

tool-call-end

Emitted when the tool call is finished.

When more than one tool call is made (i.e., parallel tool calls), the sequence of tool-call-start, tool-call-delta, and tool-call-end events repeats for each tool call.

message-end

Same as in a basic chat stream event.

Example stream

The following is an example stream in the tool calling step.

# User message
"What's the weather in Madrid and Brasilia?"

# Events stream
type='message-start' id='fba98ad3-e5a1-413c-a8de-84fbf9baabf7' delta=ChatMessageStartEventDelta(message=ChatMessageStartEventDeltaMessage(role='assistant', content=[], tool_plan='', tool_calls=[], citations=[]))
--------------------------------------------------
type='tool-plan-delta' delta=ChatToolPlanDeltaEventDelta(message=ChatToolPlanDeltaEventDeltaMessage(tool_plan='I'))
--------------------------------------------------
type='tool-plan-delta' delta=ChatToolPlanDeltaEventDelta(message=ChatToolPlanDeltaEventDeltaMessage(tool_plan=' will'))
--------------------------------------------------
type='tool-plan-delta' delta=ChatToolPlanDeltaEventDelta(message=ChatToolPlanDeltaEventDeltaMessage(tool_plan=' search'))
--------------------------------------------------
type='tool-plan-delta' delta=ChatToolPlanDeltaEventDelta(message=ChatToolPlanDeltaEventDeltaMessage(tool_plan=' for'))
--------------------------------------------------
type='tool-plan-delta' delta=ChatToolPlanDeltaEventDelta(message=ChatToolPlanDeltaEventDeltaMessage(tool_plan=' the'))
--------------------------------------------------
type='tool-plan-delta' delta=ChatToolPlanDeltaEventDelta(message=ChatToolPlanDeltaEventDeltaMessage(tool_plan=' weather'))
--------------------------------------------------
type='tool-plan-delta' delta=ChatToolPlanDeltaEventDelta(message=ChatToolPlanDeltaEventDeltaMessage(tool_plan=' in'))
--------------------------------------------------
type='tool-plan-delta' delta=ChatToolPlanDeltaEventDelta(message=ChatToolPlanDeltaEventDeltaMessage(tool_plan=' Madrid'))
--------------------------------------------------
type='tool-plan-delta' delta=ChatToolPlanDeltaEventDelta(message=ChatToolPlanDeltaEventDeltaMessage(tool_plan=' and'))
--------------------------------------------------
type='tool-plan-delta' delta=ChatToolPlanDeltaEventDelta(message=ChatToolPlanDeltaEventDeltaMessage(tool_plan=' Brasilia'))
--------------------------------------------------
type='tool-plan-delta' delta=ChatToolPlanDeltaEventDelta(message=ChatToolPlanDeltaEventDeltaMessage(tool_plan='.'))
--------------------------------------------------
type='tool-call-start' index=0 delta=ChatToolCallStartEventDelta(message=ChatToolCallStartEventDeltaMessage(tool_calls=ToolCallV2(id='get_weather_p1t92w7gfgq7', type='function', function=ToolCallV2Function(name='get_weather', arguments=''))))
--------------------------------------------------
type='tool-call-delta' index=0 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='{\n "'))))
--------------------------------------------------
type='tool-call-delta' index=0 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='location'))))
--------------------------------------------------
type='tool-call-delta' index=0 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='":'))))
--------------------------------------------------
type='tool-call-delta' index=0 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments=' "'))))
--------------------------------------------------
type='tool-call-delta' index=0 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='Madrid'))))
--------------------------------------------------
type='tool-call-delta' index=0 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='"'))))
--------------------------------------------------
type='tool-call-delta' index=0 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='\n'))))
--------------------------------------------------
type='tool-call-delta' index=0 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='}'))))
--------------------------------------------------
type='tool-call-end' index=0
--------------------------------------------------
type='tool-call-start' index=1 delta=ChatToolCallStartEventDelta(message=ChatToolCallStartEventDeltaMessage(tool_calls=ToolCallV2(id='get_weather_ay6nmvjgp9vn', type='function', function=ToolCallV2Function(name='get_weather', arguments=''))))
--------------------------------------------------
type='tool-call-delta' index=1 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='{\n "'))))
--------------------------------------------------
type='tool-call-delta' index=1 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='location'))))
--------------------------------------------------
type='tool-call-delta' index=1 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='":'))))
--------------------------------------------------
type='tool-call-delta' index=1 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments=' "'))))
--------------------------------------------------
type='tool-call-delta' index=1 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='Bras'))))
--------------------------------------------------
type='tool-call-delta' index=1 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='ilia'))))
--------------------------------------------------
type='tool-call-delta' index=1 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='"'))))
--------------------------------------------------
type='tool-call-delta' index=1 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='\n'))))
--------------------------------------------------
type='tool-call-delta' index=1 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='}'))))
--------------------------------------------------
type='tool-call-end' index=1
--------------------------------------------------
type='message-end' id=None delta=ChatMessageEndEventDelta(finish_reason='TOOL_CALL', usage=Usage(billed_units=UsageBilledUnits(input_tokens=37.0, output_tokens=28.0, search_units=None, classifications=None), tokens=UsageTokens(input_tokens=913.0, output_tokens=83.0)))
--------------------------------------------------
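
The events in the tool calling step can be folded back into complete tool calls on the client side. The following is a minimal sketch rather than the SDK's own method: collect_tool_calls is a hypothetical helper, and the SimpleNamespace objects are stand-ins that only mimic the attribute layout visible in the example stream (type, index, delta.message.tool_plan, delta.message.tool_calls). The argument fragments for the second call are shortened for brevity.

```python
import json
from types import SimpleNamespace as NS


def collect_tool_calls(events):
    """Fold tool-calling stream events into a tool plan and a list of tool calls.

    Hypothetical helper: it relies only on the attribute layout visible in
    the example stream, not on any documented SDK utility.
    """
    tool_plan = ""
    calls = {}  # event index -> {"id", "name", "arguments"}
    for ev in events:
        if ev.type == "tool-plan-delta":
            tool_plan += ev.delta.message.tool_plan
        elif ev.type == "tool-call-start":
            tc = ev.delta.message.tool_calls
            calls[ev.index] = {
                "id": tc.id,
                "name": tc.function.name,
                "arguments": tc.function.arguments,
            }
        elif ev.type == "tool-call-delta":
            fragment = ev.delta.message.tool_calls.function.arguments
            calls[ev.index]["arguments"] += fragment
        elif ev.type == "tool-call-end":
            # The argument string is complete only now, so parse it here.
            calls[ev.index]["arguments"] = json.loads(calls[ev.index]["arguments"])
    return tool_plan, [calls[i] for i in sorted(calls)]


# Stand-in events for illustration only; these are not Cohere SDK classes.
def plan(text):
    return NS(type="tool-plan-delta", delta=NS(message=NS(tool_plan=text)))


def start(index, call_id, name):
    fn = NS(name=name, arguments="")
    call = NS(id=call_id, function=fn)
    return NS(type="tool-call-start", index=index, delta=NS(message=NS(tool_calls=call)))


def delta(index, fragment):
    call = NS(function=NS(arguments=fragment))
    return NS(type="tool-call-delta", index=index, delta=NS(message=NS(tool_calls=call)))


def end(index):
    return NS(type="tool-call-end", index=index)


events = [
    plan("I"), plan(" will"), plan(" search"),
    start(0, "get_weather_p1t92w7gfgq7", "get_weather"),
    delta(0, '{\n "'), delta(0, "location"), delta(0, '":'), delta(0, ' "'),
    delta(0, "Madrid"), delta(0, '"'), delta(0, "\n"), delta(0, "}"),
    end(0),
    start(1, "get_weather_ay6nmvjgp9vn", "get_weather"),
    delta(1, '{"location": "Bras'), delta(1, 'ilia"}'),
    end(1),
]

tool_plan, tool_calls = collect_tool_calls(events)
print(tool_plan)
for call in tool_calls:
    print(call["id"], call["name"], call["arguments"])
```

Keying the partial calls by the event index keeps parallel tool calls separate, and deferring json.loads until tool-call-end avoids parsing an incomplete argument string.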

Response generation step

Event types

message-start

Same as in a basic chat stream event.

content-start

Same as in a basic chat stream event.

content-delta

Same as in a basic chat stream event.

citation-start

Emitted for every citation generated in the response. This event contains the details of a citation, such as the start and end indices of the text that cites one or more sources, the corresponding text, and the list of sources.

citation-end

Emitted to indicate the end of a citation. If there are multiple citations generated, the events will come as a sequence of citation-start and citation-end pairs.

content-end

Same as in a basic chat stream event.

message-end

Same as in a basic chat stream event.

Example stream

The following is an example stream in the response generation step.

"What's the weather in Madrid and Brasilia?"

type='message-start' id='e8f9afc1-0888-46f0-a9ed-eb0e5a51e17f' delta=ChatMessageStartEventDelta(message=ChatMessageStartEventDeltaMessage(role='assistant', content=[], tool_plan='', tool_calls=[], citations=[]))
--------------------------------------------------
type='content-start' index=0 delta=ChatContentStartEventDelta(message=ChatContentStartEventDeltaMessage(content=ChatContentStartEventDeltaMessageContent(text='', type='text')))
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text='It'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text=' is'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text=' currently'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text=' 2'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text='4'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text='°'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text='C in'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text=' Madrid'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text=' and'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text=' 2'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text='8'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text='°'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text='C in'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text=' Brasilia'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text='.'))) logprobs=None
--------------------------------------------------
type='citation-start' index=0 delta=CitationStartEventDelta(message=CitationStartEventDeltaMessage(citations=Citation(start=16, end=20, text='24°C', sources=[ToolSource(type='tool', id='get_weather_m3kdvxncg1p8:0', tool_output={'temperature': '{"madrid":"24°C"}'})], type='TEXT_CONTENT')))
--------------------------------------------------
type='citation-end' index=0
--------------------------------------------------
type='citation-start' index=1 delta=CitationStartEventDelta(message=CitationStartEventDeltaMessage(citations=Citation(start=35, end=39, text='28°C', sources=[ToolSource(type='tool', id='get_weather_cfwfh3wzkbrs:0', tool_output={'temperature': '{"brasilia":"28°C"}'})], type='TEXT_CONTENT')))
--------------------------------------------------
type='citation-end' index=1
--------------------------------------------------
type='content-end' index=0
--------------------------------------------------
type='message-end' id=None delta=ChatMessageEndEventDelta(finish_reason='COMPLETE', usage=Usage(billed_units=UsageBilledUnits(input_tokens=87.0, output_tokens=19.0, search_units=None, classifications=None), tokens=UsageTokens(input_tokens=1061.0, output_tokens=85.0)))
--------------------------------------------------
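
In the example stream of the response generation step, the start and end values of each citation line up with character offsets into the concatenated content-delta text. The sketch below illustrates this under stated assumptions: collect_response is a hypothetical helper, and the SimpleNamespace events are stand-ins that copy only the fields used here (the text fragments and citation offsets are taken from the example stream; sources are omitted).

```python
from types import SimpleNamespace as NS


def collect_response(events):
    """Concatenate content deltas and gather citations (hypothetical helper)."""
    text = ""
    citations = []
    for ev in events:
        if ev.type == "content-delta":
            text += ev.delta.message.content.text
        elif ev.type == "citation-start":
            citations.append(ev.delta.message.citations)
    return text, citations


# Stand-in events for illustration only; these are not Cohere SDK classes.
def content(fragment):
    msg = NS(content=NS(text=fragment))
    return NS(type="content-delta", index=0, delta=NS(message=msg))


def citation(index, start, end, text):
    msg = NS(citations=NS(start=start, end=end, text=text))
    return NS(type="citation-start", index=index, delta=NS(message=msg))


fragments = [
    "It", " is", " currently", " 2", "4", "°", "C in", " Madrid",
    " and", " 2", "8", "°", "C in", " Brasilia", ".",
]
events = [content(f) for f in fragments]
events += [citation(0, 16, 20, "24°C"), citation(1, 35, 39, "28°C")]

text, citations = collect_response(events)
print(text)
for cit in citations:
    # Slicing the full text with the citation offsets recovers the cited span.
    print(f"[{cit.start}:{cit.end}] {text[cit.start:cit.end]!r} cited as {cit.text!r}")
```

A check like this can be useful when rendering citations as highlights, since a mismatch between the sliced text and the citation's text field would indicate that the response text was altered after streaming.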

Usage example

This section provides an example of handling streamed objects in the tool use response generation step.

Setup

First, import the Cohere library and create a client.

PYTHON
# ! pip install -U cohere
import cohere

co = cohere.ClientV2(
    "COHERE_API_KEY"
)  # Get your free API key here: https://dashboard.cohere.com/api-keys

Tool definition

Next, define the tool and its associated schema.

PYTHON
def get_weather(location):
    temperature = {
        "bern": "22°C",
        "madrid": "24°C",
        "brasilia": "28°C",
    }
    loc = location.lower()
    if loc in temperature:
        return [{"temperature": {loc: temperature[loc]}}]
    return [{"temperature": {loc: "Unknown"}}]


functions_map = {"get_weather": get_weather}

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "gets the weather of a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "the location to get the weather, example: San Francisco.",
                    }
                },
                "required": ["location"],
            },
        },
    }
]

Streaming the response

Before streaming the response, first run through the tool calling and execution steps.

PYTHON
messages = [
    {
        "role": "user",
        "content": "What's the weather in Madrid and Brasilia?",
    }
]

response = co.chat(
    model="command-r-plus-08-2024", messages=messages, tools=tools
)

if response.message.tool_calls:
    messages.append(
        {
            "role": "assistant",
            "tool_plan": response.message.tool_plan,
            "tool_calls": response.message.tool_calls,
        }
    )
    print(response.message.tool_plan, "\n")
    print(response.message.tool_calls)

import json

if response.message.tool_calls:
    for tc in response.message.tool_calls:
        tool_result = functions_map[tc.function.name](
            **json.loads(tc.function.arguments)
        )
        tool_content = []
        for data in tool_result:
            # Optional: the "document" object can take an "id" field for use in citations, otherwise auto-generated
            tool_content.append(
                {
                    "type": "document",
                    "document": {"data": json.dumps(data)},
                }
            )
        messages.append(
            {
                "role": "tool",
                "tool_call_id": tc.id,
                "content": tool_content,
            }
        )

Example response:

I will use the get_weather tool to find the weather in Madrid and Brasilia.

[
    ToolCallV2(
        id="get_weather_15c2p6g19s8f",
        type="function",
        function=ToolCallV2Function(
            name="get_weather", arguments='{"location":"Madrid"}'
        ),
    ),
    ToolCallV2(
        id="get_weather_n01pkywy0p2w",
        type="function",
        function=ToolCallV2Function(
            name="get_weather", arguments='{"location":"Brasilia"}'
        ),
    ),
]
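
The tool execution loop above assumes that every tool call names a known function and carries valid JSON arguments. The following sketch shows one possible way to make that step more defensive. It is an assumption-laden illustration, not part of the Cohere SDK: run_tool_call is a hypothetical helper, get_weather is a shortened copy of the toy tool, and returning the error as the tool's document is a design choice rather than a documented requirement.

```python
import json


def get_weather(location):
    # Shortened copy of the toy tool from the tool definition step.
    temperature = {"madrid": "24°C", "brasilia": "28°C"}
    loc = location.lower()
    return [{"temperature": {loc: temperature.get(loc, "Unknown")}}]


functions_map = {"get_weather": get_weather}


def run_tool_call(call_id, name, arguments_json):
    """Run one tool call and build the matching "tool" message.

    Hypothetical helper: unknown tools, malformed JSON, and wrong arguments
    are reported back as the tool result instead of raising.
    """
    try:
        results = functions_map[name](**json.loads(arguments_json))
    except (KeyError, TypeError, json.JSONDecodeError) as err:
        results = [{"error": f"{type(err).__name__}: {err}"}]
    content = [
        {"type": "document", "document": {"data": json.dumps(r)}}
        for r in results
    ]
    return {"role": "tool", "tool_call_id": call_id, "content": content}


ok = run_tool_call("get_weather_15c2p6g19s8f", "get_weather", '{"location":"Madrid"}')
bad = run_tool_call("hypothetical_call_id", "get_forecast", "{}")

print(ok["content"][0]["document"]["data"])
print(bad["content"][0]["document"]["data"])
```

Reporting the failure in the tool message keeps the conversation going so the model can react to the error, whereas letting the exception propagate would stop the loop before any result is appended to messages.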

Once the tool results have been received, we can stream the response using the chat_stream endpoint.

The events are streamed as chunk objects. In the example below, we pick content-delta to display the text response and citation-start to display the citations.

PYTHON
response = co.chat_stream(
    model="command-r-plus-08-2024", messages=messages, tools=tools
)

response_text = ""
citations = []
for chunk in response:
    if chunk:
        if chunk.type == "content-delta":
            response_text += chunk.delta.message.content.text
            print(chunk.delta.message.content.text, end="")
        if chunk.type == "citation-start":
            citations.append(chunk.delta.message.citations)

for citation in citations:
    print(citation, "\n")

Example response:

It's currently 24°C in Madrid and 28°C in Brasilia.

start=5 end=9 text='24°C' sources=[ToolSource(type='tool', id='get_weather_15c2p6g19s8f:0', tool_output={'temperature': '{"madrid":"24°C"}'})] type='TEXT_CONTENT'

start=24 end=28 text='28°C' sources=[ToolSource(type='tool', id='get_weather_n01pkywy0p2w:0', tool_output={'temperature': '{"brasilia":"28°C"}'})] type='TEXT_CONTENT'