Streaming for tool use (function calling)
Overview
To enable response streaming in tool use, use the chat_stream endpoint instead of chat.
You can stream responses in both the tool calling and the response generation steps. This allows your application to receive token streams as the model plans and executes tool calls and finally generates its response.
Events stream
In tool use, the events streamed by the endpoint follow the structure of a basic chat stream event but include additional events for tool calling and response generation, together with their associated contents. This section describes the stream of events and their contents.
Tool calling step
Event types
message-start
Same as in a basic chat stream event.
tool-plan-delta
Emitted when the next token of the tool plan is generated.
tool-call-start
Emitted when the model generates tool calls that need to be acted on. The event contains a list of tool_calls containing the tool name and the tool call ID.
tool-call-delta
Emitted when the next token of the tool call is generated.
tool-call-end
Emitted when the tool call is finished.
When more than one tool call is made (i.e., parallel tool calls), the sequence of tool-call-start, tool-call-delta, and tool-call-end events repeats for each tool call.
message-end
Same as in a basic chat stream event.
Example stream
The following is an example stream in the tool calling step.
# User message
"What's the weather in Madrid and Brasilia?"

# Events stream
type='message-start' id='fba98ad3-e5a1-413c-a8de-84fbf9baabf7' delta=ChatMessageStartEventDelta(message=ChatMessageStartEventDeltaMessage(role='assistant', content=[], tool_plan='', tool_calls=[], citations=[]))
--------------------------------------------------
type='tool-plan-delta' delta=ChatToolPlanDeltaEventDelta(message=ChatToolPlanDeltaEventDeltaMessage(tool_plan='I'))
--------------------------------------------------
type='tool-plan-delta' delta=ChatToolPlanDeltaEventDelta(message=ChatToolPlanDeltaEventDeltaMessage(tool_plan=' will'))
--------------------------------------------------
type='tool-plan-delta' delta=ChatToolPlanDeltaEventDelta(message=ChatToolPlanDeltaEventDeltaMessage(tool_plan=' search'))
--------------------------------------------------
type='tool-plan-delta' delta=ChatToolPlanDeltaEventDelta(message=ChatToolPlanDeltaEventDeltaMessage(tool_plan=' for'))
--------------------------------------------------
type='tool-plan-delta' delta=ChatToolPlanDeltaEventDelta(message=ChatToolPlanDeltaEventDeltaMessage(tool_plan=' the'))
--------------------------------------------------
type='tool-plan-delta' delta=ChatToolPlanDeltaEventDelta(message=ChatToolPlanDeltaEventDeltaMessage(tool_plan=' weather'))
--------------------------------------------------
type='tool-plan-delta' delta=ChatToolPlanDeltaEventDelta(message=ChatToolPlanDeltaEventDeltaMessage(tool_plan=' in'))
--------------------------------------------------
type='tool-plan-delta' delta=ChatToolPlanDeltaEventDelta(message=ChatToolPlanDeltaEventDeltaMessage(tool_plan=' Madrid'))
--------------------------------------------------
type='tool-plan-delta' delta=ChatToolPlanDeltaEventDelta(message=ChatToolPlanDeltaEventDeltaMessage(tool_plan=' and'))
--------------------------------------------------
type='tool-plan-delta' delta=ChatToolPlanDeltaEventDelta(message=ChatToolPlanDeltaEventDeltaMessage(tool_plan=' Brasilia'))
--------------------------------------------------
type='tool-plan-delta' delta=ChatToolPlanDeltaEventDelta(message=ChatToolPlanDeltaEventDeltaMessage(tool_plan='.'))
--------------------------------------------------
type='tool-call-start' index=0 delta=ChatToolCallStartEventDelta(message=ChatToolCallStartEventDeltaMessage(tool_calls=ToolCallV2(id='get_weather_p1t92w7gfgq7', type='function', function=ToolCallV2Function(name='get_weather', arguments=''))))
--------------------------------------------------
type='tool-call-delta' index=0 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='{\n "'))))
--------------------------------------------------
type='tool-call-delta' index=0 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='location'))))
--------------------------------------------------
type='tool-call-delta' index=0 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='":'))))
--------------------------------------------------
type='tool-call-delta' index=0 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments=' "'))))
--------------------------------------------------
type='tool-call-delta' index=0 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='Madrid'))))
--------------------------------------------------
type='tool-call-delta' index=0 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='"'))))
--------------------------------------------------
type='tool-call-delta' index=0 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='\n'))))
--------------------------------------------------
type='tool-call-delta' index=0 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='}'))))
--------------------------------------------------
type='tool-call-end' index=0
--------------------------------------------------
type='tool-call-start' index=1 delta=ChatToolCallStartEventDelta(message=ChatToolCallStartEventDeltaMessage(tool_calls=ToolCallV2(id='get_weather_ay6nmvjgp9vn', type='function', function=ToolCallV2Function(name='get_weather', arguments=''))))
--------------------------------------------------
type='tool-call-delta' index=1 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='{\n "'))))
--------------------------------------------------
type='tool-call-delta' index=1 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='location'))))
--------------------------------------------------
type='tool-call-delta' index=1 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='":'))))
--------------------------------------------------
type='tool-call-delta' index=1 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments=' "'))))
--------------------------------------------------
type='tool-call-delta' index=1 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='Bras'))))
--------------------------------------------------
type='tool-call-delta' index=1 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='ilia'))))
--------------------------------------------------
type='tool-call-delta' index=1 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='"'))))
--------------------------------------------------
type='tool-call-delta' index=1 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='\n'))))
--------------------------------------------------
type='tool-call-delta' index=1 delta=ChatToolCallDeltaEventDelta(message=ChatToolCallDeltaEventDeltaMessage(tool_calls=ChatToolCallDeltaEventDeltaMessageToolCalls(function=ChatToolCallDeltaEventDeltaMessageToolCallsFunction(arguments='}'))))
--------------------------------------------------
type='tool-call-end' index=1
--------------------------------------------------
type='message-end' id=None delta=ChatMessageEndEventDelta(finish_reason='TOOL_CALL', usage=Usage(billed_units=UsageBilledUnits(input_tokens=37.0, output_tokens=28.0, search_units=None, classifications=None), tokens=UsageTokens(input_tokens=913.0, output_tokens=83.0)))
--------------------------------------------------
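The tool-calling events can be folded back into complete tool calls on the client side. The following is a minimal sketch, not part of the Cohere SDK: `collect_tool_calls` is a hypothetical helper, and the `SimpleNamespace` objects only mimic the attribute layout printed in the example stream (`delta.message.tool_plan`, `delta.message.tool_calls`) so that the snippet runs without an API key. The same function could in principle be given the iterator returned by `co.chat_stream(...)`, assuming the real events expose the same attributes.

```python
import json
from types import SimpleNamespace as NS


def collect_tool_calls(events):
    """Fold tool-calling stream events into (tool_plan, tool_calls).

    Hypothetical helper: assumes each event exposes the attributes
    printed in the example stream.
    """
    tool_plan = ""
    calls = {}  # event index -> partially assembled tool call
    for event in events:
        if event.type == "tool-plan-delta":
            tool_plan += event.delta.message.tool_plan
        elif event.type == "tool-call-start":
            tc = event.delta.message.tool_calls
            calls[event.index] = {
                "id": tc.id,
                "name": tc.function.name,
                "arguments": tc.function.arguments or "",
            }
        elif event.type == "tool-call-delta":
            fragment = event.delta.message.tool_calls.function.arguments
            calls[event.index]["arguments"] += fragment
        elif event.type == "tool-call-end":
            # The argument string is complete only now, so parse it here.
            call = calls[event.index]
            call["arguments"] = json.loads(call["arguments"])
    return tool_plan, [calls[i] for i in sorted(calls)]


def _msg(**fields):
    # Scaffolding for the fake events: builds event.delta.message.<field>.
    return NS(message=NS(**fields))


def _start(index, call_id, name):
    call = NS(id=call_id, function=NS(name=name, arguments=""))
    return NS(type="tool-call-start", index=index, delta=_msg(tool_calls=call))


def _delta(index, fragment):
    call = NS(function=NS(arguments=fragment))
    return NS(type="tool-call-delta", index=index, delta=_msg(tool_calls=call))


# Fake events shaped like (a shortened version of) the example stream.
fake_events = [
    NS(type="tool-plan-delta", delta=_msg(tool_plan="I")),
    NS(type="tool-plan-delta", delta=_msg(tool_plan=" will")),
    _start(0, "get_weather_p1t92w7gfgq7", "get_weather"),
    _delta(0, '{\n "'), _delta(0, "location"), _delta(0, '":'),
    _delta(0, ' "'), _delta(0, "Madrid"), _delta(0, '"'), _delta(0, "\n}"),
    NS(type="tool-call-end", index=0),
    _start(1, "get_weather_ay6nmvjgp9vn", "get_weather"),
    _delta(1, '{"location": "Bras'), _delta(1, 'ilia"}'),
    NS(type="tool-call-end", index=1),
]

plan, tool_calls = collect_tool_calls(fake_events)
print(plan)
for call in tool_calls:
    print(call["id"], call["name"], call["arguments"])
```

Keying the partial calls by the event `index` keeps parallel tool calls separate, and deferring `json.loads` until `tool-call-end` avoids parsing an argument string that is still incomplete.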
Response generation step
Event types
message-start
Same as in a basic chat stream event.
content-start
Same as in a basic chat stream event.
content-delta
Same as in a basic chat stream event.
citation-start
Emitted for every citation generated in the response. This event contains the details of a citation, such as the start and end indices of the text that cites one or more sources, the corresponding text, and the list of sources.
citation-end
Emitted to indicate the end of a citation. If multiple citations are generated, the events arrive as a sequence of citation-start and citation-end pairs.
content-end
Same as in a basic chat stream event.
message-end
Same as in a basic chat stream event.
Example stream
The following is an example stream in the response generation step.
"What's the weather in Madrid and Brasilia?"

type='message-start' id='e8f9afc1-0888-46f0-a9ed-eb0e5a51e17f' delta=ChatMessageStartEventDelta(message=ChatMessageStartEventDeltaMessage(role='assistant', content=[], tool_plan='', tool_calls=[], citations=[]))
--------------------------------------------------
type='content-start' index=0 delta=ChatContentStartEventDelta(message=ChatContentStartEventDeltaMessage(content=ChatContentStartEventDeltaMessageContent(text='', type='text')))
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text='It'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text=' is'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text=' currently'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text=' 2'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text='4'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text='°'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text='C in'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text=' Madrid'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text=' and'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text=' 2'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text='8'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text='°'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text='C in'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text=' Brasilia'))) logprobs=None
--------------------------------------------------
type='content-delta' index=0 delta=ChatContentDeltaEventDelta(message=ChatContentDeltaEventDeltaMessage(content=ChatContentDeltaEventDeltaMessageContent(text='.'))) logprobs=None
--------------------------------------------------
type='citation-start' index=0 delta=CitationStartEventDelta(message=CitationStartEventDeltaMessage(citations=Citation(start=16, end=20, text='24°C', sources=[ToolSource(type='tool', id='get_weather_m3kdvxncg1p8:0', tool_output={'temperature': '{"madrid":"24°C"}'})], type='TEXT_CONTENT')))
--------------------------------------------------
type='citation-end' index=0
--------------------------------------------------
type='citation-start' index=1 delta=CitationStartEventDelta(message=CitationStartEventDeltaMessage(citations=Citation(start=35, end=39, text='28°C', sources=[ToolSource(type='tool', id='get_weather_cfwfh3wzkbrs:0', tool_output={'temperature': '{"brasilia":"28°C"}'})], type='TEXT_CONTENT')))
--------------------------------------------------
type='citation-end' index=1
--------------------------------------------------
type='content-end' index=0
--------------------------------------------------
type='message-end' id=None delta=ChatMessageEndEventDelta(finish_reason='COMPLETE', usage=Usage(billed_units=UsageBilledUnits(input_tokens=87.0, output_tokens=19.0, search_units=None, classifications=None), tokens=UsageTokens(input_tokens=1061.0, output_tokens=85.0)))
--------------------------------------------------
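Because each citation carries start and end offsets into the generated text, a client can check the spans and mark them up once streaming has finished. The sketch below is an illustration under assumptions: `annotate` is a hypothetical helper, citations are represented as plain dictionaries rather than SDK objects, and the offsets are taken to be end-exclusive character indices, which is consistent with the example stream (`start=16, end=20` covers `24°C`).

```python
def annotate(text, citations):
    """Insert [n] markers after each cited span.

    Hypothetical helper; assumes end-exclusive character offsets.
    """
    out = text
    numbered = list(enumerate(citations, start=1))
    # Work right to left so earlier offsets stay valid after each insertion.
    for number, cit in sorted(numbered, key=lambda p: p[1]["start"], reverse=True):
        span = out[cit["start"]:cit["end"]]
        if span != cit["text"]:
            raise ValueError(f"citation {number} does not match the text: {span!r}")
        out = out[:cit["end"]] + f"[{number}]" + out[cit["end"]:]
    return out


# Text and offsets copied from the example stream.
text = "It is currently 24°C in Madrid and 28°C in Brasilia."
citations = [
    {"start": 16, "end": 20, "text": "24°C", "source": "get_weather_m3kdvxncg1p8:0"},
    {"start": 35, "end": 39, "text": "28°C", "source": "get_weather_cfwfh3wzkbrs:0"},
]

print(annotate(text, citations))
# prints: It is currently 24°C[1] in Madrid and 28°C[2] in Brasilia.
```

Comparing each slice against the citation's own `text` field is a cheap sanity check that the accumulated response and the citation offsets refer to the same string.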
Usage example
This section provides an example of handling streamed objects in the tool use response generation step.
Setup
First, import the Cohere library and create a client.
# ! pip install -U cohere
import cohere

co = cohere.ClientV2(
    "COHERE_API_KEY"
)  # Get your free API key here: https://dashboard.cohere.com/api-keys
Tool definition
Next, define the tool and its associated schema.
def get_weather(location):
    temperature = {
        "bern": "22°C",
        "madrid": "24°C",
        "brasilia": "28°C",
    }
    loc = location.lower()
    if loc in temperature:
        return [{"temperature": {loc: temperature[loc]}}]
    return [{"temperature": {loc: "Unknown"}}]


functions_map = {"get_weather": get_weather}

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "gets the weather of a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "the location to get the weather, example: San Francisco.",
                    }
                },
                "required": ["location"],
            },
        },
    }
]
Streaming the response
Before streaming the response, first run through the tool calling and execution steps.
import json

messages = [
    {
        "role": "user",
        "content": "What's the weather in Madrid and Brasilia?",
    }
]

response = co.chat(
    model="command-r-plus-08-2024", messages=messages, tools=tools
)

if response.message.tool_calls:
    messages.append(
        {
            "role": "assistant",
            "tool_plan": response.message.tool_plan,
            "tool_calls": response.message.tool_calls,
        }
    )
    print(response.message.tool_plan, "\n")
    print(response.message.tool_calls)

if response.message.tool_calls:
    for tc in response.message.tool_calls:
        tool_result = functions_map[tc.function.name](
            **json.loads(tc.function.arguments)
        )
        tool_content = []
        for data in tool_result:
            # Optional: the "document" object can take an "id" field for use in citations, otherwise auto-generated
            tool_content.append(
                {
                    "type": "document",
                    "document": {"data": json.dumps(data)},
                }
            )
        messages.append(
            {
                "role": "tool",
                "tool_call_id": tc.id,
                "content": tool_content,
            }
        )
Example response:
I will use the get_weather tool to find the weather in Madrid and Brasilia.

[
    ToolCallV2(
        id="get_weather_15c2p6g19s8f",
        type="function",
        function=ToolCallV2Function(
            name="get_weather", arguments='{"location":"Madrid"}'
        ),
    ),
    ToolCallV2(
        id="get_weather_n01pkywy0p2w",
        type="function",
        function=ToolCallV2Function(
            name="get_weather", arguments='{"location":"Brasilia"}'
        ),
    ),
]
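The tool execution step assumes that every tool call names a known function and carries valid JSON arguments. As one possible hardening, and purely as a sketch rather than an SDK feature, the hypothetical `run_tool_call` helper below turns such failures into a tool result that reports the problem instead of raising. The `"error"` key is an illustrative choice, not a documented convention.

```python
import json


def get_weather(location):
    # Trimmed copy of the tool defined earlier, so the snippet runs on its own.
    temperature = {"madrid": "24°C", "brasilia": "28°C"}
    loc = location.lower()
    return [{"temperature": {loc: temperature.get(loc, "Unknown")}}]


functions_map = {"get_weather": get_weather}


def run_tool_call(name, arguments_json, functions_map):
    """Run one tool call and return a list of "document" content blocks."""
    func = functions_map.get(name)
    if func is None:
        results = [{"error": f"unknown tool: {name}"}]
    else:
        try:
            results = func(**json.loads(arguments_json))
        except (json.JSONDecodeError, TypeError) as exc:
            # Malformed JSON, or arguments that do not fit the function signature.
            results = [{"error": f"bad arguments for {name}: {exc}"}]
    return [
        {"type": "document", "document": {"data": json.dumps(data)}}
        for data in results
    ]


print(run_tool_call("get_weather", '{"location":"Madrid"}', functions_map))
print(run_tool_call("get_stock", "{}", functions_map))
```

Returning the error as tool content keeps the conversation going, so the model can see what went wrong; whether that is preferable to failing fast depends on the application.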
Once the tool results have been received, we can stream the response using the chat_stream endpoint.
The events are streamed as chunk objects. In the example below, we pick content-delta to display the text response and citation-start to display the citations.
response = co.chat_stream(
    model="command-r-plus-08-2024", messages=messages, tools=tools
)

response_text = ""
citations = []
for chunk in response:
    if chunk:
        if chunk.type == "content-delta":
            response_text += chunk.delta.message.content.text
            print(chunk.delta.message.content.text, end="")
        if chunk.type == "citation-start":
            citations.append(chunk.delta.message.citations)

for citation in citations:
    print(citation, "\n")
Example response:
It's currently 24°C in Madrid and 28°C in Brasilia.

start=5 end=9 text='24°C' sources=[ToolSource(type='tool', id='get_weather_15c2p6g19s8f:0', tool_output={'temperature': '{"madrid":"24°C"}'})] type='TEXT_CONTENT'

start=24 end=28 text='28°C' sources=[ToolSource(type='tool', id='get_weather_n01pkywy0p2w:0', tool_output={'temperature': '{"brasilia":"28°C"}'})] type='TEXT_CONTENT'