
Citations for tool use (function calling)


Accessing citations

The Chat endpoint generates fine-grained citations for its tool use response. This capability is included out-of-the-box with the Command family of models.

The following sections describe how to access the citations in both the non-streaming and streaming modes.

Non-streaming

First, define the tool and its associated schema.

PYTHON
# ! pip install -U cohere
import cohere
import json

co = cohere.ClientV2(
    "COHERE_API_KEY"
)  # Get your free API key here: https://dashboard.cohere.com/api-keys
PYTHON
def get_weather(location):
    temperature = {
        "bern": "22°C",
        "madrid": "24°C",
        "brasilia": "28°C",
    }
    loc = location.lower()
    if loc in temperature:
        return [{"temperature": {loc: temperature[loc]}}]
    return [{"temperature": {loc: "Unknown"}}]


functions_map = {"get_weather": get_weather}

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "gets the weather of a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "the location to get the weather, example: San Francisco.",
                    }
                },
                "required": ["location"],
            },
        },
    }
]

Next, run the tool calling and execution steps.

messages = [
    {
        "role": "user",
        "content": "What's the weather in Madrid and Brasilia?",
    }
]

response = co.chat(
    model="command-a-plus-05-2026", messages=messages, tools=tools
)

if response.message.tool_calls:
    messages.append(response.message)

    for tc in response.message.tool_calls:
        tool_result = functions_map[tc.function.name](
            **json.loads(tc.function.arguments)
        )
        tool_content = []
        for data in tool_result:
            tool_content.append(
                {
                    "type": "document",
                    "document": {"data": json.dumps(data)},
                }
            )
        messages.append(
            {
                "role": "tool",
                "tool_call_id": tc.id,
                "content": tool_content,
            }
        )

In the non-streaming mode (using chat to generate the model response), the citations are provided in the message.citations field of the response object.

Each citation object contains:

  • start and end: the start and end indices of the span of text that cites one or more sources
  • text: its corresponding span of text
  • sources: the source(s) that it references

response = co.chat(
    model="command-a-plus-05-2026", messages=messages, tools=tools
)

messages.append(
    {"role": "assistant", "content": response.message.content[0].text}
)

print(response.message.content[0].text)

for citation in response.message.citations:
    print(citation, "\n")

Example response:

It is currently 24°C in Madrid and 28°C in Brasilia.

start=16 end=20 text='24°C' sources=[ToolSource(type='tool', id='get_weather_14brd1n2kfqj:0', tool_output={'temperature': '{"madrid":"24°C"}'})] type='TEXT_CONTENT'

start=35 end=39 text='28°C' sources=[ToolSource(type='tool', id='get_weather_vdr9cvj619fk:0', tool_output={'temperature': '{"brasilia":"28°C"}'})] type='TEXT_CONTENT'
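
Judging by the example response above, start and end behave like Python slice bounds: start is inclusive and end is exclusive. The following is a minimal sketch that checks this against the example values. CitationSpan and span_matches are hypothetical names used for illustration only; they are plain stand-ins and not part of the Cohere SDK.

```python
from dataclasses import dataclass


@dataclass
class CitationSpan:
    """Hypothetical stand-in that holds the citation fields listed above."""

    start: int
    end: int
    text: str
    source_ids: list


def span_matches(response_text, citation):
    # Assumption drawn from the example: start is inclusive, end is exclusive.
    return response_text[citation.start:citation.end] == citation.text


response_text = "It is currently 24°C in Madrid and 28°C in Brasilia."
citations = [
    CitationSpan(16, 20, "24°C", ["get_weather_14brd1n2kfqj:0"]),
    CitationSpan(35, 39, "28°C", ["get_weather_vdr9cvj619fk:0"]),
]

for c in citations:
    # Print the cited span, whether it lines up with the text, and its sources.
    print(c.text, span_matches(response_text, c), c.source_ids)
```

A check like this can be useful before highlighting cited spans in a UI, since a mismatch would indicate that the text was modified after the response was received.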

Streaming

In a streaming scenario (using chat_stream to generate the model response), the citations are provided in the citation-start events.

Each citation object contains the same fields as the non-streaming scenario.

response = co.chat_stream(
    model="command-a-plus-05-2026", messages=messages, tools=tools
)

response_text = ""
citations = []
for chunk in response:
    if chunk:
        if chunk.type == "content-delta":
            response_text += chunk.delta.message.content.text
            print(chunk.delta.message.content.text, end="")
        if chunk.type == "citation-start":
            citations.append(chunk.delta.message.citations)

messages.append({"role": "assistant", "content": response_text})

for citation in citations:
    print(citation, "\n")

Example response:

It is currently 24°C in Madrid and 28°C in Brasilia.

start=16 end=20 text='24°C' sources=[ToolSource(type='tool', id='get_weather_dkf0akqdazjb:0', tool_output={'temperature': '{"madrid":"24°C"}'})] type='TEXT_CONTENT'

start=35 end=39 text='28°C' sources=[ToolSource(type='tool', id='get_weather_gh65bt2tcdy1:0', tool_output={'temperature': '{"brasilia":"28°C"}'})] type='TEXT_CONTENT'

Document ID

When passing the tool results from the tool execution step, you can optionally add custom IDs to the id field in the document object. These IDs will be used by the endpoint as the citation reference.

If you don’t provide the id field, the ID will be auto-generated in the format <tool_call_id>:<auto_generated_id>. Example: get_weather_1byjy32y4hvq:0.
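
An auto-generated ID in this format can be mapped back to the tool call that produced it. Below is a minimal sketch; split_auto_id is a hypothetical helper, and it assumes that the suffix after the last colon is the auto-generated part and that an ID without a colon is a custom one. Custom IDs that themselves contain a colon would need different handling.

```python
def split_auto_id(citation_id):
    """Split '<tool_call_id>:<auto_generated_id>' into its two parts.

    Returns (None, citation_id) when there is no colon, which this sketch
    treats as a custom ID (an assumption, not documented behavior).
    """
    tool_call_id, sep, suffix = citation_id.rpartition(":")
    if not sep:
        return None, citation_id
    return tool_call_id, suffix


print(split_auto_id("get_weather_1byjy32y4hvq:0"))
print(split_auto_id("1"))
```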

Here is an example of using custom IDs. To keep it concise, let’s start with a pre-defined list of messages in which the user query, tool calls, and tool results are already available.

PYTHON
# ! pip install -U cohere
import cohere
import json

co = cohere.ClientV2(
    "COHERE_API_KEY"
)  # Get your free API key here: https://dashboard.cohere.com/api-keys

messages = [
    {
        "role": "user",
        "content": "What's the weather in Madrid and Brasilia?",
    },
    {
        "role": "assistant",
        "tool_plan": "I will search for the weather in Madrid and Brasilia.",
        "tool_calls": [
            {
                "id": "get_weather_dkf0akqdazjb",
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "arguments": '{"location":"Madrid"}',
                },
            },
            {
                "id": "get_weather_gh65bt2tcdy1",
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "arguments": '{"location":"Brasilia"}',
                },
            },
        ],
    },
    {
        "role": "tool",
        "tool_call_id": "get_weather_dkf0akqdazjb",
        "content": [
            {
                "type": "document",
                "document": {
                    "data": '{"temperature": {"madrid": "24°C"}}',
                    "id": "1",
                },
            }
        ],
    },
    {
        "role": "tool",
        "tool_call_id": "get_weather_gh65bt2tcdy1",
        "content": [
            {
                "type": "document",
                "document": {
                    "data": '{"temperature": {"brasilia": "28°C"}}',
                    "id": "2",
                },
            }
        ],
    },
]

When document IDs are provided, the citation will refer to the documents using these IDs.

response = co.chat(
    model="command-a-plus-05-2026", messages=messages, tools=tools
)

print(response.message.content[0].text)

for citation in response.message.citations:
    print(citation, "\n")

Note the id fields in the citations, which refer to the IDs in the document object.

Example response:

It's 24°C in Madrid and 28°C in Brasilia.

start=5 end=9 text='24°C' sources=[ToolSource(type='tool', id='1', tool_output={'temperature': '{"madrid":"24°C"}'})] type='TEXT_CONTENT'

start=24 end=28 text='28°C' sources=[ToolSource(type='tool', id='2', tool_output={'temperature': '{"brasilia":"28°C"}'})] type='TEXT_CONTENT'

In contrast, here’s an example citation when the IDs are not provided.

Example response:

It is currently 24°C in Madrid and 28°C in Brasilia.

start=16 end=20 text='24°C' sources=[ToolSource(type='tool', id='get_weather_dkf0akqdazjb:0', tool_output={'temperature': '{"madrid":"24°C"}'})] type='TEXT_CONTENT'

start=35 end=39 text='28°C' sources=[ToolSource(type='tool', id='get_weather_gh65bt2tcdy1:0', tool_output={'temperature': '{"brasilia":"28°C"}'})] type='TEXT_CONTENT'
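
When tool results are produced at runtime rather than written by hand, the custom IDs can be attached while the results are wrapped into document objects. The following is a minimal sketch of one way to do that. The to_documents helper and the "<prefix>-<index>" ID scheme are assumptions chosen for illustration; any string that is meaningful to your application could serve as the ID.

```python
import json


def to_documents(tool_result, id_prefix):
    """Wrap each tool result item in a document object with a custom ID."""
    documents = []
    for index, data in enumerate(tool_result):
        documents.append(
            {
                "type": "document",
                "document": {
                    # ensure_ascii=False keeps characters such as "°" readable.
                    "data": json.dumps(data, ensure_ascii=False),
                    # Hypothetical ID scheme: "<prefix>-<index>".
                    "id": f"{id_prefix}-{index}",
                },
            }
        )
    return documents


docs = to_documents([{"temperature": {"madrid": "24°C"}}], "madrid")
print(docs[0]["document"]["id"])
print(docs[0]["document"]["data"])
```

The returned list has the same shape as the content field of the tool messages shown earlier in this section, so it could be passed there directly.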

Citation modes

When running tool use in streaming mode, it’s possible to configure how citations are generated and presented. You can choose between fast citations or accurate citations, depending on your latency and precision needs.

Accurate citations

The model produces its answer first, and then, after the entire response is generated, it provides citations that map to specific segments of the response text. This approach may incur slightly higher latency, but it ensures the citation indices are more precisely aligned with the final text segments of the model’s answer.

This is the default option. You can also specify it explicitly by adding the citation_options={"mode": "accurate"} argument to the API call.

Here is an example using the same list of pre-defined messages as above.

With the citation_options mode set to accurate, we get the citations after the entire response is generated.

# ! pip install -U cohere
import cohere
import json

co = cohere.ClientV2(
    "COHERE_API_KEY"
)  # Get your free API key here: https://dashboard.cohere.com/api-keys

response = co.chat_stream(
    model="command-a-plus-05-2026",
    messages=messages,
    tools=tools,
    citation_options={"mode": "accurate"},
)

response_text = ""
citations = []
for chunk in response:
    if chunk:
        if chunk.type == "content-delta":
            response_text += chunk.delta.message.content.text
            print(chunk.delta.message.content.text, end="")
        if chunk.type == "citation-start":
            citations.append(chunk.delta.message.citations)

print("\n")
for citation in citations:
    print(citation, "\n")

Example response:

It is currently 24°C in Madrid and 28°C in Brasilia.

start=16 end=20 text='24°C' sources=[ToolSource(type='tool', id='1', tool_output={'temperature': '{"madrid":"24°C"}'})] type='TEXT_CONTENT'

start=35 end=39 text='28°C' sources=[ToolSource(type='tool', id='2', tool_output={'temperature': '{"brasilia":"28°C"}'})] type='TEXT_CONTENT'
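
Because accurate mode delivers the citations only after the full text is available, inline markers have to be added as a post-processing step. Here is a minimal sketch that uses the end offsets from the example above. The annotate function is a hypothetical helper, and it takes plain (end, source_id) tuples rather than SDK citation objects to stay self-contained.

```python
def annotate(response_text, citations):
    """Insert ' [id]' after each cited span.

    citations is a list of (end, source_id) tuples.
    """
    annotated = response_text
    # Work from the last span backwards so earlier offsets stay valid.
    for end, source_id in sorted(citations, reverse=True):
        annotated = annotated[:end] + f" [{source_id}]" + annotated[end:]
    return annotated


text = "It is currently 24°C in Madrid and 28°C in Brasilia."
print(annotate(text, [(20, "1"), (39, "2")]))
```

Inserting from the end of the string toward the beginning avoids having to shift the remaining offsets after each insertion.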

Fast citations

The model generates citations inline, as the response is being produced. In streaming mode, you will see citations injected at the exact moment the model uses a particular piece of external context. This approach provides immediate traceability at the expense of slightly less precision in citation relevance.

You can specify it by adding the citation_options={"mode": "fast"} argument in the API call.

With the citation_options mode set to fast, we get the citations inline as the model generates the response.

response = co.chat_stream(
    model="command-a-plus-05-2026",
    messages=messages,
    tools=tools,
    citation_options={"mode": "fast"},
)

response_text = ""
for chunk in response:
    if chunk:
        if chunk.type == "content-delta":
            response_text += chunk.delta.message.content.text
            print(chunk.delta.message.content.text, end="")
        if chunk.type == "citation-start":
            print(
                f" [{chunk.delta.message.citations.sources[0].id}]",
                end="",
            )

Example response:

It is currently 24°C [1] in Madrid and 28°C [2] in Brasilia.