Using the Cohere Chat API for Text Generation

The Chat API endpoint is used to generate text with Cohere LLMs. This endpoint facilitates a conversational interface, allowing users to send messages to the model and receive text responses.

PYTHON

import cohere

co = cohere.Client(api_key="<YOUR API KEY>")

response = co.chat(
    model="command-a-03-2025",
    message="Write a title for a blog post about API design. Only output the title text.",
)

print(response.text)
# "The Art of API Design: Crafting Elegant and Powerful Interfaces"

Response Structure

Below is a sample response from the Chat API:

JSON

{
    "text": "The Art of API Design: Crafting Elegant and Powerful Interfaces",
    "generation_id": "dd78b9fe-988b-4c18-9419-8fbdf9968948",
    "chat_history": [
        {
            "role": "USER",
            "message": "Write a title for a blog post about API design. Only output the title text."
        },
        {
            "role": "CHATBOT",
            "message": "The Art of API Design: Crafting Elegant and Powerful Interfaces"
        }
    ],
    "finish_reason": "COMPLETE",
    "meta": {
        "api_version": {
            "version": "1"
        },
        "billed_units": {
            "input_tokens": 17,
            "output_tokens": 12
        },
        "tokens": {
            "input_tokens": 83,
            "output_tokens": 12
        }
    }
}

Every response contains the following fields:

- text: the message generated by the model.
- generation_id: the ID corresponding to this response. It can be used together with the Feedback API endpoint to promote great responses and flag bad ones.
- chat_history: the conversation so far, presented in a chat log format.
- finish_reason: the reason generation stopped. Values include:
  - COMPLETE: the model successfully finished generating the message.
  - MAX_TOKENS: the model’s context limit was reached before the generation could be completed.
- meta: metadata about the request, such as the API version, token counts, and billing information.
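
As a rough illustration of how these fields might be consumed, the sketch below works on a plain dictionary shaped like the sample response above rather than on a live SDK response object. The helper name summarize_response is hypothetical and not part of the Cohere SDK.

```python
# A plain dict shaped like the sample response above, trimmed to the fields used here.
sample_response = {
    "text": "The Art of API Design: Crafting Elegant and Powerful Interfaces",
    "generation_id": "dd78b9fe-988b-4c18-9419-8fbdf9968948",
    "finish_reason": "COMPLETE",
    "meta": {
        "billed_units": {"input_tokens": 17, "output_tokens": 12},
        "tokens": {"input_tokens": 83, "output_tokens": 12},
    },
}


def summarize_response(response: dict) -> dict:
    """Hypothetical helper: pull out the fields most callers care about."""
    billed = response["meta"]["billed_units"]
    return {
        "text": response["text"],
        # MAX_TOKENS signals that generation stopped before the message was complete.
        "truncated": response["finish_reason"] == "MAX_TOKENS",
        # Total billed tokens: input plus output.
        "billed_tokens": billed["input_tokens"] + billed["output_tokens"],
    }


summary = summarize_response(sample_response)
print(summary["truncated"], summary["billed_tokens"])
```

Checking finish_reason before using text is a cheap way to notice a cut-off generation and decide whether to retry or shorten the prompt.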

Multi-turn conversations

The user message in the Chat request can be sent together with a chat_history to provide the model with conversational context:

PYTHON

import cohere

co = cohere.Client(api_key="<YOUR API KEY>")

message = "Can you tell me about LLMs?"

response = co.chat(
    model="command-a-03-2025",
    chat_history=[
        {"role": "USER", "text": "Hey, my name is Michael!"},
        {
            "role": "CHATBOT",
            "text": "Hey Michael! How can I help you today?",
        },
    ],
    message=message,
)

print(response.text)  # "Sure thing Michael, LLMs are ..."

Instead of manually building the chat_history, we can grab it from the response of the previous turn.

PYTHON

chat_history = []
max_turns = 10

for _ in range(max_turns):
    # get user input
    message = input("Send the model a message: ")

    # generate a response with the current chat history
    response = co.chat(
        model="command-a-03-2025",
        message=message,
        chat_history=chat_history,
    )

    # print the model's response on this turn
    print(response.text)

    # set the chat history for next turn
    chat_history = response.chat_history
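
Judging by the sample response shown earlier, the returned chat_history is the history that was sent plus the new USER message and the CHATBOT reply. If you prefer to maintain the list yourself, the bookkeeping might look like the sketch below. Here append_turn is a hypothetical helper, and the entries use the role and message keys that appear in the sample response; no API call is made.

```python
def append_turn(chat_history: list, user_message: str, bot_message: str) -> list:
    """Hypothetical helper: return a new history with one more USER/CHATBOT exchange."""
    return chat_history + [
        {"role": "USER", "message": user_message},
        {"role": "CHATBOT", "message": bot_message},
    ]


history = []
history = append_turn(
    history, "Hey, my name is Michael!", "Hey Michael! How can I help you today?"
)
history = append_turn(
    history, "Can you tell me about LLMs?", "Sure thing Michael, LLMs are ..."
)

# Print the conversation as a simple chat log.
for entry in history:
    print(f'{entry["role"]}: {entry["message"]}')
```

Returning a new list instead of mutating the input keeps earlier snapshots of the conversation intact, which can help when replaying or branching a dialogue.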

Using `conversation_id` to Save Chat History

Providing the model with the conversation history is one way to have a multi-turn conversation. For users who do not want to store the conversation history themselves, Cohere offers another option: a user-defined conversation_id.

PYTHON

import cohere

co = cohere.Client("<YOUR API KEY>")

response = co.chat(
    model="command-a-03-2025",
    message="The secret word is 'fish', remember that.",
    conversation_id="user_defined_id_1",
)

answer = response.text

To continue the conversation, send the next message with the same conversation_id:

PYTHON

response2 = co.chat(
    model="command-a-03-2025",
    message="What is the secret word?",
    conversation_id="user_defined_id_1",
)

print(response2.text)  # "The secret word is 'fish'"

Note that the conversation_id should not be used in conjunction with the chat_history. They are mutually exclusive.
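
One way to guard against passing both is to validate the arguments before calling the API. The sketch below is only an illustration: build_chat_kwargs is a hypothetical helper, not part of the Cohere SDK, and it simply assembles the keyword arguments that would then be passed to co.chat.

```python
from typing import Optional


def build_chat_kwargs(
    message: str,
    model: str = "command-a-03-2025",
    chat_history: Optional[list] = None,
    conversation_id: Optional[str] = None,
) -> dict:
    """Hypothetical helper: assemble co.chat arguments, rejecting conflicting options."""
    if chat_history is not None and conversation_id is not None:
        raise ValueError("Pass either chat_history or conversation_id, not both.")

    kwargs = {"model": model, "message": message}
    if chat_history is not None:
        kwargs["chat_history"] = chat_history
    if conversation_id is not None:
        kwargs["conversation_id"] = conversation_id
    return kwargs


kwargs = build_chat_kwargs(
    "What is the secret word?", conversation_id="user_defined_id_1"
)
print(kwargs)
# The result could then be passed along as co.chat(**kwargs).
```

Failing early on the client side makes the conflict visible at the call site instead of relying on however the server handles the combination.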
