Using the Cohere Chat API for Text Generation

The Chat API endpoint is used to generate text with Cohere LLMs. This endpoint facilitates a conversational interface, allowing users to send messages to the model and receive text responses.

PYTHON
import cohere

co = cohere.Client(api_key="<YOUR API KEY>")

response = co.chat(
    model="command-r-plus-08-2024",
    message="Write a title for a blog post about API design. Only output the title text.",
)

print(response.text)
# "The Art of API Design: Crafting Elegant and Powerful Interfaces"

Response Structure

Below is a sample response from the Chat API:

JSON
{
  "text": "The Art of API Design: Crafting Elegant and Powerful Interfaces",
  "generation_id": "dd78b9fe-988b-4c18-9419-8fbdf9968948",
  "chat_history": [
    {
      "role": "USER",
      "message": "Write a title for a blog post about API design. Only output the title text."
    },
    {
      "role": "CHATBOT",
      "message": "The Art of API Design: Crafting Elegant and Powerful Interfaces"
    }
  ],
  "finish_reason": "COMPLETE",
  "meta": {
    "api_version": {
      "version": "1"
    },
    "billed_units": {
      "input_tokens": 17,
      "output_tokens": 12
    },
    "tokens": {
      "input_tokens": 83,
      "output_tokens": 12
    }
  }
}

Every response contains the following fields:

  • text: the generated message from the model.
  • generation_id: the ID corresponding to this response. It can be used together with the Feedback API endpoint to promote great responses and flag bad ones.
  • chat_history: the conversation presented in a chat log format.
  • finish_reason: can be one of the following:
    • COMPLETE: the model successfully finished generating the message.
    • MAX_TOKENS: the model’s context limit was reached before the generation could be completed.
  • meta: contains information such as token counts and billing.
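
As a minimal sketch of how these fields might be consumed, the snippet below parses a trimmed copy of the sample response above with Python's standard json module and checks finish_reason. The helper name summarize_response is hypothetical, and it works on the raw JSON payload rather than on the SDK's response object.

```python
import json

# Trimmed copy of the sample response shown above.
SAMPLE_RESPONSE = """
{
  "text": "The Art of API Design: Crafting Elegant and Powerful Interfaces",
  "generation_id": "dd78b9fe-988b-4c18-9419-8fbdf9968948",
  "finish_reason": "COMPLETE",
  "meta": {
    "billed_units": {"input_tokens": 17, "output_tokens": 12},
    "tokens": {"input_tokens": 83, "output_tokens": 12}
  }
}
"""


def summarize_response(payload: dict) -> dict:
    """Hypothetical helper: pull out the fields most callers need."""
    billed = payload.get("meta", {}).get("billed_units", {})
    return {
        "text": payload["text"],
        "generation_id": payload["generation_id"],
        # MAX_TOKENS means the generation was cut off before it finished.
        "truncated": payload["finish_reason"] == "MAX_TOKENS",
        "billed_tokens": billed.get("input_tokens", 0)
        + billed.get("output_tokens", 0),
    }


summary = summarize_response(json.loads(SAMPLE_RESPONSE))
print(summary["text"])
print(summary["truncated"], summary["billed_tokens"])
```

A caller could use the truncated flag to decide whether to retry or to warn the user that the output is incomplete.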

Multi-turn Conversations

The user message in the Chat request can be sent together with a chat_history to provide the model with conversational context:

PYTHON
import cohere

co = cohere.Client(api_key="<YOUR API KEY>")

message = "Can you tell me about LLMs?"

response = co.chat(
    model="command-r-plus-08-2024",
    # each entry uses the same role/message keys as the chat_history in the response
    chat_history=[
        {"role": "USER", "message": "Hey, my name is Michael!"},
        {
            "role": "CHATBOT",
            "message": "Hey Michael! How can I help you today?",
        },
    ],
    message=message,
)

print(response.text)  # "Sure thing Michael, LLMs are ..."
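
When the earlier turns come from your own storage, a small helper can assemble the list. The sketch below is one possible approach, not part of the Cohere SDK: build_chat_history is a hypothetical name, and the entry shape (role and message keys) mirrors the chat_history in the sample response above.

```python
from typing import Dict, List, Tuple


def build_chat_history(turns: List[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Hypothetical helper: expand (user, chatbot) pairs into chat_history entries."""
    history: List[Dict[str, str]] = []
    for user_message, chatbot_message in turns:
        history.append({"role": "USER", "message": user_message})
        history.append({"role": "CHATBOT", "message": chatbot_message})
    return history


chat_history = build_chat_history(
    [("Hey, my name is Michael!", "Hey Michael! How can I help you today?")]
)
print(chat_history)
```

The resulting list could then be passed as the chat_history argument of co.chat.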

Instead of manually building the chat_history, we can grab it from the response of the previous turn.

PYTHON
import cohere

co = cohere.Client(api_key="<YOUR API KEY>")

chat_history = []
max_turns = 10

for _ in range(max_turns):
    # get user input
    message = input("Send the model a message: ")

    # generate a response with the current chat history
    response = co.chat(
        model="command-r-plus-08-2024",
        message=message,
        chat_history=chat_history,
    )

    # print the model's response on this turn
    print(response.text)

    # set the chat history for next turn
    chat_history = response.chat_history
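
To exercise this history-carrying pattern without calling the API, the turn logic can be separated from the client. The following is a sketch under assumptions: run_turns, fake_chat, and the (message, chat_history) -> (reply, updated history) signature are all hypothetical and are not part of the Cohere SDK.

```python
from typing import Callable, Dict, List, Tuple

History = List[Dict[str, str]]
# Hypothetical signature: takes (message, chat_history),
# returns (reply text, updated history).
ChatFn = Callable[[str, History], Tuple[str, History]]


def run_turns(chat_fn: ChatFn, messages: List[str]) -> Tuple[List[str], History]:
    """Send each message in order, feeding the returned history into the next turn."""
    chat_history: History = []
    replies: List[str] = []
    for message in messages:
        reply, chat_history = chat_fn(message, chat_history)
        replies.append(reply)
    return replies, chat_history


def fake_chat(message: str, chat_history: History) -> Tuple[str, History]:
    """Stand-in for co.chat(): echoes the message and appends both turns."""
    reply = f"echo: {message}"
    updated = chat_history + [
        {"role": "USER", "message": message},
        {"role": "CHATBOT", "message": reply},
    ]
    return reply, updated


replies, final_history = run_turns(fake_chat, ["Hi", "Tell me about LLMs"])
print(replies)
print(len(final_history))
```

With a real client, chat_fn could be a thin wrapper that calls co.chat and returns response.text together with response.chat_history.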

Using conversation_id to Save Chat History

Providing the model with the conversation history is one way to hold a multi-turn conversation. For users who do not wish to store the conversation history themselves, Cohere offers another option that works through a user-defined conversation_id.

PYTHON
import cohere

co = cohere.Client(api_key="<YOUR API KEY>")

response = co.chat(
    model="command-r-plus-08-2024",
    message="The secret word is 'fish', remember that.",
    conversation_id="user_defined_id_1",
)

answer = response.text

To continue the conversation, send the next message with the same conversation_id:

PYTHON
response2 = co.chat(
    model="command-r-plus-08-2024",
    message="What is the secret word?",
    conversation_id="user_defined_id_1",
)

print(response2.text)  # "The secret word is 'fish'"

Note that the conversation_id should not be used in conjunction with the chat_history. They are mutually exclusive.
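
One way to guard against mixing the two modes is to validate the arguments before the request is sent. The sketch below is a client-side convenience only: build_chat_kwargs is a hypothetical helper, not an SDK function, and it makes no claim about how the API itself responds when both parameters are supplied.

```python
from typing import Any, Dict, List, Optional


def build_chat_kwargs(
    message: str,
    model: str = "command-r-plus-08-2024",
    chat_history: Optional[List[Dict[str, str]]] = None,
    conversation_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble chat arguments, rejecting mixed history modes."""
    if chat_history is not None and conversation_id is not None:
        raise ValueError("Pass either chat_history or conversation_id, not both.")
    kwargs: Dict[str, Any] = {"model": model, "message": message}
    if chat_history is not None:
        kwargs["chat_history"] = chat_history
    if conversation_id is not None:
        kwargs["conversation_id"] = conversation_id
    return kwargs


kwargs = build_chat_kwargs(
    "What is the secret word?", conversation_id="user_defined_id_1"
)
print(kwargs)
```

The returned dictionary could then be unpacked into the call as co.chat(**kwargs).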
