Migrating from the Generate API to the Chat API

The Generate API is slated for deprecation on Aug 26, 2025.

In order to use Cohere generative functionality, we recommend using the Chat endpoint. This guide outlines how to migrate from Generate to Chat in order to get improved performance and to eliminate any potential interruptions.

Overview

While the Generate endpoint works with raw prompts, the Chat endpoint is designed for a conversational interface between a User and an Assistant.

Here’s an example:

PYTHON

1 import cohere
2 
3 co = cohere.ClientV2()
4 
5 # BEFORE
6 co.generate(prompt="Write me three bullet points for my resume")
7 
8 # AFTER
9 co.chat(
10     model="command-a-03-2025",
11     messages=[
12         {
13             "role": "user",
14             "content": "Write me three bullet points for my resume",
15         }
16     ],
17 )

Unsupported Features

The following parameters were previously available in Generate but are not supported by Chat.

num_generations: To achieve the same outcome as num_generations=n in Chat, please call co.chat() n times.
stop_sequences and end_sequences: Going forward, we ask users to trim model outputs on their side instead of setting a stop sequence.
logit_bias: This is not supported in the Chat endpoint.
truncate: This is not supported in the Chat endpoint.
preset: This is not supported in the Chat endpoint. Please create and store presets on your end instead of storing them via our endpoints.

Example for Migrating from Generate to Chat

Here are some steps you can take to ensure that your migration goes smoothly:

Ensure that you’re using the message parameter instead of the prompt parameter. The primary way of communicating with the Chat API is via message. Going forward, send the contents of your prompt through message and not through prompt.
No changes have been made to k, p, frequency_penalty, presence_penalty, max_tokens, stream, or temperature, so those should behave as expected.

Fine-tuned Models

Models that were fine-tuned to use the Generate API will work with the Chat API. Remember not to use the chat_history parameter, as this parameter is only supported for models fine-tuned for Chat.

We will not delete or disable the Generate endpoint, but we suggest fine-tuning models for use with the Chat endpoint in the future.