Migrating from the Generate API to the Chat API

Users of Amazon SageMaker, Amazon Bedrock, and Oracle Cloud Infrastructure (OCI) don’t need to migrate. Please refer to platform-specific documentation for recommended usage of Cohere Command models.

With our newest planned updates, Generate will be relegated to legacy status. It will still be available for use, but will no longer be updated with new features.

To use Cohere’s generative functionality, we recommend the Chat endpoint. This guide outlines how to migrate from Generate to Chat to get improved performance and avoid potential interruptions.

Overview

The main difference between Chat and Generate is that the Chat endpoint adds a default preamble to the user prompt, which improves the quality of the model’s output.

Here’s an example:

PYTHON
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder; substitute your own key

# BEFORE
co.generate(prompt="Write me three bullet points for my resume")

# AFTER
co.chat(message="Write me three bullet points for my resume")
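
One practical detail when swapping the calls is where the generated text sits on the response object. The sketch below assumes the Cohere Python SDK, in which a Generate response exposes its text at `generations[0].text` and a Chat response exposes it at `text`. Those attribute names are not stated in this guide, so verify them against your SDK version. `generate_text` and `chat_text` are hypothetical helper names, and `co` stands for any already-constructed client.

```python
def generate_text(co, prompt, **params):
    """Legacy path: call Generate and return the first generation's text.

    Assumes the response exposes `generations[0].text`.
    """
    response = co.generate(prompt=prompt, **params)
    return response.generations[0].text


def chat_text(co, prompt, **params):
    """Migrated path: send the same prompt through Chat's `message` parameter.

    Assumes the response exposes `text`.
    """
    response = co.chat(message=prompt, **params)
    return response.text
```

Keeping both helpers behind the same signature lets calling code switch from one to the other without touching how the result is consumed.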

Unsupported Parameters

The following parameters were previously available in Generate but are not supported by Chat.

  • num_generations: To achieve the same outcome as num_generations=n in Chat, please call co.chat() n times.
  • stop_sequences and end_sequences: Going forward, we ask users to trim model outputs on their side instead of setting a stop sequence.
  • return_likelihoods: This is not supported in the Chat endpoint.
  • logit_bias: This is not supported in the Chat endpoint.
  • truncate: This is not supported in the Chat endpoint.
  • preset: This is not supported in the Chat endpoint. Please create and store presets on your end instead of storing them via our endpoints.
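
The first two workarounds above, calling Chat n times and trimming output on the client side, can be sketched as plain helpers. This is a minimal illustration, not an official replacement: `chat_n_times`, `trim_at_stop_sequences`, and the `keep_sequence` flag are hypothetical names, and `chat_fn` is assumed to be any callable that accepts a `message` keyword (for example, a thin wrapper around `co.chat`).

```python
def chat_n_times(chat_fn, message, n, **params):
    """Emulate num_generations=n by issuing n independent chat calls."""
    return [chat_fn(message=message, **params) for _ in range(n)]


def trim_at_stop_sequences(text, stop_sequences, keep_sequence=False):
    """Cut `text` at the earliest occurrence of any stop sequence.

    With keep_sequence=False the matched sequence is dropped; with
    keep_sequence=True the text is cut just after it. If no sequence
    is found, the text is returned unchanged.
    """
    best_start = None
    best_end = None
    for stop in stop_sequences:
        if not stop:
            continue  # an empty string would match at index 0
        start = text.find(stop)
        if start != -1 and (best_start is None or start < best_start):
            best_start = start
            best_end = start + len(stop)
    if best_start is None:
        return text
    return text[:best_end] if keep_sequence else text[:best_start]
```

Because trimming happens after the full response arrives, the model may still spend tokens past the stop point, so setting `max_tokens` conservatively can help keep responses short.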

Example for Migrating from Generate to Chat

Here are some steps you can take to ensure that your migration goes smoothly:

  • Ensure that you’re using the message parameter instead of the prompt parameter. The primary way of communicating with the Chat API is via message. Going forward, send the contents of your prompt through message and not through prompt.
  • No changes have been made to k, p, frequency_penalty, presence_penalty, max_tokens, stream, or temperature, so those should behave as expected.
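
If your code builds Generate arguments as a dictionary, the steps above can be applied mechanically. The following is a sketch under stated assumptions: `generate_kwargs_to_chat_kwargs` is a hypothetical helper, the unsupported set is copied from the list in this guide, and any parameter the guide does not mention (such as a model name) is passed through unchanged, which you should confirm against the Chat API reference.

```python
import warnings

# Parameters this guide lists as unsupported by Chat.
UNSUPPORTED_IN_CHAT = {
    "num_generations",
    "stop_sequences",
    "end_sequences",
    "return_likelihoods",
    "logit_bias",
    "truncate",
    "preset",
}


def generate_kwargs_to_chat_kwargs(generate_kwargs):
    """Translate keyword arguments for Generate into keyword arguments for Chat.

    - `prompt` is renamed to `message`.
    - Parameters listed as unsupported are dropped with a warning.
    - Everything else (k, p, temperature, max_tokens, ...) is kept as-is.
    """
    chat_kwargs = {}
    for name, value in generate_kwargs.items():
        if name == "prompt":
            chat_kwargs["message"] = value
        elif name in UNSUPPORTED_IN_CHAT:
            warnings.warn(
                f"'{name}' is not supported by Chat and was dropped",
                stacklevel=2,
            )
        else:
            chat_kwargs[name] = value
    return chat_kwargs
```

A call such as `co.chat(**generate_kwargs_to_chat_kwargs(old_kwargs))` then reuses existing settings, and the warnings point to the places where client-side handling is needed.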

Fine-tuned Models

Models that were fine-tuned to use the Generate API will work with the Chat API. Remember not to use the chat_history parameter, as this parameter is only supported for models fine-tuned for Chat.

We will not delete or disable the Generate endpoint, but we suggest fine-tuning models for use with the Chat endpoint in the future.

FAQs About Migration

When will the Generate endpoint stop being supported?

At this time, we will continue to support requests to Generate, but we will not be making feature updates. For this reason, Generate is being marked as a legacy API endpoint.