The Compatibility API allows developers to use Cohere’s models through OpenAI’s SDK.
It makes it easy to switch existing OpenAI-based applications to use Cohere’s models while still maintaining the use of OpenAI SDK — no big refactors needed.
The supported libraries are:
This is a quickstart guide to help you get started with the Compatibility API.
First, install the OpenAI SDK and import the package.
Then, create a client and configure it with the compatibility API base URL and your Cohere API key.
Here’s a basic example of using the Chat Completions API.
Example response (via the Python SDK):
To stream the response, set the stream parameter to True.
Example response (via the Python SDK):
For state management, use the messages parameter to build the conversation history.
You can include a system message via the developer role and the multiple chat turns between the user and assistant.
Example response (via the Python SDK):
The Structured Outputs feature allows you to specify the schema of the model response. It guarantees that the response will strictly follow the schema.
To use it, set the response_format parameter to the JSON Schema of the desired output.
Example response (via the Python SDK):
You can utilize the tool use feature by passing a list of tools to the tools parameter in the API call.
Specifying the strict parameter to True in the tool calling step will guarantee that every generated tool call follows the specified tool schema.
Example response (via the Python SDK):
You can generate text embeddings Embeddings API by passing a list of strings as the input parameter. You can also specify in encoding_format the format of embeddings to be generated. Can be either float or base64.
Example response (via the Python SDK):
You can pass an audio file to the Audio Transcriptions API to to create a transcription of the audio for files up to 25MB in size.
The following is the list of supported parameters in the Compatibility API, including those that are not explicitly demonstrated in the examples above:
modelmessagesstreamreasoning_effort (Only “none” and “high” are currently supported.)response_formattoolstemperaturemax_tokensstopseedtop_pfrequency_penaltypresence_penaltyCurrently, only none and high are supported for reasoning_effort.
These correspond to enabling or disabling thinking in the Cohere Chat API.
Passing medium or low is not supported at this time.
inputmodelencoding_formatmodel (required)language (required)file (required, must be the last parameter in the HTTP form-data request)response_format (only “json” is supported)temperaturePlease take note the following:
language is required in the Cohere Audio Transcriptions API but optional in the OpenAI Audio
Transcriptions API.file must be the last parameter in the cURL call.The following parameters are not supported in the Compatibility API:
storemetadatalogit_biastop_logprobsnmodalitiespredictionaudioservice_tierparallel_tool_callsdimensionsuserstreamprompttimestamp_granularitieschunking_strategyincludeknown_speaker_namesknown_speaker_referencesParameters that are uniquely available on the Cohere API but not on the OpenAI SDK are not supported.
Chat endpoint:
connectorsdocumentscitation_optionsEmbed endpoint:
input_typeimagestruncate