Using Cohere models via the OpenAI SDK

The Compatibility API allows developers to use Cohere’s models through OpenAI’s SDK.

This makes it easy to switch existing OpenAI-based applications over to Cohere's models while continuing to use the OpenAI SDK, with no major refactoring required.

This is a quickstart guide to help you get started with the Compatibility API.

Installation

First, install the OpenAI SDK and import the package.

Then, create a client and configure it with the compatibility API base URL and your Cohere API key.

$pip install openai
PYTHON
from openai import OpenAI

client = OpenAI(
    base_url="https://api.cohere.ai/compatibility/v1",
    api_key="COHERE_API_KEY",
)

Basic chat completions

Here’s a basic example of using the Chat Completions API.

PYTHON
from openai import OpenAI

client = OpenAI(
    base_url="https://api.cohere.ai/compatibility/v1",
    api_key="COHERE_API_KEY",
)

completion = client.chat.completions.create(
    model="command-r7b-12-2024",
    messages=[
        {
            "role": "user",
            "content": "Write a haiku about recursion in programming.",
        },
    ],
)

print(completion.choices[0].message)

Example response (via the Python SDK):

ChatCompletionMessage(content="Recursive loops,\nUnraveling code's depths,\nEndless, yet complete.", refusal=None, role='assistant', audio=None, function_call=None, tool_calls=None)

Chat with streaming

To stream the response, set the stream parameter to True.

PYTHON
from openai import OpenAI

client = OpenAI(
    base_url="https://api.cohere.ai/compatibility/v1",
    api_key="COHERE_API_KEY",
)

stream = client.chat.completions.create(
    model="command-r7b-12-2024",
    messages=[
        {
            "role": "user",
            "content": "Write a haiku about recursion in programming.",
        },
    ],
    stream=True,
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")

Example response (via the Python SDK):

Recursive call,
Unraveling, line by line,
Solving, then again.

State management

For state management, use the messages parameter to build the conversation history.

You can include a system message via the developer role, along with multiple chat turns between the user and assistant.

PYTHON
from openai import OpenAI

client = OpenAI(
    base_url="https://api.cohere.ai/compatibility/v1",
    api_key="COHERE_API_KEY",
)

completion = client.chat.completions.create(
    messages=[
        {
            "role": "developer",
            "content": "You must respond in the style of a pirate.",
        },
        {
            "role": "user",
            "content": "What's 2 + 2.",
        },
        {
            "role": "assistant",
            "content": "Arrr, matey! 2 + 2 be 4, just like a doubloon in the sea!",
        },
        {
            "role": "user",
            "content": "Add 30 to that.",
        },
    ],
    model="command-r7b-12-2024",
)

print(completion.choices[0].message)

Example response (via the Python SDK):

ChatCompletionMessage(content='Aye aye, captain! 4 + 30 be 34, a treasure to behold!', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=None)

Structured outputs

The Structured Outputs feature allows you to specify the schema of the model response. It guarantees that the response will strictly follow the schema.

To use it, set the response_format parameter to the JSON Schema of the desired output.

PYTHON
from openai import OpenAI

client = OpenAI(
    base_url="https://api.cohere.ai/compatibility/v1",
    api_key="COHERE_API_KEY",
)

completion = client.beta.chat.completions.parse(
    model="command-r7b-12-2024",
    messages=[
        {
            "role": "user",
            "content": "Generate a JSON describing a book.",
        }
    ],
    response_format={
        "type": "json_object",
        "schema": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "author": {"type": "string"},
                "publication_year": {"type": "integer"},
            },
            "required": ["title", "author", "publication_year"],
        },
    },
)

print(completion.choices[0].message.content)

Example response (via the Python SDK):

{
"title": "The Great Gatsby",
"author": "F. Scott Fitzgerald",
"publication_year": 1925
}

Tool use (function calling)

To use tool calling, pass a list of tools to the tools parameter in the API call.

Setting the strict parameter to True in the tool calling step guarantees that every generated tool call follows the specified tool schema.
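As a minimal sketch of where strict would sit, assuming it goes inside the function object (as in OpenAI's function-calling schema; the tool definition below mirrors the get_flight_info example that follows):

```python
# Sketch: a tool definition with strict schema enforcement enabled.
# The "strict" flag sits inside the "function" object (assumed placement,
# following OpenAI's function-calling schema).
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_flight_info",
            "description": "Get flight information between two cities or airports",
            "strict": True,  # enforce the schema on every generated tool call
            "parameters": {
                "type": "object",
                "properties": {
                    "loc_origin": {"type": "string"},
                    "loc_destination": {"type": "string"},
                },
                "required": ["loc_origin", "loc_destination"],
            },
        },
    }
]
```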

PYTHON
from openai import OpenAI

client = OpenAI(
    base_url="https://api.cohere.ai/compatibility/v1",
    api_key="COHERE_API_KEY",
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_flight_info",
            "description": "Get flight information between two cities or airports",
            "parameters": {
                "type": "object",
                "properties": {
                    "loc_origin": {
                        "type": "string",
                        "description": "The departure airport, e.g. MIA",
                    },
                    "loc_destination": {
                        "type": "string",
                        "description": "The destination airport, e.g. NYC",
                    },
                },
                "required": ["loc_origin", "loc_destination"],
            },
        },
    }
]

messages = [
    {"role": "developer", "content": "Today is April 30th"},
    {
        "role": "user",
        "content": "When is the next flight from Miami to Seattle?",
    },
    {
        "role": "assistant",
        "tool_calls": [
            {
                "function": {
                    "arguments": '{ "loc_destination": "Seattle", "loc_origin": "Miami" }',
                    "name": "get_flight_info",
                },
                "id": "get_flight_info0",
                "type": "function",
            }
        ],
    },
    {
        "role": "tool",
        "name": "get_flight_info",
        "tool_call_id": "get_flight_info0",
        "content": "Miami to Seattle, May 1st, 10 AM.",
    },
]

completion = client.chat.completions.create(
    model="command-r7b-12-2024",
    messages=messages,
    tools=tools,
    temperature=0.7,
)

print(completion.choices[0].message)

Example response (via the Python SDK):

ChatCompletionMessage(content='The next flight from Miami to Seattle is on May 1st, 10 AM.', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=None)

Embeddings

You can generate text embeddings with the Embeddings API by passing a list of strings as the input parameter. You can also use the encoding_format parameter to specify the format of the generated embeddings, either float or base64.

PYTHON
from openai import OpenAI

client = OpenAI(
    base_url="https://api.cohere.ai/compatibility/v1",
    api_key="COHERE_API_KEY",
)

response = client.embeddings.create(
    input=["Hello world!"],
    model="embed-multilingual-v3.0",
    encoding_format="float",
)

print(
    response.data[0].embedding[:5]
)  # Display the first 5 dimensions

Example response (via the Python SDK):

[0.0045051575, 0.046905518, 0.025543213, 0.009651184, -0.024993896]
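If you request encoding_format="base64" and work with the raw response (for example via a direct HTTP call; the Python SDK may decode it for you), the embedding is a base64 string of packed floats. A minimal decoding sketch, assuming little-endian float32 packing:

```python
import base64
import struct

def decode_base64_embedding(b64: str) -> list[float]:
    """Decode a base64 string into a list of floats
    (assumes little-endian float32 packing)."""
    raw = base64.b64decode(b64)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))

# Round-trip demo with synthetic data (not a real API response):
encoded = base64.b64encode(struct.pack("<3f", 0.1, -0.5, 2.0)).decode()
print(decode_base64_embedding(encoded))  # approximately [0.1, -0.5, 2.0]
```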

Supported parameters

The following is the list of parameters supported in the Compatibility API, including those that are not explicitly demonstrated in the examples above:

Chat completions

  • model
  • messages
  • stream
  • response_format
  • tools
  • temperature
  • max_tokens
  • stop
  • seed
  • top_p
  • frequency_penalty
  • presence_penalty
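As an illustration of how the sampling and length parameters above could be combined in a single request (hypothetical values; pass the dict to client.chat.completions.create(**request) with a client configured as in the earlier examples):

```python
# Hypothetical parameter values for one chat completions request.
# All keys below are from the supported-parameters list.
request = dict(
    model="command-r7b-12-2024",
    messages=[{"role": "user", "content": "Name three uses of recursion."}],
    temperature=0.3,       # lower randomness
    max_tokens=150,        # cap the response length
    stop=["\n\n"],         # stop at the first blank line
    seed=42,               # best-effort reproducibility
    top_p=0.9,             # nucleus sampling
    frequency_penalty=0.2, # discourage repeated tokens
    presence_penalty=0.0,  # no penalty for reusing topics
)
```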

Embeddings

  • input
  • model
  • encoding_format

Unsupported parameters

The following parameters are not supported in the Compatibility API:

Chat completions

  • tool_choice
  • store
  • reasoning_effort
  • metadata
  • logit_bias
  • logprobs
  • top_logprobs
  • max_completion_tokens
  • n
  • modalities
  • prediction
  • audio
  • service_tier
  • stream_options
  • parallel_tool_calls
  • user

Embeddings

  • dimensions
  • user