Chat (V1)
Generates a text response to a user message. To learn how to use the Chat API and retrieval-augmented generation (RAG), follow our Text Generation guides.
Headers
Bearer authentication of the form Bearer <token>, where token is your auth token.
Pass text/event-stream to receive the streamed response as server-sent events. The default is \n-delimited events.
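The two headers above are all a request needs before a body is attached. A minimal sketch of building them (the token value is a placeholder, and the helper name is illustrative):

```python
def build_headers(token: str, stream: bool = False) -> dict:
    """Build request headers; Accept opts into SSE streaming."""
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    if stream:
        # Server-sent events instead of the default \n-delimited events
        headers["Accept"] = "text/event-stream"
    return headers
```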
Request
Text input for the model to respond to.
Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker/Bedrock, Private Deployments
The name of a compatible Cohere model or the ID of a fine-tuned model.
Compatible Deployments: Cohere Platform, Private Deployments
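Together, the message text and the model name form the smallest useful request body. A sketch of the JSON payload (the default model name here is illustrative, not a recommendation):

```python
import json


def build_chat_request(message: str, model: str = "command-r") -> str:
    """Serialize a minimal Chat V1 request body."""
    payload = {"message": message, "model": model}
    return json.dumps(payload)
```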
An alternative to chat_history. Providing a conversation_id creates or resumes a persisted conversation with the specified ID. The ID can be any non-empty string.
Compatible Deployments: Cohere Platform
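Because the server persists the conversation under the given ID, a client can keep multi-turn state without resending chat_history. A sketch of reusing one ID across turns (the helper and the UUID choice are illustrative; any non-empty string works):

```python
import uuid


def build_turn(message: str, conversation_id: str) -> dict:
    """Attach a persisted-conversation ID to a request body."""
    if not conversation_id:
        raise ValueError("conversation_id must be a non-empty string")
    return {"message": message, "conversation_id": conversation_id}


conv_id = str(uuid.uuid4())  # a UUID avoids accidental collisions
first = build_turn("What is RAG?", conv_id)
second = build_turn("Summarize that in one line.", conv_id)  # resumes the same conversation
```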
Defaults to "accurate". Dictates the approach taken to generating citations as part of the RAG flow by allowing the user to specify whether they want "accurate" results, "fast" results, or no results.
Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker/Bedrock, Private Deployments
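The two named modes can be validated client-side before sending. A sketch, assuming the field is called citation_quality (the parameter name and helper are assumptions; the accepted values come from the description above):

```python
VALID_CITATION_MODES = {"accurate", "fast"}  # "accurate" is the default


def with_citation_quality(payload: dict, mode: str = "accurate") -> dict:
    """Return a copy of the payload with a validated citation mode attached."""
    if mode not in VALID_CITATION_MODES:
        raise ValueError(f"unsupported citation mode: {mode!r}")
    return {**payload, "citation_quality": mode}
```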
The maximum number of tokens the model will generate as part of the response. Note: Setting a low value may result in incomplete generations.
Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker/Bedrock, Private Deployments
The maximum number of input tokens to send to the model. If not specified, max_input_tokens is the model's context length limit minus a small buffer. Input will be truncated according to the prompt_truncation parameter.
Compatible Deployments: Cohere Platform
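The input budget can be illustrated with a toy truncation helper (whitespace "tokens" stand in for real model tokenization, and keeping the most recent tokens is only an assumption about the truncation strategy; the actual behavior depends on prompt_truncation):

```python
def truncate_input(tokens: list[str], max_input_tokens: int) -> list[str]:
    """Drop tokens beyond the input budget, keeping the most recent ones."""
    if len(tokens) <= max_input_tokens:
        return tokens
    # Keeping the tail mirrors dropping the oldest context first.
    return tokens[-max_input_tokens:]
```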
Ensures only the top k most likely tokens are considered for generation at each step. Defaults to 0, with a minimum value of 0 and a maximum value of 500.
Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker/Bedrock, Private Deployments
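Top-k sampling itself can be illustrated in a few lines: restrict the candidate set to the k highest-probability tokens, then renormalize. A toy sketch (per the default above, k = 0 disables the filter):

```python
def top_k_filter(probs: dict[str, float], k: int) -> dict[str, float]:
    """Keep the k most likely tokens and renormalize; k == 0 means no filtering."""
    if k == 0:
        return dict(probs)
    # Sort tokens by probability, descending, and keep the top k.
    kept = dict(sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k])
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}
```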
If specified, the backend will make a best effort to sample tokens deterministically, such that repeated requests with the same seed and parameters should return the same result. However, determinism cannot be totally guaranteed.
Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker/Bedrock, Private Deployments
A list of up to 5 strings that the model will use to stop generation. If the model generates a string that matches any string in the list, it will stop generating tokens and return the generated text up to that point, not including the stop sequence.
Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker/Bedrock, Private Deployments
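The stop-sequence semantics described above (return text up to, but not including, the earliest match) can be mimicked client-side; this helper is illustrative, not part of the API:

```python
def apply_stop_sequences(text: str, stop_sequences: list[str]) -> str:
    """Cut generated text at the earliest stop sequence, excluding the match."""
    if len(stop_sequences) > 5:
        raise ValueError("at most 5 stop sequences are allowed")
    cut = len(text)
    for seq in stop_sequences:
        idx = text.find(seq)
        if idx != -1:
            cut = min(cut, idx)  # earliest match wins
    return text[:cut]
```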
When enabled, the user’s prompt will be sent to the model without any pre-processing.
Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker/Bedrock, Private Deployments
Forces the chat to be single-step. Defaults to false.
Defaults to false. When true, the response will only contain a list of generated search queries; no search will take place, and no reply from the model to the user's message will be generated.
Compatible Deployments: Cohere Platform, Azure, AWS Sagemaker/Bedrock, Private Deployments
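A request that asks only for generated search queries can be sketched like this, assuming the field is named search_queries_only (the field and default model name are assumptions for illustration):

```python
import json


def build_query_generation_request(message: str, model: str = "command-r") -> str:
    """Serialize a request that returns generated search queries only."""
    # With this flag true, no search runs and no reply text is produced;
    # the response carries only the generated search queries.
    return json.dumps({
        "message": message,
        "model": model,
        "search_queries_only": True,
    })
```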
Response
A list of previous messages between the user and the model, meant to give the model conversational context for responding to the user's message.