Models

An Overview of Cohere’s Models

Cohere offers a variety of models that cover many different use cases. If you need more customization, you can fine-tune a model for your specific use case.

Cohere models are currently available through Cohere’s own API and on Amazon Bedrock, Amazon SageMaker, Azure AI Studio, and Oracle OCI Generative AI Service.

At the end of each major section below, you’ll find technical details about how to call a given model on a particular platform.

What Can These Models Be Used For?

In this section, we’ll provide some high-level context on Cohere’s offerings, and what the strengths of each are.

  • The Command family of models includes Command, Command R, and Command R+. Together, they are the text-generation LLMs powering conversational agents, summarization, copywriting, and similar use cases. They work through the Chat endpoint, which can be used with or without retrieval augmented generation (RAG).
  • Rerank is the fastest way to inject the intelligence of a language model into an existing search system. It can be accessed via the Rerank endpoint.
  • Embed improves the accuracy of search, classification, clustering, and RAG results. It also powers the Embed and Classify endpoints.

Command

Command is Cohere’s default generation model. It takes a user instruction (or command) and generates text that follows the instruction. Our Command models also have conversational capabilities, which makes them well suited for chat applications.

| Model Name | Description | Modality | Context Length | Maximum Output Tokens | Endpoints |
| --- | --- | --- | --- | --- | --- |
| command-r7b-12-2024 | command-r7b-12-2024 is a small, fast update of the Command R+ model, delivered in December 2024. It excels at RAG, tool use, agents, and similar tasks requiring complex reasoning and multiple steps. | Text | 128k | 4k | Chat |
| command-r-plus-08-2024 | command-r-plus-08-2024 is an update of the Command R+ model, delivered in August 2024. Find more information here. | Text | 128k | 4k | Chat |
| command-r-plus-04-2024 | Command R+ is an instruction-following conversational model that performs language tasks at a higher quality, more reliably, and with a longer context than previous models. It is best suited for complex RAG workflows and multi-step tool use. | Text | 128k | 4k | Chat |
| command-r-plus | command-r-plus is an alias for command-r-plus-04-2024, so if you use command-r-plus in the API, that’s the model you’re pointing to. | Text | 128k | 4k | Chat |
| command-r-08-2024 | command-r-08-2024 is an update of the Command R model, delivered in August 2024. Find more information here. | Text | 128k | 4k | Chat |
| command-r-03-2024 | Command R is an instruction-following conversational model that performs language tasks at a higher quality, more reliably, and with a longer context than previous models. It can be used for complex workflows like code generation, retrieval augmented generation (RAG), tool use, and agents. | Text | 128k | 4k | Chat |
| command-r | command-r is an alias for command-r-03-2024, so if you use command-r in the API, that’s the model you’re pointing to. | Text | 128k | 4k | Chat |
| command | An instruction-following conversational model that performs language tasks with high quality, more reliably and with a longer context than our base generative models. | Text | 4k | 4k | Chat, Summarize |
| command-nightly | To reduce the time between major releases, we put out nightly versions of command models. For command, that is command-nightly. Be advised that command-nightly is the latest, most experimental, and (possibly) unstable version of its default counterpart. Nightly releases are updated regularly, without warning, and are not recommended for production use. | Text | 128k | 4k | Chat |
| command-light | A smaller, faster version of command. Almost as capable, but a lot faster. | Text | 4k | 4k | Chat, Summarize |
| command-light-nightly | To reduce the time between major releases, we put out nightly versions of command models. For command-light, that is command-light-nightly. Be advised that command-light-nightly is the latest, most experimental, and (possibly) unstable version of its default counterpart. Nightly releases are updated regularly, without warning, and are not recommended for production use. | Text | 4k | 4k | Chat |
| c4ai-aya-expanse-8b | Aya Expanse is a highly performant 8B multilingual model, designed to rival monolingual performance through innovations in instruction tuning with data arbitrage, preference training, and model merging. Serves 23 languages. | Text | 8k | 4k | Chat |
| c4ai-aya-expanse-32b | Aya Expanse is a highly performant 32B multilingual model, designed to rival monolingual performance through innovations in instruction tuning with data arbitrage, preference training, and model merging. Serves 23 languages. | Text | 128k | 4k | Chat |
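
Because command-r-plus and command-r are aliases rather than pinned versions, it can help to make the mapping explicit in application code, for example when logging which model version actually served a request. The following is a minimal sketch using only the alias pairs listed in the table above; the names `COMMAND_ALIASES` and `resolve_model` are hypothetical helpers, not part of any Cohere SDK.

```python
# Alias -> pinned model name, copied from the Command table above.
# COMMAND_ALIASES and resolve_model are illustrative names, not Cohere APIs.
COMMAND_ALIASES = {
    "command-r-plus": "command-r-plus-04-2024",
    "command-r": "command-r-03-2024",
}


def resolve_model(name: str) -> str:
    """Return the pinned model an alias points to, or the name unchanged."""
    return COMMAND_ALIASES.get(name, name)


if __name__ == "__main__":
    # An alias resolves to its pinned version; a pinned name passes through.
    print(resolve_model("command-r-plus"))
    print(resolve_model("command-r-08-2024"))
```

Note that the aliases point at the earlier 2024 releases, not at the 08-2024 updates, so the updated models have to be requested by their full names.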

Using Command Models on Different Platforms

The table below lists the identifiers to use for Cohere Command models on Amazon Bedrock, Amazon SageMaker, Azure AI Studio, and Oracle OCI Generative AI Service.

| Model Name | Amazon Bedrock Model ID | Amazon SageMaker | Azure AI Studio Model ID | Oracle OCI Generative AI Service |
| --- | --- | --- | --- | --- |
| command-r7b-12-2024 | (Coming soon) | (Coming soon) | (Coming soon) | (Coming soon) |
| command-r-plus | cohere.command-r-plus-v1:0 | Unique per deployment | Unique per deployment | cohere.command-r-plus v1.2 |
| command-r | cohere.command-r-v1:0 | Unique per deployment | Unique per deployment | cohere.command-r-16k v1.2 |
| command | cohere.command-text-v14 | N/A | N/A | cohere.command v15.6 |
| command-nightly | N/A | N/A | N/A | N/A |
| command-light | cohere.command-light-text-v14 | N/A | N/A | cohere.command-light v15.6 |
| command-light-nightly | N/A | N/A | N/A | N/A |
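
When the same application targets more than one platform, the table above can be carried into code as a lookup. This is a sketch for the Amazon Bedrock column only; the dictionary and function names are hypothetical, and models marked N/A or "Coming soon" are simply left out so that the lookup returns `None` for them.

```python
from typing import Optional

# Cohere model name -> Amazon Bedrock model ID, copied from the table above.
# Models listed as N/A or "(Coming soon)" are intentionally omitted.
BEDROCK_MODEL_IDS = {
    "command-r-plus": "cohere.command-r-plus-v1:0",
    "command-r": "cohere.command-r-v1:0",
    "command": "cohere.command-text-v14",
    "command-light": "cohere.command-light-text-v14",
}


def bedrock_model_id(cohere_name: str) -> Optional[str]:
    """Return the Bedrock model ID for a Cohere model, or None if unavailable."""
    return BEDROCK_MODEL_IDS.get(cohere_name)


if __name__ == "__main__":
    print(bedrock_model_id("command-r"))
    print(bedrock_model_id("command-nightly"))  # not offered on Bedrock
```

The SageMaker and Azure AI Studio columns cannot be hard-coded this way, since their identifiers are unique per deployment.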

Embed

These models can be used to generate embeddings from text or classify it based on various parameters. Embeddings can be used for estimating semantic similarity between two sentences, choosing a sentence which is most likely to follow another sentence, or categorizing user feedback, while outputs from the Classify endpoint can be used for any classification or analysis task. The Representation model comes with a variety of helper functions, such as for detecting the language of an input.

| Model Name | Description | Modalities | Dimensions | Context Length | Similarity Metric | Endpoints |
| --- | --- | --- | --- | --- | --- | --- |
| embed-english-v3.0 | A model that allows for text to be classified or turned into embeddings. English only. | Text, Images | 1024 | 512 | Cosine Similarity | Embed, Embed Jobs |
| embed-english-light-v3.0 | A smaller, faster version of embed-english-v3.0. Almost as capable, but a lot faster. English only. | Text, Images | 384 | 512 | Cosine Similarity | Embed, Embed Jobs |
| embed-multilingual-v3.0 | Provides multilingual classification and embedding support. See supported languages here. | Text, Images | 1024 | 512 | Cosine Similarity | Embed, Embed Jobs |
| embed-multilingual-light-v3.0 | A smaller, faster version of embed-multilingual-v3.0. Almost as capable, but a lot faster. Supports multiple languages. | Text, Images | 384 | 512 | Cosine Similarity | Embed, Embed Jobs |
| embed-english-v2.0 | Our older embeddings model that allows for text to be classified or turned into embeddings. English only. | Text | 4096 | 512 | Cosine Similarity | Classify, Embed |
| embed-english-light-v2.0 | A smaller, faster version of embed-english-v2.0. Almost as capable, but a lot faster. English only. | Text | 1024 | 512 | Cosine Similarity | Classify, Embed |
| embed-multilingual-v2.0 | Provides multilingual classification and embedding support. See supported languages here. | Text | 768 | 256 | Dot Product Similarity | Classify, Embed |

In this table we’ve listed older v2.0 models alongside the newer v3.0 models, but we recommend you use the v3.0 versions.
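
The Similarity Metric column matters when you compare embeddings yourself. As a rough illustration, the sketch below implements both metrics in plain Python on toy 3-dimensional vectors; real embeddings would have the dimensions listed in the table (for example 1024), and the vectors here are made up for the example rather than produced by any Embed model.

```python
import math
from typing import Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product similarity; sensitive to vector magnitude."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; depends only on the angle between the vectors."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot_product(a, b) / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy vectors (assumed for illustration): b points the same way as a
    # but is twice as long; c is orthogonal to a.
    a = [1.0, 0.0, 1.0]
    b = [2.0, 0.0, 2.0]
    c = [0.0, 1.0, 0.0]
    print(cosine_similarity(a, b), dot_product(a, b))
    print(cosine_similarity(a, c), dot_product(a, c))
```

Scaling a vector changes its dot product with other vectors but not its cosine similarity, which is one reason to use the metric listed for the model rather than mixing the two.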

Using Embed Models on Different Platforms

The table below lists the identifiers to use for Cohere Embed models on Amazon Bedrock, Amazon SageMaker, Azure AI Studio, and Oracle OCI Generative AI Service.

| Model Name | Amazon Bedrock Model ID | Amazon SageMaker | Azure AI Studio Model ID | Oracle OCI Generative AI Service |
| --- | --- | --- | --- | --- |
| embed-english-v3.0 | cohere.embed-english-v3 | Unique per deployment | Unique per deployment | cohere.embed-english-v3.0 |
| embed-english-light-v3.0 | N/A | N/A | N/A | cohere.embed-english-light-v3.0 |
| embed-multilingual-v3.0 | cohere.embed-multilingual-v3 | Unique per deployment | Unique per deployment | cohere.embed-multilingual-v3.0 |
| embed-multilingual-light-v3.0 | N/A | N/A | N/A | cohere.embed-multilingual-light-v3.0 |
| embed-english-v2.0 | N/A | N/A | N/A | N/A |
| embed-english-light-v2.0 | N/A | N/A | N/A | cohere.embed-english-light-v2.0 |
| embed-multilingual-v2.0 | N/A | N/A | N/A | N/A |

Rerank

The Rerank model improves an existing search system by re-ordering the results it returns according to their relevance to the query.

| Model Name | Description | Modalities | Context Length | Endpoints |
| --- | --- | --- | --- | --- |
| rerank-v3.5 | A model that allows for re-ranking English-language documents and semi-structured data (JSON). This model has a context length of 4096 tokens. | Text | 4k | Rerank |
| rerank-english-v3.0 | A model that allows for re-ranking English-language documents and semi-structured data (JSON). This model has a context length of 4096 tokens. | Text | 4k | Rerank |
| rerank-multilingual-v3.0 | A model for documents and semi-structured data (JSON) that are not in English. Supports the same languages as embed-multilingual-v3.0. This model has a context length of 4096 tokens. | Text | 4k | Rerank |
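
To show where a reranking step sits in an existing search pipeline, here is a small self-contained sketch. It does not call the Rerank endpoint: the `overlap_score` function is a deliberately crude stand-in (shared-word counting) for the relevance score a Rerank model would return, and all function names are hypothetical. Only the overall shape, scoring each candidate against the query and keeping the top results, is the point.

```python
from typing import Callable, List, Tuple


def overlap_score(query: str, document: str) -> float:
    """Toy relevance score: fraction of query words found in the document.

    This is a placeholder for a model-produced relevance score, not a
    description of how Rerank works internally.
    """
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    doc_words = set(document.lower().split())
    return len(query_words & doc_words) / len(query_words)


def rerank(
    query: str,
    documents: List[str],
    top_n: int,
    score: Callable[[str, str], float] = overlap_score,
) -> List[Tuple[int, float]]:
    """Return (original_index, score) pairs for the top_n best documents."""
    scored = [(i, score(query, doc)) for i, doc in enumerate(documents)]
    # Highest score first; the sort is stable, so ties keep their input order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    # Candidates as an existing search system might return them.
    candidates = [
        "bananas are yellow",
        "paris is lovely in spring",
        "the capital of france is paris",
    ]
    for index, relevance in rerank("paris capital", candidates, top_n=2):
        print(index, relevance, candidates[index])
```

Returning original indices rather than the documents themselves lets the caller map results back to whatever records the first-stage search produced.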

Using Rerank Models on Different Platforms

The table below lists the identifiers to use for Cohere Rerank models on Amazon Bedrock, Amazon SageMaker, Azure AI Studio, and Oracle OCI Generative AI Service.

| Model Name | Amazon Bedrock Model ID | Amazon SageMaker | Azure AI Studio Model ID | Oracle OCI Generative AI Service |
| --- | --- | --- | --- | --- |
| rerank-v3.5 | cohere.rerank-v3-5:0 | Unique per deployment | Not yet available | N/A |
| rerank-english-v3.0 | N/A | Unique per deployment | Not yet available | N/A |
| rerank-multilingual-v3.0 | N/A | Unique per deployment | Not yet available | N/A |

Rerank accepts full strings rather than tokens, so the token limit works a little differently. Rerank will automatically chunk documents longer than 510 tokens, and there is therefore no explicit limit to how long a document can be when using rerank. See our best practice guide for more info about formatting documents for the Rerank endpoint.
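
To get a feel for what automatic chunking at 510 tokens implies for a long document, the sketch below splits text into consecutive pieces of at most 510 "tokens". It rests on two assumptions that the text above does not confirm: whitespace-separated words stand in for the model's real tokens, and chunks are taken back to back with no overlap. The function names are hypothetical, and the service performs its own chunking, so this is only for estimating how many chunks a document might produce.

```python
from typing import List


def chunk_tokens(tokens: List[str], max_tokens: int = 510) -> List[List[str]]:
    """Split a token list into consecutive chunks of at most max_tokens."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    return [tokens[i:i + max_tokens] for i in range(0, len(tokens), max_tokens)]


def chunk_document(text: str, max_tokens: int = 510) -> List[str]:
    """Chunk a document, using whitespace-separated words as stand-in tokens."""
    tokens = text.split()
    return [" ".join(chunk) for chunk in chunk_tokens(tokens, max_tokens)]


if __name__ == "__main__":
    # A synthetic 1200-word document for illustration.
    document = " ".join(f"w{i}" for i in range(1200))
    chunks = chunk_document(document)
    print(len(chunks), [len(chunk.split()) for chunk in chunks])
```

Real tokenizers usually produce more tokens than there are words, so an estimate made this way is likely to undercount the number of chunks.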