Command models get an August refresh

Today we’re announcing updates to our flagship generative AI model series: Command R and Command R+. These models deliver improved performance across multilingual retrieval-augmented generation (RAG), tool use, math, code, and reasoning, along with higher throughput and lower latency.

The latest model versions are designated with timestamps, as follows:

  • The updated Command R is command-r-08-2024 on the API.
  • The updated Command R+ is command-r-plus-08-2024 on the API (see the sketch below for calling either model by name).
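
Either model can be selected by name on the API. The following is a minimal sketch using the Cohere Python SDK’s Chat endpoint; the API key and prompt are placeholders.

```python
import cohere

# Minimal sketch: call one of the refreshed models by name via the Chat endpoint.
# The API key and prompt below are placeholders.
co = cohere.Client("YOUR_API_KEY")

response = co.chat(
    model="command-r-08-2024",  # or "command-r-plus-08-2024"
    message="Summarize the August model refresh in one sentence.",
)
print(response.text)
```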

In the rest of these release notes, we’ll provide more details about technical enhancements, new features, and new pricing.

Technical Details

command-r-08-2024 shows improved performance for multilingual retrieval-augmented generation (RAG) and tool use. More broadly, command-r-08-2024 is better at math, code and reasoning and is competitive with the previous version of the larger Command R+ model.

command-r-08-2024 delivers around 50% higher throughput and 20% lower latency than the previous Command R version, while halving the hardware footprint required to serve the model. Similarly, command-r-plus-08-2024 delivers roughly 50% higher throughput and 25% lower latency than the previous Command R+ version, with the same hardware footprint.

Both models include the following feature improvements:

  • Improved tool use: command-r-08-2024 and command-r-plus-08-2024 make better decisions about which tool to use in which context, and whether or not to use a tool at all.
  • Improved instruction following in the preamble.
  • Improved multilingual RAG: searches are run in the user’s language, with better-grounded responses (see the sketch after this list).
  • Better analysis and manipulation of structured data.
  • Better creation of structured data from unstructured natural-language instructions.
  • Improved robustness to non-semantic prompt changes such as white space or new lines.
  • Both models decline unanswerable questions.
  • Improved citation quality, and citations can now be turned off for RAG workflows.
  • For command-r-08-2024, meaningful improvements in length and formatting control.
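
To illustrate the RAG improvements, here is a sketch of a grounded request against command-r-08-2024. It assumes the Python SDK’s Chat endpoint with the documents parameter; the documents and query below are illustrative placeholders.

```python
import cohere

co = cohere.Client("YOUR_API_KEY")

# Sketch of a grounded (RAG) request: the model answers in the user's language
# and returns citations tying the answer back to the supplied documents.
response = co.chat(
    model="command-r-08-2024",
    message="¿Cuál es la política de devoluciones?",  # user query, here in Spanish
    documents=[
        {"title": "Returns policy", "snippet": "Items may be returned within 30 days of delivery."},
        {"title": "Refunds", "snippet": "Refunds are issued to the original payment method."},
    ],
)

print(response.text)       # grounded answer
print(response.citations)  # citation spans referencing the documents above
```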

New Feature: Safety Modes

The primary new feature available in both command-r-08-2024 and command-r-plus-08-2024 is Safety Modes (in beta). For our enterprise customers building with our models, what is considered safe depends on their use case and the context the model is deployed in. To support diverse enterprise applications, we have developed Safety Modes, acknowledging that safety and appropriateness are context-dependent, and that predictability and control are critical to building confidence in Cohere models.

Safety guardrails have traditionally been reactive and binary, and we’ve observed that users often have difficulty defining what safe usage means to them for their use case. Safety Modes introduce a nuanced approach that is context sensitive.

(Note: Command R/R+ have built-in protections against core harms, such as content that endangers child safety. These types of harm are always blocked and cannot be adjusted.)

Safety Modes are activated through the safety_mode parameter, which (currently) accepts one of two modes:

  • "STRICT": Encourages avoidance of all sensitive topics. Strict content guardrails provide an extra safe experience by prohibiting inappropriate responses or recommendations. Ideal for general and enterprise use.
  • "CONTEXTUAL": (enabled by default): For wide-ranging interactions with fewer constraints on output while maintaining core protections. The model responds as instructed while still rejecting harmful or illegal suggestions. Well-suited for entertainment, creative, educational use.

You can also opt out of the Safety Modes beta by setting safety_mode="NONE". For more information, check out our dedicated guide to Safety Modes.
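
As a rough sketch of how this looks in practice, assuming the Python SDK exposes safety_mode as a parameter on the Chat endpoint:

```python
import cohere

co = cohere.Client("YOUR_API_KEY")

# Sketch: request with an explicit Safety Mode. "CONTEXTUAL" is the default;
# "STRICT" applies stricter guardrails, and "NONE" opts out of the beta.
response = co.chat(
    model="command-r-plus-08-2024",
    message="Draft a villain's monologue for a young-adult novel.",
    safety_mode="CONTEXTUAL",
)
print(response.text)
```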

Pricing

Here’s a breakdown of the pricing structure for the new models (a worked cost example follows the list):

  • For command-r-plus-08-2024, input tokens are priced at $2.50 per million tokens and output tokens at $10.00 per million tokens.
  • For command-r-08-2024, input tokens are priced at $0.15 per million tokens and output tokens at $0.60 per million tokens.
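
As a worked example of what these rates mean per request, here is a small cost estimator based on the prices listed above; the token counts are illustrative, and in practice they come from the API’s usage metadata.

```python
# Per-million-token prices listed above (USD).
PRICES_PER_MILLION = {
    "command-r-plus-08-2024": {"input": 2.50, "output": 10.00},
    "command-r-08-2024": {"input": 0.15, "output": 0.60},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of a single request for the given token counts."""
    p = PRICES_PER_MILLION[model]
    return input_tokens / 1_000_000 * p["input"] + output_tokens / 1_000_000 * p["output"]

# Example: a 2,000-token prompt with a 500-token completion on command-r-08-2024
# costs 2,000/1M * $0.15 + 500/1M * $0.60 = $0.0003 + $0.0003 = $0.0006.
print(f"${estimate_cost('command-r-08-2024', 2_000, 500):.4f}")
```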