Reasoning Capabilities
Reasoning models represent an advanced approach to AI that enables more sophisticated problem-solving capabilities. Cohere’s reasoning models are hybrid, meaning reasoning can be enabled (in which case they generate internal reasoning processes before delivering their final responses) or disabled (in which case they function the way any other LLM would).
How Reasoning Models Work
When a reasoning model processes a request, it first works internally to break the problem down step-by-step. This reasoning process happens in dedicated “thinking” content blocks where the model works through its analysis, planning, and logical steps. Only after completing this internal reasoning does the model produce its final text response, and this allows them to tackle complex tasks with deeper analysis.
The key benefit is that reasoning models can handle complex problems—such as leveraging tools and agentic problem solving in the 23 supported languages—by first working through the problem internally before presenting a well-reasoned solution. This approach leads to more accurate and thorough responses, while pushing the boundary for the complexity of problems the model is able to solve.
Getting Started
Models with Reasoning capabilities are accessible via the Chat API. Here’s an example:
Enabling / Disabling Reasoning Capabilities
For reasoning models, thinking
is enabled by default. To disable it, send the following value to the "thinking"
parameter:
Thinking Budgets
A thinking token budget can also be specified, to set an upper limit on how many thinking tokens the model can produce. Our recommendation is to use unlimited thinking (i.e. reasoning = on
). However, if you plan to use thinking budgets, please make sure to leave at least 1K tokens for the response. For example, if you want the model to reason until the maximum limit, we recommend 31K as the token budget.
When the budget is exceeded, the model will immediately proceed with the final response.
Use Cases and Applications
Reasoning models excel at tasks that benefit from step-by-step analysis, including:
- Agentic Use Cases: Taking autonomous actions and interacting with the environment to solve problems.
- Tool Use: Able to leverage a variety of tools, such as search engines and APIs.
- Multilingual: Able to reason over multilingual inputs, providing support to user queries in 23 different languages.
Technical Implementation
The reasoning process is controlled through specific parameters that allow developers to:
- Enable or disable reasoning capabilities
- Set token budgets to control the depth of reasoning
- Stream responses to see reasoning and final answers in real-time
This architecture makes reasoning models particularly valuable for applications requiring high accuracy, transparency in reasoning, and the ability to handle complex, multi-faceted problems that benefit from systematic analysis.