Announcing Cohere's Command A Vision Model
We're excited to announce the release of Command A Vision, Cohere's first commercial model capable of understanding and interpreting visual data alongside text. This addition to our Command family brings enterprise-grade vision capabilities to your applications with the same familiar Command API interface.
Key Features
Multimodal Capabilities
- Text + Image Processing: Combine text prompts with image inputs
- Enterprise-Focused Use Cases: Optimized for business applications like document analysis, chart interpretation, and OCR
- Multiple Languages: Officially supports English, Portuguese, Italian, French, German, and Spanish
Technical Specifications
- Model Name: command-a-vision-07-2025
- Context Length: 128K tokens
- Maximum Output: 8K tokens
- Image Support: Up to 20 images per request (or 20MB total)
- API Endpoint: Chat API
What You Can Do
Command A Vision excels in enterprise use cases including:
- Chart & Graph Analysis: Extract insights from complex visualizations
- Table Understanding: Parse and interpret data tables within images
- Document OCR: Optical character recognition with natural language processing
- Image Processing for Multiple Languages: Handle text in images across multiple languages
- Scene Analysis: Identify and describe objects within images
Getting Started
The API structure is identical to our existing Command models, making integration straightforward.
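Here's a minimal, illustrative Python sketch (it assumes the v2 Python SDK and a base64-encoded local image; see the Chat API and Image Inputs docs for the authoritative request shape):

```python
import base64

import cohere

# Illustrative sketch: send a text prompt plus a local image to Command A Vision.
# "chart.png" is a placeholder file.
co = cohere.ClientV2()  # reads the CO_API_KEY environment variable

with open("chart.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = co.chat(
    model="command-a-vision-07-2025",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What trend does this chart show?"},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{image_b64}"},
                },
            ],
        }
    ],
)
print(response.message.content[0].text)
```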
There's much more to be said about working with images, various limitations, and best practices, which you can find in our dedicated Command A Vision and Image Inputs documents.
Announcing Cutting-Edge Cohere Models on OCI
We are thrilled to announce that the Oracle Cloud Infrastructure (OCI) Generative AI service now supports Cohere Command A, Rerank v3.5, and Embed v3.0 (multimodal). This marks a major advancement in providing OCI's customers with enterprise-ready AI solutions.
Command A 03-2025 is the most performant Command model to date, delivering 150% higher throughput than its predecessor while running on only two GPUs.
Embed v3.0 is a cutting-edge AI search model enhanced with multimodal capabilities, allowing it to generate embeddings from both text and images.
Rerank 3.5, Cohere's newest AI search foundation model, is engineered to improve the precision of enterprise search and retrieval-augmented generation (RAG) systems across a wide range of data formats (such as lengthy documents, emails, tables, JSON, and code) and in over 100 languages.
Check out Oracle's announcement and documentation for more details.
Announcing Embed Multimodal v4
We're thrilled to announce the release of Embed v4, the most recent entrant into the Embed family of enterprise-focused large language models (LLMs).
Embed v4 is Cohere's most performant search model to date, and it supports the following new features:
- Matryoshka Embeddings in the following dimensions: [256, 512, 1024, 1536]
- Unified Embeddings produced from mixed-modality input (i.e., a single payload of images and text; see the sketch below)
- Context length of 128K tokens
Embed v4 achieves state-of-the-art performance in the following areas:
- Text-to-text retrieval
- Text-to-image retrieval
- Text-to-mixed-modality retrieval (e.g., from PDFs)
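As a hedged sketch of what a unified mixed-modality embedding request looks like, here's a Python example; the field names follow the v2 Embed API as we understand it, so consult the Embed docs for the authoritative shape:

```python
import base64

import cohere

# Hedged sketch: embed a single mixed-modality input (text + image) at one of
# the Matryoshka dimensions listed above.
co = cohere.ClientV2()

with open("chart.png", "rb") as f:  # placeholder image file
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = co.embed(
    model="embed-v4.0",
    input_type="search_document",
    embedding_types=["float"],
    output_dimension=1024,  # one of [256, 512, 1024, 1536]
    inputs=[
        {
            "content": [
                {"type": "text", "text": "Quarterly revenue chart"},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{image_b64}"},
                },
            ]
        }
    ],
)
print(len(response.embeddings.float_[0]))  # 1024
```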
Embed v4 is available today on the Cohere Platform, AWS SageMaker, and Azure AI Foundry. For more information, check out our dedicated blog post here.
Announcing Command A
We're thrilled to announce the release of Command A, the most recent entrant into the Command family of enterprise-focused large language models (LLMs).
Command A is Cohere's most performant model to date, excelling at real-world enterprise tasks including tool use, retrieval-augmented generation (RAG), agents, and multilingual use cases. With 111B parameters and a context length of 256K, Command A boasts a considerable increase in inference-time efficiency, delivering 150% higher throughput than its predecessor Command R+ 08-2024, and requires only two GPUs (A100s / H100s) to run.
Command A is available today on the Cohere Platform, on Hugging Face, or through the SDK as command-a-03-2025. For more information, check out our dedicated blog post.
Our Groundbreaking Multimodal Model, Aya Vision, is Here!
Today, Cohere Labs, Cohere's research arm, is proud to announce Aya Vision, a state-of-the-art multimodal large language model excelling across multiple languages and modalities. Aya Vision outperforms the leading open-weight models in critical benchmarks for language, text, and image capabilities.
Built as a foundation for multilingual and multimodal communication, this groundbreaking AI model supports tasks such as image captioning, visual question answering, text generation, and translating both text and images into coherent text.
Find more information about Aya Vision here.
Cohere Releases Arabic-Optimized Command Model!
Cohere is thrilled to announce the release of Command R7B Arabic (c4ai-command-r7b-12-2024). This is an open-weights release of an advanced, 8-billion-parameter custom model optimized for the Arabic language (MSA dialect), in addition to English. As with Cohere's other Command models, this one comes with a context length of 128,000 tokens; it excels at a number of critical enterprise tasks, including instruction following, length control, retrieval-augmented generation (RAG), and minimizing code-switching, and it demonstrates excellent general-purpose knowledge and understanding of the Arabic language and culture.
Try Command R7B Arabic
If you want to try Command R7B Arabic, it's very easy: you can use it through the Cohere playground or in our dedicated Hugging Face Space.
Alternatively, you can use the model in your own code. To do that, first install the transformers library from its source repository.
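A standard source install looks like this (the model card may pin a specific revision or list additional requirements):

```bash
# Install transformers from its source repository
pip install git+https://github.com/huggingface/transformers.git
```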
Then, you can run a simple text-generation task with the model in Python.
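The snippet below is an illustrative sketch; the Hugging Face model ID is a placeholder, so check the model card for the exact repo name:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model ID -- check the Hugging Face model card for the exact repo name.
model_id = "CohereForAI/c4ai-command-r7b-arabic-02-2025"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Format a single-turn conversation with the model's chat template.
messages = [{"role": "user", "content": "مرحبا، كيف حالك؟"}]
input_ids = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

gen_tokens = model.generate(
    input_ids, max_new_tokens=100, do_sample=True, temperature=0.3
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(gen_tokens[0][input_ids.shape[-1]:], skip_special_tokens=True))
```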
Chat Capabilities
Command R7B Arabic can be operated in two modes, "conversational" and "instruct":
- Conversational mode conditions the model on interactive behaviour, meaning it is expected to reply in a conversational fashion, provide introductory statements and follow-up questions, and use Markdown as well as LaTeX where appropriate. This mode is optimized for interactive experiences, such as chatbots, where the model engages in dialogue.
- Instruct mode conditions the model to provide concise yet comprehensive responses, and to not use Markdown or LaTeX by default. This mode is designed for non-interactive, task-focused use cases such as extracting information, summarizing text, translation, and categorization.
Multilingual RAG Capabilities
Command R7B Arabic has been trained specifically for Arabic and English tasks, such as the generation step of retrieval-augmented generation (RAG).
Command R7B Arabic's RAG functionality is supported through chat templates in Transformers. Using our RAG chat template, the model takes a conversation (with an optional user-supplied system preamble) and a list of document snippets as input. The resulting output contains a response with in-line citations.
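Here's an illustrative sketch of what that looks like; it reuses the tokenizer from the snippet above and assumes a dedicated "rag" chat template, which recent transformers releases support via a documents argument:

```python
# Reuses `tokenizer` and `model` from the previous snippet.
conversation = [{"role": "user", "content": "ما هي عاصمة كندا؟"}]

# Document snippets as key-value pairs: short descriptive keys, text values.
documents = [
    {"title": "Canada facts", "text": "Ottawa is the capital city of Canada."},
    {"title": "Geography", "text": "Canada is the second-largest country by total area."},
]

# Render the RAG chat template over the conversation plus documents.
input_ids = tokenizer.apply_chat_template(
    conversation,
    documents=documents,
    chat_template="rag",
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)
```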
You can then generate text from this input as normal.
Notes on Usage
We recommend document snippets be short chunks (around 100-400 words per chunk) instead of long documents. They should also be formatted as key-value pairs, where the keys are short descriptive strings and the values are either text or semi-structured data.
You may find that simply including relevant documents directly in a user message works as well as or better than using the documents parameter to render the special RAG template (though the template is a strong default for those wanting citations). We encourage users to experiment with both approaches and to evaluate which mode works best for their specific use case.
Cohere via OpenAI SDK Using Compatibility API
Today, we are releasing our Compatibility API, enabling developers to seamlessly use Cohere's models via OpenAI's SDK.
This API enables you to switch your existing OpenAI-based applications to use Cohereโs models without major refactoring.
It includes comprehensive support for chat completions, such as function calling and structured outputs, as well as support for text embeddings generation.
Check out our documentation on how to get started with the Compatibility API, with examples in Python, TypeScript, and cURL.
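As a quick illustration, here's a hedged Python sketch that points the OpenAI SDK at Cohere's compatibility endpoint; verify the base URL against the Compatibility API documentation before relying on it:

```python
from openai import OpenAI

# Point the OpenAI SDK at Cohere's Compatibility API.
# Use your Cohere API key here, not an OpenAI key.
client = OpenAI(
    api_key="<COHERE_API_KEY>",
    base_url="https://api.cohere.ai/compatibility/v1",
)

response = client.chat.completions.create(
    model="command-a-03-2025",
    messages=[{"role": "user", "content": "Summarize what a reranker does in one sentence."}],
)
print(response.choices[0].message.content)
```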
Cohere's Rerank v3.5 Model is on Azure AI Foundry!
In December 2024, Cohere released the Rerank v3.5 model. It demonstrates state-of-the-art performance on multilingual retrieval, reasoning, and tasks in domains as varied as finance, eCommerce, hospitality, project management, and email/messaging retrieval.
This model has been available through the Cohere API, but today we're pleased to announce that it can also be used through Microsoft Azure's AI Foundry!
You can find more information about using Cohere's rerank models on AI Foundry here.
Deprecation of Classify via default Embed Models
Effective January 31st, 2025, we are deprecating the use of default Embed models with the Classify endpoint.
This deprecation does not affect usage of the Classify endpoint with fine-tuned Embed models. Fine-tuned models continue to be fully supported and are recommended for achieving optimal classification performance.
For guidance on implementing Classify with fine-tuned models, please refer to our Classify fine-tuning documentation.
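For reference, here's a minimal sketch of calling Classify with a fine-tuned model through the Python SDK (the model ID is a placeholder; response field names may vary slightly by SDK version):

```python
import cohere

co = cohere.Client()  # Classify is served by the v1 client

# "my-classifier-ft" is a placeholder for your fine-tuned model's ID.
response = co.classify(
    model="my-classifier-ft",
    inputs=["I love this product!", "This was a terrible experience."],
)
for c in response.classifications:
    print(c.input, "->", c.prediction)
```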