For both trial keys and production keys, Command A Vision is free until rate limits are reached. Learn more about rate limits for different models and key types here.
To use Command A Vision in production, please reach out to sales at sales@cohere.com.
Command A Vision is Cohere’s first multimodal model capable of understanding and interpreting visual data alongside text. With a 128K context length and support for up to 20 images per request, Command Vision excels at enterprise use cases including document analysis, chart interpretation, optical character recognition (OCR), and processing images featuring multiple languages. The model maintains the same API interface as other Command models, making it easy to integrate vision capabilities into existing applications.
Command A Vision is excellent in enterprise use cases such as:
Be aware that tool use isn’t supported with this model.
Also, it’s important to mention that Command A Vision can accept images as input, but doesn’t generate them.
For more detailed breakdowns of these and other applications, check out our cookbooks. To learn more about how token counts work, the maximum number of images, and so on, check out our dedicated Image Inputs document.