In this document, you learn how to use Azure AI Foundry to deploy the Cohere Command, Embed, and Rerank models on Microsoft’s Azure cloud computing platform. You can read more about Azure AI Foundry in its documentationhere.
The following models are available through Azure AI Foundry with pay-as-you-go, token-based billing:
Whether you’re using Command, Embed, or Rerank, the initial set up is the same. You’ll need:
East US, East US 2, North Central US, South Central US, Sweden Central, West US or West US 3 regions.For workflows based around Command, Embed, or Rerank, you’ll also need to create a deployment and consume the model. Here are links for more information:
We expose two routes for Command R and Command R+ inference:
v1/chat/completions adheres to the Azure AI Generative Messages API schema; v1/chat supports Cohere’s native API schema.You can find more information about Azure’s API here.
Here’s a code snippet demonstrating how to programmatically interact with a Cohere model on Azure:
You can find more code snippets, including examples of how to stream responses, in this notebook.
Though this section is called “Text Generation”, it’s worth pointing out that these models are capable of much more. Specifically, you can use Azure-hosted Cohere models for both retrieval augmented generation and multi-step tool use. Check the linked pages for much more information.
Finally, we released refreshed versions of Command R and Command R+ in August 2024, both of which are now available on Azure. Check these Microsoft docs for more information (select the Cohere Command R 08-2024 or Cohere Command R+ 08-2024 tabs).
We expose two routes for Embed v4 and Embed v3 inference:
v1/embeddings adheres to the Azure AI Generative Messages API schema; v1/embed supports Cohere’s native API schema.You can find more information about Azure’s API here.
We currently exposes the v1/rerank endpoint for inference with Rerank v4.0 Pro, Rerank v4.0 Fast, Rerank v3.5, Rerank v3 English, and Rerank 3 Multilingual. For more information on using the APIs, see the reference section.
You can use the Cohere SDK client to consume Cohere models that are deployed via Azure AI Foundry. This means you can leverage the SDK’s features such as RAG, tool use, structured outputs, and more.
The following are a few examples on how to use the SDK for the different models.
Here are some other examples for Command and Embed.
The important thing to understand is that our new and existing customers can call the models from Azure while still leveraging their integration with the Cohere SDK.