Text Generation
In this tutorial, we’ll explore text generation using Cohere’s Command model on Azure AI Foundry.
Text generation is a fundamental capability that enables LLM-based systems to generate text for various applications, such as providing detailed responses to questions, helping with writing and editing tasks, creating conversational responses, and assisting with code generation and documentation.
In this tutorial, we’ll cover:
- Setting up the Cohere client
- Basic text generation
- Other typical use cases
- Building a chatbot
We’ll use Cohere’s Command model deployed on Azure to demonstrate these capabilities and help you understand how to effectively use text generation in your applications.
Setup
First, you will need to deploy the Command model on Azure via Azure AI Foundry. The deployment will create a serverless API with pay-as-you-go, token-based billing. You can find more information on how to deploy models in the Azure documentation.
In this tutorial, we use the Command R+ (August 2024) model.
Once the model is deployed, you can access it via Cohere’s Python SDK. Let’s now install the Cohere SDK and set up our client.
To create a client, you need to provide the API key and the model’s base URL for the Azure endpoint. You can find this information in the Azure AI Foundry platform where you deployed the model.
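Here is a minimal sketch of the setup. The key and endpoint values below are placeholders; copy the real values from your deployment’s details page in Azure AI Foundry.

```python
# ! pip install -U cohere
import cohere

co = cohere.Client(
    api_key="AZURE_API_KEY",  # placeholder: your serverless API key from Azure AI Foundry
    base_url="AZURE_ENDPOINT",  # placeholder, e.g. "https://<your-deployment>.<region>.models.ai.azure.com/v1/"
)
```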
Creating some contextual information
Before we begin, let’s create some context to use in our text generation tasks. In this example, we’ll use a set of technical support frequently asked questions (FAQs) as our context.
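The FAQ content below is invented for illustration; any domain-specific reference text would work the same way.

```python
# Hypothetical FAQs for a mobile network provider, used as grounding context
faq_tech_support = """- Question: How do I set up my new smartphone with my mobile plan?
- Answer: Insert your SIM card, power on the phone, and follow the on-screen setup. Your plan activates automatically.

- Question: Why is my mobile data slower than usual?
- Answer: Peak-hour congestion, your location, or exceeding your high-speed data allowance can all reduce speeds.

- Question: How do I check my data usage?
- Answer: Log in to your account portal or the mobile app to see your current usage and remaining allowance."""
```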
Helper function to generate text
Now, let’s define a function to generate text using the Command R+ model on Azure. We’ll use this function several times throughout the tutorial.
This function takes a user message and generates the response via the chat endpoint. Note that we don’t need to specify the model as we have already set it up in the client.
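A minimal version of the helper might look like this (the name generate_text is how it is referenced throughout this tutorial):

```python
def generate_text(message):
    # The deployment's model is bound to the client via the Azure endpoint,
    # so the Chat call only needs the user message.
    response = co.chat(message=message)
    return response
```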
Text generation
Let’s explore basic text generation as our first use case. The model takes a prompt as input and produces a relevant response as output.
Consider a scenario where a customer support agent uses an LLM to help draft responses to customer inquiries. The agent provides technical support FAQs as context along with the customer’s question. The prompt is structured to include three components: the instruction, the context (FAQs), and the specific customer inquiry.
After passing this prompt to our generate_text function, we receive a response object. The actual generated text can be accessed through the response.text attribute.
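Putting this together, here is a sketch of the prompt and the call, with an invented customer inquiry:

```python
inquiry = "I've been having intermittent mobile data issues since last week, even though I'm on an unlimited plan. What can I do?"

prompt = f"""Use the FAQs below to help draft a response to this customer inquiry.

# Customer inquiry
{inquiry}

# FAQs
{faq_tech_support}"""

response = generate_text(prompt)
print(response.text)
```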
Text summarization
Another common use case is text summarization. Now, let’s summarize the customer inquiry into a single sentence. We add a summarization instruction and then append the inquiry to the prompt.
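As a sketch, reusing the inquiry from the previous example:

```python
prompt = f"""Summarize this customer inquiry into one short sentence.

Inquiry: {inquiry}"""

response = generate_text(prompt)
print(response.text)
```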
Text rewriting
Text rewriting is a powerful capability that allows us to adapt content for different purposes while preserving the core message. This involves transforming the style, tone, or format of text to better suit the target audience or medium.
Let’s look at an example where we convert a customer support chat response into a formal email. We’ll construct the prompt by first stating our goal to rewrite the text, then providing the original chat response as context.
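Here is a sketch, using an invented chat-style draft as the text to rewrite:

```python
# Hypothetical chat-style reply from a support agent
draft_chat_response = (
    "Hey! Sorry to hear about the data issues. Try restarting your phone and "
    "toggling airplane mode. If it's still slow, check your usage in the app, "
    "you may have hit your high-speed cap. Let me know how it goes!"
)

prompt = f"""Rewrite the text below into a formal email from a customer support agent to the customer.

# Text
{draft_chat_response}"""

response = generate_text(prompt)
print(response.text)
```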
Build a Chatbot
While our previous examples were single-turn interactions, the Chat endpoint enables us to create chatbots that maintain memory of past conversation turns. This capability allows developers to build conversational applications that preserve context throughout the dialogue.
Below, we implement a basic customer support chatbot that acts as a helpful service agent. We’ll create a function called run_chatbot that handles the conversation flow and displays the messages (sketched after the parameter list below). The function can take an optional chat history parameter to maintain conversational context across multiple turns.
For this, we introduce a couple of additional parameters to the Chat endpoint:
- preamble: A preamble contains instructions to help steer a chatbot’s response toward specific characteristics, such as a persona, style, or format. Here, we are using a simple preamble: “You are a helpful customer support agent that assists customers of a mobile network service.”
- chat_history: We store the history of a conversation between a user and the chatbot as a list, append every new conversation turn, and pass this information to the next endpoint call.
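Here is a minimal sketch of run_chatbot. It returns the updated conversation via the response’s chat_history field, which includes the latest turn, so the caller can pass it into the next call.

```python
def run_chatbot(message, chat_history=None):
    if chat_history is None:
        chat_history = []

    response = co.chat(
        message=message,
        preamble="You are a helpful customer support agent that assists customers of a mobile network service.",
        chat_history=chat_history,
    )

    print(response.text)

    # The response carries the full conversation, including this latest turn
    return response.chat_history
```

A short exchange then looks like:

```python
chat_history = run_chatbot("Hi, my mobile data has been really slow lately.")
chat_history = run_chatbot("It started after I switched to the new plan.", chat_history)
```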
View the chat history
Here’s what is contained in the chat history after a few turns.
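One way to inspect it:

```python
for turn in chat_history:
    print(turn, "\n")
```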
Summary
In this tutorial, we learned about:
- How to set up the Cohere client to use the Command model deployed on Azure AI Foundry
- How to perform basic text generation
- How to use the model for other types of use cases
- How to build a chatbot using the Chat endpoint
In the next tutorial, we’ll explore how to use the Embed model in semantic search applications.