Building a Chatbot with Cohere
As its name implies, the Chat endpoint enables developers to build chatbots that can handle conversations. At the core of a conversation is a multi-turn dialog between the user and the chatbot. This requires the chatbot to have the state (or “memory”) of all the previous turns to maintain the state of the conversation.
In this tutorial, you’ll learn about:
- Creating a custom preamble
- Creating a single-turn conversation
- Building the conversation memory
- Running a multi-turn conversation
- Viewing the chat history
You’ll learn these by building an onboarding assistant for new hires.
Setup
To get started, first we need to install the cohere
library and create a Cohere client.
Creating a custom preamble
A conversation starts with a system message, or a preamble, to help steer a chatbot’s response toward certain characteristics.
For example, if we want the chatbot to adopt a formal style, the preamble can be used to encourage the generation of more business-like and professional responses.
The recommended approach is to use two H2 Markdown headers: “Task and Context” and “Style Guide” in the exact order.
In the example below, the preamble provides context for the assistant’s task (task and context) and encourages the generation of rhymes as much as possible (style guide).
Further reading:
Starting the first conversation turn
Let’s start with the first conversation turn.
Here, we are also adding a custom preamble or system message for generating a concise response, just to keep the outputs brief for this tutorial.
Building the conversation memory
Now, we want the model to refine the earlier response. This requires the next generation to have access to the state, or memory, of the conversation.
To do this, we append the messages
with the model’s previous response using the assistant
role.
Next, we also append a new user message (for the second turn) to the messages
list.
Looking at the response, we see that the model is able to get the context from the chat history. The model is able to capture that “it” in the user message refers to the introduction message it had generated earlier.
Further reading:
Running a multi-turn conversation
You can continue doing this for any number of turns by continuing to append the chatbot’s response and the new user message to the messages
list.
Viewing the chat history
To look at the current chat history, you can print the messages
list, which contains a list of user
and assistant
turns in the same sequence as they were created.
Conclusion
In this tutorial, you learned about:
- How to create a custom preamble
- How to create a single-turn conversation
- How to build the conversation memory
- How to run a multi-turn conversation
- How to view the chat history
You will use the same method for running a multi-turn conversation when you learn about other use cases such as RAG (Part 6) and tool use (Part 7).
But to fully leverage these other capabilities, you will need another type of language model that generates text representations, or embeddings.
In Part 4, you will learn how text embeddings can power an important use case for RAG, which is semantic search.