Building a Chatbot with Cohere
As its name implies, the Chat endpoint enables developers to build chatbots that can handle conversations. At the core of a conversation is a multi-turn dialog between the user and the chatbot. This requires the chatbot to keep a memory of all the previous turns so that it can maintain the state of the conversation.
In this tutorial, you’ll learn about:
- Creating a custom preamble
- Creating a single-turn conversation
- Building the conversation memory
- Running a multi-turn conversation
- Viewing the chat history
You’ll learn these by building an onboarding assistant for new hires.
Setup
To get started, first we need to install the cohere library and create a Cohere client.
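A minimal sketch of the setup is shown below; the API key placeholder is an assumption, and you would substitute your own key.

```python
# Install the SDK once in your environment:
# pip install cohere

import cohere

# Replace the placeholder with your own API key
co = cohere.Client(api_key="YOUR_COHERE_API_KEY")
```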
Creating a custom preamble
A conversation starts with a system message, or a preamble, to help steer a chatbot’s response toward certain characteristics.
For example, if we want the chatbot to adopt a formal style, the preamble can be used to encourage the generation of more business-like and professional responses.
The recommended approach is to use two H2 Markdown headers, “Task and Context” and “Style Guide”, in that exact order.
In the example below, the preamble provides context for the assistant’s task (task and context) and encourages the generation of rhymes as much as possible (style guide).
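A rough sketch of such a preamble is shown below. The startup scenario, the user message, and the model name "command-r-plus" are illustrative assumptions rather than values prescribed by the endpoint.

```python
# Preamble using the two recommended H2 headers, in that order
preamble = """## Task and Context
You are an assistant who helps new employees of a startup settle in during their first week.

## Style Guide
Try to speak in rhymes as much as possible. Be professional."""

response = co.chat(
    message="I'm joining a new startup next week. Could you help me write a short introduction message to my teammates?",
    preamble=preamble,
    model="command-r-plus",  # assumed model name; any Chat-capable model works
)
print(response.text)
```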
Creating a single-turn conversation
Let’s start with a single-turn conversation, which doesn’t require the chatbot to maintain any conversation state.
Here, we also add a custom preamble for generating concise responses, just to keep the outputs brief for this tutorial.
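A sketch of a single-turn call might look like the following; the concise-style preamble and the user message are illustrative assumptions.

```python
# A custom preamble that keeps the outputs brief for this tutorial
preamble = """## Task and Context
You are an onboarding assistant that helps new hires with their first week.

## Style Guide
Keep your answers short and concise."""

response = co.chat(
    message="I'm joining a new startup next week. Could you help me write a short introduction message to my teammates?",
    preamble=preamble,
)
print(response.text)
```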
Building the conversation memory
Now, we want the model to refine the earlier response. This requires the next generation to have access to the state, or memory, of the conversation.
To do this, we add the chat_history argument, which takes the current chat history as its value. You can get the current chat history from the response.chat_history object of the previous response.
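A sketch of the follow-up turn, assuming the response object from the single-turn example above and an illustrative refinement request:

```python
# Pass the previous turn's history so the model knows what "it" refers to
response_refined = co.chat(
    message="Make it more upbeat and add a touch of humour.",
    preamble=preamble,
    chat_history=response.chat_history,
)
print(response_refined.text)
```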
Looking at the response, we see that the model gets the context from the chat history: it captures that “it” in the user message refers to the introduction message it had generated earlier.
Running a multi-turn conversation
You can continue doing this for any number of turns by passing the most recent response.chat_history value, which contains the conversation history from the beginning.
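For example, a third turn might look like this (the message is illustrative); each call passes the chat_history of the most recent response:

```python
# Keep the conversation going by always passing the latest chat history back in
response_next = co.chat(
    message="Thanks! Could you also suggest a subject line for the message?",
    preamble=preamble,
    chat_history=response_refined.chat_history,  # full history since the first turn
)
print(response_next.text)
```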
Viewing the chat history
To look at the current chat history, you can print the response.chat_history object, which contains a list of USER and CHATBOT turns in the same sequence as they were created.
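For instance, assuming the response_next object from the previous step, you could print each turn like this:

```python
# Inspect the accumulated USER and CHATBOT turns
for turn in response_next.chat_history:
    print(f"{turn.role}: {turn.message}\n")
```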
Conclusion
In this tutorial, you learned about:
- How to create a custom preamble
- How to create a single-turn conversation
- How to build the conversation memory
- How to run a multi-turn conversation
- How to view the chat history
You will use the same method for running a multi-turn conversation when you learn about other use cases such as RAG (Part 6) and tool use (Part 7).
But to fully leverage these other capabilities, you will need another type of language model that generates text representations, or embeddings.
In Part 4, you will learn how text embeddings can power an important use case for RAG, which is semantic search.