Performing Tasks Sequentially
Compare two user queries posed to a RAG chatbot:
- “What was Apple’s revenue in 2023?"
- "What was the revenue of the most valuable company in the US in 2023?”
While the first query is straightforward, the second query requires breaking down into two steps:
- Identify the most valuable company in the US in 2023
- Get the revenue of the company in 2023
These steps need to happen in a sequence rather than all at once, because the information retrieved from the first step is required for the second step.
This is an example of sequential reasoning. In this tutorial, we’ll learn how agentic RAG with Cohere handles sequential reasoning, and in particular:
- Multi-step tool calling
- Multi-step, parallel tool calling
- Self-correction
We’ll learn these by building an agent that answers questions about using Cohere.
Setup
First, we need to install the cohere
library and create a Cohere client.
We also need to import the tool definitions from the tool_def.py
file.
tool_def.py
file in the same directory as this notebook for the imports to work correctly. Note: the source code for tool definitions can be found here
Setting up the tools
We set up the same set of tools as in Part 1, so check that out if you want further details on how to set up the tools.
Running an agentic RAG workflow
We create a run_agent
function to run the agentic RAG workflow, as in Part 1.
Multi-step tool calling
Let’s ask the agent a few questions, starting with this one about a specific feature. The user is asking about two things:
- A feature to reorder search results, and;
- Code examples for that feature;
In this case, the agent first needs to identify what that feature is before it can answer the second part of the question.
This is reflected in the agent’s tool plan, which describes the steps it will take to answer the question.
So, it first calls the search_developer_docs
tool to find the feature.
It then discovers that the feature is Rerank. Using this information, it calls the search_code_examples
tool to find code examples for that feature.
Finally, it uses the retrieved information to answer both parts of the user’s question.
Multi-step, parallel tool calling
In Part 2, we saw how the Cohere API supports tool calling in parallel and now in a sequence. That also means that both scenarios can happen at the same time.
Here’s an examples. Suppose we ask the agent to find the leaders of the top 3 countries with the largest oil reserves.
In the first step, it searches the Internet for information about the 3 countries with the largest oil reserves.
And in the second step, it performs parallel searches for the leaders of the 3 identified countries.
Self-correction
The concept of sequential reasoning is useful in a broader sense, particularly where the agent needs to adapt and change its plan midway in a task.
In other words, it allows the agent to self-correct.
To illustrate this, let’s look at an example. Here, the user is asking about the Cohere safety mode feature.
Given the nature of the question, the agent correctly identifies that it needs to find required information via the search_developer_docs
tool.
However, we know that the tool doesn’t contain this information because we have only added a small sample of documents.
As a result, the agent, having received the documents back without any relevant information, decides to search the internet instead. This is also helped by the fact that we have added specific instructions in the search_internet
tool to search the internet for information not found in the developer documentation.
It finally has the information it needs, and uses it to answer the user’s question.
This highlights another important aspect of agentic RAG, which allows a RAG system to be flexible. This is achieved by powering the retrieval component with an LLM.
On the other hand, a standard RAG system would typically hand-engineer this, and hence, is more rigid.
Summary
In this tutorial, we learned about:
- How multi-step tool calling works
- How multi-step, parallel tool calling works
- How multi-step tool calling enables an agent to self-correct, and hence, be more flexible
However, up until now, we have only worked with purely unstructured data, the type of data we typically encounter in a standard RAG system.
In the coming chapters, we’ll add another layer of complexity to the agentic RAG system – working with semi-structured and structured data. This adds another dimension to the agent’s flexibility, which is dealing with a more diverse set of data sources.
In Part 4, we’ll learn how to build an agent that can perform faceted queries over semi-structured data.