Agentic Multi-Stage RAG with Cohere Tools API
Motivation
Retrieval-augmented generation (RAG) has been a go-to use case for enterprises adopting large language models (LLMs). Although it works well in general, there are edge cases where it fails. Most commonly, when the retrieved document mentions the query but actually refers to another document for the answer, the model will fail to generate the correct answer.
We propose an agentic RAG system that leverages tool use to keep retrieving documents if the correct ones were not found on the first try. This is ideal for use cases where accuracy is a top priority and latency is not. For example, lawyers searching their contracts for the most accurate answer are willing to wait a few extra seconds rather than get a wrong answer fast.
Objective
In this notebook, we will explore how to build a simple agentic RAG system using Cohere's native API. We have prepared a fake dataset to demonstrate the use case, and we ask two questions that require different depths of retrieval. We will compare how simple and agentic RAG answer each question.
Disclaimer
One of the challenges in building a RAG system is that it has many moving pieces: vector database, type of embedding model, use of a reranker, number of retrieved documents, chunking strategy, and more. These components can make debugging and evaluating RAG systems difficult. Since this notebook focuses on the concept of agentic RAG, it simplifies the other parts of the RAG system. For example, we retrieve only the top document, to demonstrate what happens when the retrieved document does not contain the answer needed.
Result
As you will see below, multi-stage retrieval is achieved by adding a new function, reference_extractor(), that extracts references to other documents from the retrieved documents, and by updating the instruction so that the agent continues to retrieve more documents.
Setup
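The original setup cell is not reproduced here. A minimal sketch, assuming the cohere Python SDK is installed and the API key is stored in a COHERE_API_KEY environment variable (both assumptions):

```python
import os

import cohere

# Initialize the Cohere client; the environment-variable name is an assumption.
co = cohere.Client(os.environ["COHERE_API_KEY"])
```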
Data
We leveraged data from the Washington Department of Transportation and modified it to fit the needs of this demo.
|   | title | body | combined | embeddings |
|---|---|---|---|---|
| 0 | Bicycle law | \n Traffic Infractions and fees - For a… | Title: Bicycle law\nBody: \n Traffic In… | [-0.024673462, -0.034729004, 0.0418396, 0.0121… |
| 1 | Bicycle helmet requirement | Currently, there is no state law requiring hel… | Title: Bicycle helmet requirement\nBody: Curre… | [-0.019180298, -0.037384033, 0.0027389526, -0… |
| 2 | Section 21a | helmet rules by location: These are city and c… | Title: Section 21a\nBody: helmet rules by loca… | [0.031097412, 0.0007619858, -0.023010254, -0.0… |
| 3 | Section 3b | Traffic infraction - A person operating a bicy… | Title: Section 3b\nBody: Traffic infraction - … | [0.015602112, -0.016143799, 0.032958984, 0.000… |
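The embeddings column above comes from embedding each row's combined text. A minimal sketch of how such a frame could be built is below; the document bodies are elided, and the embedding model name and input_type are assumptions:

```python
import pandas as pd

# Toy frame mirroring the table above; bodies elided for brevity.
df = pd.DataFrame(
    {
        "title": ["Bicycle law", "Bicycle helmet requirement", "Section 21a", "Section 3b"],
        "body": ["...", "...", "...", "..."],
    }
)
df["combined"] = "Title: " + df["title"] + "\nBody: " + df["body"]

# Embed the combined text of every document in one call.
embed_response = co.embed(
    texts=df["combined"].tolist(),
    model="embed-english-v3.0",  # assumed model
    input_type="search_document",
)
df["embeddings"] = embed_response.embeddings
```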
Tools
The following functions and tools will be used in the subsequent tasks.
RAG function
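A minimal sketch of the RAG function, consistent with the disclaimer above: top-1 retrieval by cosine similarity, followed by grounded generation. The function names and the command-r model choice are assumptions:

```python
import numpy as np

def retrieve(query: str, n_docs: int = 1) -> list[dict]:
    """Embed the query and return the top-n documents by cosine similarity."""
    query_emb = np.array(
        co.embed(
            texts=[query],
            model="embed-english-v3.0",
            input_type="search_query",
        ).embeddings[0]
    )
    doc_embs = np.array(df["embeddings"].tolist())
    scores = doc_embs @ query_emb / (
        np.linalg.norm(doc_embs, axis=1) * np.linalg.norm(query_emb)
    )
    top = np.argsort(scores)[::-1][:n_docs]
    return [{"title": df.iloc[i]["title"], "body": df.iloc[i]["body"]} for i in top]

def simple_rag(query: str) -> str:
    """Answer the query grounded on the single top retrieved document."""
    documents = retrieve(query, n_docs=1)
    response = co.chat(message=query, model="command-r", documents=documents)
    return response.text
```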
Agentic RAG - cohere_agent()
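cohere_agent() drives the tool-use loop: it sends the message and tool definitions to the Chat API, executes whatever tool calls come back, and feeds the results into the next turn until the model answers directly. A sketch is below; the exact signature in the notebook may differ:

```python
def cohere_agent(
    message: str,
    preamble: str,
    tools: list[dict],
    functions_map: dict,
) -> str:
    """Run the Cohere tool-use loop until the model stops calling tools."""
    response = co.chat(
        message=message, preamble=preamble, tools=tools, model="command-r"
    )
    while response.tool_calls:
        tool_results = []
        for call in response.tool_calls:
            # Execute the Python function the model asked for.
            output = functions_map[call.name](**call.parameters)
            outputs = output if isinstance(output, list) else [{"result": output}]
            tool_results.append({"call": call, "outputs": outputs})
        # Return the tool outputs to the model for the next step.
        response = co.chat(
            message="",
            preamble=preamble,
            tools=tools,
            tool_results=tool_results,
            chat_history=response.chat_history,
            model="command-r",
        )
    return response.text
```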
Question 1 - single-stage retrieval
Here we ask a question that can be answered easily with single-stage retrieval. Both regular and agentic RAG should be able to answer it easily. Below is a comparison of the responses.
Simple RAG
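The simple RAG path is a single call. The question wording here is illustrative, not the notebook's exact text:

```python
question_1 = "Is there a state law requiring bicycle helmets?"  # illustrative wording
print(simple_rag(question_1))
```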
Agentic RAG
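The agentic path exposes the same retrieval as a tool. The schema follows Cohere's tool-use format; the tool description and preamble wording are assumptions:

```python
tools = [
    {
        "name": "retrieve",
        "description": "Retrieves the most relevant documents for a search query.",
        "parameter_definitions": {
            "query": {
                "description": "The search query.",
                "type": "str",
                "required": True,
            }
        },
    }
]
functions_map = {"retrieve": retrieve}

preamble = "Answer the user's question using the retrieve tool."  # assumed wording
print(cohere_agent(question_1, preamble, tools, functions_map))
```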
Question 2 - double-stage retrieval
The second question requires double-stage retrieval, because the top-matched document references another document. You will see below that the agentic RAG is initially unable to produce the correct answer, but when given the proper tools and instructions, it finds the correct answer.
Simple RAG
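Same call as before, with the second question (illustrative wording):

```python
question_2 = "I live in Orting; do I need to wear a helmet when riding a bike?"  # illustrative
print(simple_rag(question_2))
```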
Agentic RAG
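And the agentic path, still with the single retrieve tool from before:

```python
print(cohere_agent(question_2, preamble, tools, functions_map))
```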
This produces the same quality of answer as the simple RAG.
Agentic RAG - New Tools
In order for the model to retrieve the correct documents, we do two things (a sketch of both follows this list):
- A new reference_extractor() function is added. Given the query and the retrieved documents, this function finds references to other documents.
- We update the instruction so that it directs the agent to keep retrieving relevant documents.
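A sketch of both changes: reference_extractor() prompts the model to pull out the titles of referenced documents, the tool list is extended, and the preamble now tells the agent to follow references before answering. The prompt, preamble wording, and parameter types are assumptions:

```python
def reference_extractor(query: str, documents: list[str]) -> str:
    """Ask the model which other documents the retrieved ones refer to."""
    prompt = f"""From the documents below, extract the titles of any other documents
they refer to that may contain the answer to the query. Return only the titles,
or an empty string if there are none.

Query: {query}

Documents:
{documents}"""
    return co.chat(message=prompt, model="command-r").text

# Register the new tool alongside retrieve.
tools.append(
    {
        "name": "reference_extractor",
        "description": "Extracts titles of other documents referenced by the given documents.",
        "parameter_definitions": {
            "query": {
                "description": "The user query.",
                "type": "str",
                "required": True,
            },
            "documents": {
                "description": "Bodies of the retrieved documents.",
                "type": "List[str]",
                "required": True,
            },
        },
    }
)
functions_map["reference_extractor"] = reference_extractor

# Updated instruction: keep retrieving until no new references turn up (assumed wording).
preamble_new = (
    "Answer the user's question using the retrieve tool. After each retrieval, "
    "call reference_extractor to check whether the documents mention other "
    "documents; if they do, retrieve those as well before giving a final answer."
)
print(cohere_agent(question_2, preamble_new, tools, functions_map))
```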