Routing Queries to Data Sources

Imagine a RAG system that can search over diverse sources, such as a website, a database, and a set of documents.

In a standard RAG setting, the application would aggregate retrieved documents from all the different sources it is connected to. This may contribute to noise from less relevant documents.

Additionally, it doesn’t take into consideration that, given a data source’s nature, it might be less or more relevant to a query than the other data sources.

An agentic RAG system can solve this problem by routing queries to the most relevant tools based on the query’s nature. This is done by leveraging the tool use capabilities of the Chat endpoint.

In this tutorial, we’ll cover:

Setting up the tools
Running an agentic RAG workflow
Routing queries to tools

We’ll build an agent that can answer questions about using Cohere, equipped with a number of different tools.

Setup

To get started, first we need to install the cohere library and create a Cohere client.

We also need to import the tool definitions that we’ll use in this tutorial.

Important: the source code for tool definitions can be found here. Make sure to have the tool_def.py file in the same directory as this notebook for the imports to work correctly.

PYTHON

1 ! pip install cohere langchain langchain-community pydantic -qq

PYTHON

1 import json
2 import os
3 import cohere
4 
5 from tool_def import (
6     search_developer_docs,
7     search_developer_docs_tool,
8     search_internet,
9     search_internet_tool,
10     search_code_examples,
11     search_code_examples_tool,
12 )
13 
14 co = cohere.ClientV2(
15     "COHERE_API_KEY"
16 )  # Get your free API key: https://dashboard.cohere.com/api-keys
17 
18 os.environ["TAVILY_API_KEY"] = (
19     "TAVILY_API_KEY"  # We'll need the Tavily API key to perform internet search. Get your API key: https://app.tavily.com/home
20 )

Setting up the tools

In an agentic RAG system, each data source is represented as a tool. A tool is broadly any function or service that can receive and send objects to the LLM. But in the case of RAG, this becomes a more specific case of a tool that takes a query as input and returns a set of documents.

Here, we are defining a Python function for each tool, but more broadly, the tool can be any function or service that can receive and send objects.

search_developer_docs: Searches Cohere developer documentation. Here we are creating a small list of sample documents for simplicity and will return the same list for every query. In practice, you will want to implement a search function such as those that use semantic search.
search_internet: Performs an internet search using Tavily search, which we take from LangChain’s ready implementation.
search_code_examples: Searches for Cohere code examples and tutorials. Here we are also creating a small list of sample documents for simplicity.

These functions are mapped to a dictionary called functions_map for easy access.

Here, we are defining a Python function for each tool.

Running an agentic RAG workflow

We can now run an agentic RAG workflow using a tool use approach. We can think of the system as consisting of four components:

The user
The application
The LLM
The tools

At its most basic, these four components interact in a workflow through four steps:

Step 1: Get user message – The LLM gets the user message (via the application)
Step 2: Tool planning and calling – The LLM makes a decision on the tools to call (if any) and generates the tool calls
Step 3: Tool execution - The application executes the tools and sends the results to the LLM
Step 4: Response and citation generation – The LLM generates the response and citations to back to the user

We wrap all these steps in a function called run_agent.

PYTHON

1 tools = [
2     search_developer_docs_tool,
3     search_internet_tool,
4     search_code_examples_tool,
5 ]

PYTHON

1 system_message = """## Task and Context
2 You are an assistant who helps developers use Cohere. You are equipped with a number of tools that can provide different types of information. If you can't find the information you need from one tool, you should try other tools if there is a possibility that they could provide the information you need."""

PYTHON

1 model = "command-a-03-2025"
2 
3 
4 def run_agent(query, messages=None):
5     if messages is None:
6         messages = []
7 
8     if "system" not in {m.get("role") for m in messages}:
9         messages.append({"role": "system", "content": system_message})
10 
11     # Step 1: get user message
12     print(f"QUESTION:\n{query}")
13     print("=" * 50)
14 
15     messages.append({"role": "user", "content": query})
16 
17     # Step 2: Generate tool calls (if any)
18     response = co.chat(
19         model=model, messages=messages, tools=tools, temperature=0.3
20     )
21 
22     while response.message.tool_calls:
23 
24         print("TOOL PLAN:")
25         print(response.message.tool_plan, "\n")
26         print("TOOL CALLS:")
27         for tc in response.message.tool_calls:
28             print(
29                 f"Tool name: {tc.function.name} | Parameters: {tc.function.arguments}"
30             )
31         print("=" * 50)
32 
33         messages.append(
34             {
35                 "role": "assistant",
36                 "tool_calls": response.message.tool_calls,
37                 "tool_plan": response.message.tool_plan,
38             }
39         )
40 
41         # Step 3: Get tool results
42         for tc in response.message.tool_calls:
43             tool_result = functions_map[tc.function.name](
44                 **json.loads(tc.function.arguments)
45             )
46             tool_content = []
47             for data in tool_result:
48                 tool_content.append(
49                     {
50                         "type": "document",
51                         "document": {"data": json.dumps(data)},
52                     }
53                 )
54                 # Optional: add an "id" field in the "document" object, otherwise IDs are auto-generated
55             messages.append(
56                 {
57                     "role": "tool",
58                     "tool_call_id": tc.id,
59                     "content": tool_content,
60                 }
61             )
62 
63         # Step 4: Generate response and citations
64         response = co.chat(
65             model=model,
66             messages=messages,
67             tools=tools,
68             temperature=0.3,
69         )
70 
71     messages.append(
72         {
73             "role": "assistant",
74             "content": response.message.content[0].text,
75         }
76     )
77 
78     # Print final response
79     print("RESPONSE:")
80     print(response.message.content[0].text)
81     print("=" * 50)
82 
83     # Print citations (if any)
84     verbose_source = (
85         False  # Change to True to display the contents of a source
86     )
87     if response.message.citations:
88         print("CITATIONS:\n")
89         for citation in response.message.citations:
90             print(
91                 f"Start: {citation.start}| End:{citation.end}| Text:'{citation.text}' "
92             )
93             print("Sources:")
94             for idx, source in enumerate(citation.sources):
95                 print(f"{idx+1}. {source.id}")
96                 if verbose_source:
97                     print(f"{source.tool_output}")
98             print("\n")
99 
100     return messages

Routing queries to tools

Let’s ask the agent a few questions, starting with this one about the Embed endpoint.

Because the question asks about a specific feature, the agent decides to use the search_developer_docs tool (instead of retrieving from all the data sources it’s connected to).

It first generates a tool plan that describes how it will handle the query. Then, it generates tool calls to the search_developer_docs tool with the associated query parameter.

The tool does indeed contain the information asked by the user, which the agent then uses to generate its response.

PYTHON

1 messages = run_agent("How many languages does Embed support?")

1 QUESTION:
2 How many languages does Embed support?
3 ==================================================
4 TOOL PLAN:
5 I will search the Cohere developer documentation for 'how many languages does Embed support'. 
6 
7 TOOL CALLS:
8 Tool name: search_developer_docs | Parameters: {"query":"how many languages does Embed support"}
9 ==================================================
10 RESPONSE:
11 The Embed endpoint supports over 100 languages.
12 ==================================================
13 CITATIONS:
14 
15 Start: 28| End:47| Text:'over 100 languages.' 
16 Sources:
17 1. search_developer_docs_gwt5g55gjc3w:2

Let’s now ask the agent a question about setting up the Notion API so we can connect it to LLMs. This information is not likely to be found in the developer documentation or code examples because it is not Cohere-specific, so we can expect the agent to use the internet search tool.

And this is exactly what the agent does. This time, it decides to use the search_internet tool, triggers the search through Tavily search, and uses the results to generate its response.

PYTHON

1 messages = run_agent("How to set up the Notion API.")

1 QUESTION:
2 How to set up the Notion API.
3 ==================================================
4 TOOL PLAN:
5 I will search for 'Notion API setup' to find out how to set up the Notion API. 
6 
7 TOOL CALLS:
8 Tool name: search_internet | Parameters: {"query":"Notion API setup"}
9 ==================================================
10 RESPONSE:
11 To set up the Notion API, you need to create a new integration in Notion's integrations dashboard. You can do this by navigating to https://www.notion.com/my-integrations and clicking '+ New integration'.
12 
13 Once you've done this, you'll need to get your API secret by visiting the Configuration tab. You should keep your API secret just that – a secret! You can refresh your secret if you accidentally expose it.
14 
15 Next, you'll need to give your integration page permissions. To do this, you'll need to pick or create a Notion page, then click on the ... More menu in the top-right corner of the page. Scroll down to + Add Connections, then search for your integration and select it. You'll then need to confirm the integration can access the page and all of its child pages.
16 
17 If your API requests are failing, you should confirm you have given the integration permission to the page you are trying to update.
18 
19 You can also create a Notion API integration and get your internal integration token. You'll then need to create a .env file and add environmental variables, get your Notion database ID and add your integration to your database.
20 
21 For more information on what you can build with Notion's API, you can refer to this guide.
22 ==================================================
23 CITATIONS:
24 
25 Start: 38| End:62| Text:'create a new integration' 
26 Sources:
27 1. search_internet_cwabyfc5mn8c:0
28 2. search_internet_cwabyfc5mn8c:2
29 
30 
31 Start: 75| End:98| Text:'integrations dashboard.' 
32 Sources:
33 1. search_internet_cwabyfc5mn8c:2
34 
35 
36 Start: 132| End:170| Text:'https://www.notion.com/my-integrations' 
37 Sources:
38 1. search_internet_cwabyfc5mn8c:0
39 
40 
41 Start: 184| End:203| Text:''+ New integration'' 
42 Sources:
43 1. search_internet_cwabyfc5mn8c:0
44 2. search_internet_cwabyfc5mn8c:2
45 
46 
47 Start: 244| End:263| Text:'get your API secret' 
48 Sources:
49 1. search_internet_cwabyfc5mn8c:2
50 
51 
52 Start: 280| End:298| Text:'Configuration tab.' 
53 Sources:
54 1. search_internet_cwabyfc5mn8c:2
55 
56 
57 Start: 310| End:351| Text:'keep your API secret just that – a secret' 
58 Sources:
59 1. search_internet_cwabyfc5mn8c:2
60 
61 
62 Start: 361| End:411| Text:'refresh your secret if you accidentally expose it.' 
63 Sources:
64 1. search_internet_cwabyfc5mn8c:2
65 
66 
67 Start: 434| End:473| Text:'give your integration page permissions.' 
68 Sources:
69 1. search_internet_cwabyfc5mn8c:2
70 
71 
72 Start: 501| End:529| Text:'pick or create a Notion page' 
73 Sources:
74 1. search_internet_cwabyfc5mn8c:2
75 
76 
77 Start: 536| End:599| Text:'click on the ... More menu in the top-right corner of the page.' 
78 Sources:
79 1. search_internet_cwabyfc5mn8c:2
80 
81 
82 Start: 600| End:632| Text:'Scroll down to + Add Connections' 
83 Sources:
84 1. search_internet_cwabyfc5mn8c:2
85 
86 
87 Start: 639| End:681| Text:'search for your integration and select it.' 
88 Sources:
89 1. search_internet_cwabyfc5mn8c:2
90 
91 
92 Start: 702| End:773| Text:'confirm the integration can access the page and all of its child pages.' 
93 Sources:
94 1. search_internet_cwabyfc5mn8c:2
95 
96 
97 Start: 783| End:807| Text:'API requests are failing' 
98 Sources:
99 1. search_internet_cwabyfc5mn8c:2
100 
101 
102 Start: 820| End:907| Text:'confirm you have given the integration permission to the page you are trying to update.' 
103 Sources:
104 1. search_internet_cwabyfc5mn8c:2
105 
106 
107 Start: 922| End:953| Text:'create a Notion API integration' 
108 Sources:
109 1. search_internet_cwabyfc5mn8c:1
110 
111 
112 Start: 958| End:994| Text:'get your internal integration token.' 
113 Sources:
114 1. search_internet_cwabyfc5mn8c:1
115 
116 
117 Start: 1015| End:1065| Text:'create a .env file and add environmental variables' 
118 Sources:
119 1. search_internet_cwabyfc5mn8c:1
120 
121 
122 Start: 1067| End:1094| Text:'get your Notion database ID' 
123 Sources:
124 1. search_internet_cwabyfc5mn8c:1
125 
126 
127 Start: 1099| End:1137| Text:'add your integration to your database.' 
128 Sources:
129 1. search_internet_cwabyfc5mn8c:1
130 
131 
132 Start: 1223| End:1229| Text:'guide.' 
133 Sources:
134 1. search_internet_cwabyfc5mn8c:3

Let’s ask the agent a final question, this time about tutorials that are relevant for enterprises.

Again, the agent uses the context of the query to decide on the most relevant tool. In this case, it selects the search_code_examples tool and provides a response based on the information found.

PYTHON

1 messages = run_agent(
2     "Any tutorials that are relevant for enterprises?"
3 )

1 QUESTION:
2 Any tutorials that are relevant for enterprises?
3 ==================================================
4 TOOL PLAN:
5 I will search for 'enterprise tutorials' in the code examples and tutorials tool. 
6 
7 TOOL CALLS:
8 Tool name: search_code_examples | Parameters: {"query":"enterprise tutorials"}
9 ==================================================
10 RESPONSE:
11 I found a tutorial called 'Advanced Document Parsing For Enterprises'.
12 ==================================================
13 CITATIONS:
14 
15 Start: 26| End:69| Text:''Advanced Document Parsing For Enterprises'' 
16 Sources:
17 1. search_code_examples_jhh40p32wxpw:4

Summary

In this tutorial, we learned about:

How to set up tools in an agentic RAG system
How to run an agentic RAG workflow
How to automatically route queries to the most relevant data sources

However, so far we have only seen rather simple queries. In practice, we may run into a complex query that needs to simplified, optimized, or split (etc.) before we can perform the retrieval.

In Part 2, we’ll learn how to build an agentic RAG system that can expand user queries into parallel queries.

1	functions_map = {
2	"search_developer_docs": search_developer_docs,
3	"search_internet": search_internet,
4	"search_code_examples": search_code_examples,
5	}