Routing Queries to Data Sources

Imagine a RAG system that can search over diverse sources, such as a website, a database, and a set of documents.

In a standard RAG setting, the application would aggregate retrieved documents from all the different sources it is connected to. This may contribute to noise from less relevant documents.

Additionally, this approach ignores the fact that, depending on its nature, one data source may be more or less relevant to a given query than the others.

An agentic RAG system can solve this problem by routing queries to the most relevant tools based on the query’s nature. This is done by leveraging the tool use capabilities of the Chat endpoint.

In this tutorial, we’ll cover:

  • Setting up the tools
  • Running an agentic RAG workflow
  • Routing queries to tools

We’ll build an agent that can answer questions about using Cohere, equipped with a number of different tools.

Setup

To get started, first we need to install the cohere library and create a Cohere client.

We also need to import the tool definitions that we’ll use in this tutorial.

Important: the source code for tool definitions can be found here. Make sure to have the tool_def.py file in the same directory as this notebook for the imports to work correctly.
PYTHON
! pip install cohere langchain langchain-community pydantic -qq
PYTHON
import json
import os
import cohere

from tool_def import (
    search_developer_docs,
    search_developer_docs_tool,
    search_internet,
    search_internet_tool,
    search_code_examples,
    search_code_examples_tool,
)

co = cohere.ClientV2(
    "COHERE_API_KEY"
)  # Get your free API key: https://dashboard.cohere.com/api-keys

os.environ["TAVILY_API_KEY"] = (
    "TAVILY_API_KEY"  # We'll need the Tavily API key to perform internet search. Get your API key: https://app.tavily.com/home
)

Setting up the tools

In an agentic RAG system, each data source is represented as a tool. Broadly, a tool is any function or service that can receive objects from and send objects to the LLM. In the case of RAG, a tool is more specific: it takes a query as input and returns a set of documents.

Here, we define a Python function for each tool:

  • search_developer_docs: Searches Cohere developer documentation. For simplicity, we create a small list of sample documents and return the same list for every query. In practice, you will want to implement a real search function, for example one based on semantic search.
  • search_internet: Performs an internet search using Tavily search, via LangChain’s ready-made implementation.
  • search_code_examples: Searches for Cohere code examples and tutorials. Here, too, we create a small list of sample documents for simplicity.
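As a rough illustration of the stub approach described above, a tool function can ignore the query and return a fixed list of documents. The function name search_sample_docs and the document contents below are hypothetical placeholders, not the actual contents of tool_def.py.

```python
def search_sample_docs(query: str) -> list[dict]:
    """Hypothetical stub tool: takes a query, returns a list of documents."""
    # Placeholder documents (illustrative only). A real tool would use
    # `query` to run a search, such as semantic search, over an index.
    sample_docs = [
        {"title": "Sample page A", "text": "Placeholder text for page A."},
        {"title": "Sample page B", "text": "Placeholder text for page B."},
    ]
    return sample_docs


docs = search_sample_docs("any query")
print(len(docs))
```

Because the stub returns the same list for every query, it is only useful for exercising the routing logic; swapping in a real retriever should not require changing the function's signature.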

These functions are mapped to a dictionary called functions_map for easy access.

PYTHON
functions_map = {
    "search_developer_docs": search_developer_docs,
    "search_internet": search_internet,
    "search_code_examples": search_code_examples,
}

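The second and final setup step is to define the tool schemas in a format that can be passed to the Chat endpoint. A tool schema must contain the following fields: name, description, and parameters. The schemas used in this tutorial (search_developer_docs_tool, search_internet_tool, and search_code_examples_tool) are defined in tool_def.py.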

This schema informs the LLM about what the tool does, which enables an LLM to decide whether to use a particular tool. Therefore, the more descriptive and specific the schema, the more likely the LLM will make the right tool call decisions.
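For reference, a schema with these three fields might look like the sketch below. It assumes the function-style JSON schema layout accepted by the v2 Chat endpoint; the variable name example_search_tool and the description text are illustrative, and the actual definitions in tool_def.py may differ.

```python
# Sketch of a tool schema (assumed layout; description text is illustrative)
example_search_tool = {
    "type": "function",
    "function": {
        # Should match the key used for this tool in functions_map
        "name": "search_developer_docs",
        # The LLM reads this to decide whether the tool fits the query
        "description": (
            "Searches the Cohere developer documentation. Use this tool "
            "for questions about Cohere's endpoints and features."
        ),
        # JSON Schema describing the arguments the LLM must generate
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "The search query.",
                }
            },
            "required": ["query"],
        },
    },
}

print(example_search_tool["function"]["name"])
```

The description is the main lever for routing quality: stating what the source covers, and when to prefer it, gives the model more to go on than a bare tool name.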

Running an agentic RAG workflow

We can now run an agentic RAG workflow using a tool use approach. We can think of the system as consisting of four components:

  • The user
  • The application
  • The LLM
  • The tools

At its most basic, these four components interact in a workflow through four steps:

  • Step 1: Get user message – The LLM gets the user message (via the application)
  • Step 2: Tool planning and calling – The LLM makes a decision on the tools to call (if any) and generates the tool calls
  • Step 3: Tool execution – The application executes the tools and sends the results to the LLM
  • Step 4: Response and citation generation – The LLM generates the response and citations and sends them back to the user

We wrap all these steps in a function called run_agent.

PYTHON
tools = [
    search_developer_docs_tool,
    search_internet_tool,
    search_code_examples_tool,
]
PYTHON
system_message = """## Task and Context
You are an assistant who helps developers use Cohere. You are equipped with a number of tools that can provide different types of information. If you can't find the information you need from one tool, you should try other tools if there is a possibility that they could provide the information you need."""
PYTHON
model = "command-a-03-2025"


def run_agent(query, messages=None):
    if messages is None:
        messages = []

    if "system" not in {m.get("role") for m in messages}:
        messages.append({"role": "system", "content": system_message})

    # Step 1: get user message
    print(f"QUESTION:\n{query}")
    print("=" * 50)

    messages.append({"role": "user", "content": query})

    # Step 2: Generate tool calls (if any)
    response = co.chat(
        model=model, messages=messages, tools=tools, temperature=0.3
    )

    while response.message.tool_calls:

        print("TOOL PLAN:")
        print(response.message.tool_plan, "\n")
        print("TOOL CALLS:")
        for tc in response.message.tool_calls:
            print(
                f"Tool name: {tc.function.name} | Parameters: {tc.function.arguments}"
            )
        print("=" * 50)

        messages.append(
            {
                "role": "assistant",
                "tool_calls": response.message.tool_calls,
                "tool_plan": response.message.tool_plan,
            }
        )

        # Step 3: Get tool results
        for tc in response.message.tool_calls:
            tool_result = functions_map[tc.function.name](
                **json.loads(tc.function.arguments)
            )
            tool_content = []
            for data in tool_result:
                tool_content.append(
                    {
                        "type": "document",
                        "document": {"data": json.dumps(data)},
                    }
                )
                # Optional: add an "id" field in the "document" object, otherwise IDs are auto-generated
            messages.append(
                {
                    "role": "tool",
                    "tool_call_id": tc.id,
                    "content": tool_content,
                }
            )

        # Step 4: Generate response and citations
        response = co.chat(
            model=model,
            messages=messages,
            tools=tools,
            temperature=0.3,
        )

    messages.append(
        {
            "role": "assistant",
            "content": response.message.content[0].text,
        }
    )

    # Print final response
    print("RESPONSE:")
    print(response.message.content[0].text)
    print("=" * 50)

    # Print citations (if any)
    verbose_source = (
        False  # Change to True to display the contents of a source
    )
    if response.message.citations:
        print("CITATIONS:\n")
        for citation in response.message.citations:
            print(
                f"Start: {citation.start}| End:{citation.end}| Text:'{citation.text}' "
            )
            print("Sources:")
            for idx, source in enumerate(citation.sources):
                print(f"{idx+1}. {source.id}")
                if verbose_source:
                    print(f"{source.tool_output}")
            print("\n")

    return messages
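The comment in Step 3 notes that each document object can optionally carry an "id" field. A minimal sketch of that variation is shown below as a standalone helper. The helper name build_tool_content and the "prefix:index" ID scheme are assumptions made for illustration, not part of the original tutorial code.

```python
import json


def build_tool_content(tool_result, id_prefix):
    """Convert a list of JSON-serializable results into document objects with explicit IDs."""
    tool_content = []
    for idx, data in enumerate(tool_result):
        tool_content.append(
            {
                "type": "document",
                "document": {
                    # Hypothetical ID scheme: "<prefix>:<position>"
                    "id": f"{id_prefix}:{idx}",
                    "data": json.dumps(data),
                },
            }
        )
    return tool_content


content = build_tool_content([{"title": "A"}, {"title": "B"}], "docs")
print(content[1]["document"]["id"])  # docs:1
```

Explicit IDs can make citations easier to trace back to your own records, since the source IDs printed by run_agent would then follow a scheme you control rather than an auto-generated one.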

Routing queries to tools

Let’s ask the agent a few questions, starting with this one about the Embed endpoint.

Because the question asks about a specific feature, the agent decides to use the search_developer_docs tool (instead of retrieving from all the data sources it’s connected to).

It first generates a tool plan that describes how it will handle the query. Then, it generates tool calls to the search_developer_docs tool with the associated query parameter.

The tool does indeed contain the information the user asked for, which the agent then uses to generate its response.

PYTHON
messages = run_agent("How many languages does Embed support?")

QUESTION:
How many languages does Embed support?
==================================================
TOOL PLAN:
I will search the Cohere developer documentation for 'how many languages does Embed support'.

TOOL CALLS:
Tool name: search_developer_docs | Parameters: {"query":"how many languages does Embed support"}
==================================================
RESPONSE:
The Embed endpoint supports over 100 languages.
==================================================
CITATIONS:

Start: 28| End:47| Text:'over 100 languages.'
Sources:
1. search_developer_docs_gwt5g55gjc3w:2

Let’s now ask the agent a question about setting up the Notion API so we can connect it to LLMs. This information is not likely to be found in the developer documentation or code examples because it is not Cohere-specific, so we can expect the agent to use the internet search tool.

And this is exactly what the agent does. This time, it decides to use the search_internet tool, triggers the search through Tavily search, and uses the results to generate its response.

PYTHON
messages = run_agent("How to set up the Notion API.")

QUESTION:
How to set up the Notion API.
==================================================
TOOL PLAN:
I will search for 'Notion API setup' to find out how to set up the Notion API.

TOOL CALLS:
Tool name: search_internet | Parameters: {"query":"Notion API setup"}
==================================================
RESPONSE:
To set up the Notion API, you need to create a new integration in Notion's integrations dashboard. You can do this by navigating to https://www.notion.com/my-integrations and clicking '+ New integration'.

Once you've done this, you'll need to get your API secret by visiting the Configuration tab. You should keep your API secret just that – a secret! You can refresh your secret if you accidentally expose it.

Next, you'll need to give your integration page permissions. To do this, you'll need to pick or create a Notion page, then click on the ... More menu in the top-right corner of the page. Scroll down to + Add Connections, then search for your integration and select it. You'll then need to confirm the integration can access the page and all of its child pages.

If your API requests are failing, you should confirm you have given the integration permission to the page you are trying to update.

You can also create a Notion API integration and get your internal integration token. You'll then need to create a .env file and add environmental variables, get your Notion database ID and add your integration to your database.

For more information on what you can build with Notion's API, you can refer to this guide.
==================================================
CITATIONS:

Start: 38| End:62| Text:'create a new integration'
Sources:
1. search_internet_cwabyfc5mn8c:0
2. search_internet_cwabyfc5mn8c:2


Start: 75| End:98| Text:'integrations dashboard.'
Sources:
1. search_internet_cwabyfc5mn8c:2


Start: 132| End:170| Text:'https://www.notion.com/my-integrations'
Sources:
1. search_internet_cwabyfc5mn8c:0


Start: 184| End:203| Text:''+ New integration''
Sources:
1. search_internet_cwabyfc5mn8c:0
2. search_internet_cwabyfc5mn8c:2


Start: 244| End:263| Text:'get your API secret'
Sources:
1. search_internet_cwabyfc5mn8c:2


Start: 280| End:298| Text:'Configuration tab.'
Sources:
1. search_internet_cwabyfc5mn8c:2


Start: 310| End:351| Text:'keep your API secret just that – a secret'
Sources:
1. search_internet_cwabyfc5mn8c:2


Start: 361| End:411| Text:'refresh your secret if you accidentally expose it.'
Sources:
1. search_internet_cwabyfc5mn8c:2


Start: 434| End:473| Text:'give your integration page permissions.'
Sources:
1. search_internet_cwabyfc5mn8c:2


Start: 501| End:529| Text:'pick or create a Notion page'
Sources:
1. search_internet_cwabyfc5mn8c:2


Start: 536| End:599| Text:'click on the ... More menu in the top-right corner of the page.'
Sources:
1. search_internet_cwabyfc5mn8c:2


Start: 600| End:632| Text:'Scroll down to + Add Connections'
Sources:
1. search_internet_cwabyfc5mn8c:2


Start: 639| End:681| Text:'search for your integration and select it.'
Sources:
1. search_internet_cwabyfc5mn8c:2


Start: 702| End:773| Text:'confirm the integration can access the page and all of its child pages.'
Sources:
1. search_internet_cwabyfc5mn8c:2


Start: 783| End:807| Text:'API requests are failing'
Sources:
1. search_internet_cwabyfc5mn8c:2


Start: 820| End:907| Text:'confirm you have given the integration permission to the page you are trying to update.'
Sources:
1. search_internet_cwabyfc5mn8c:2


Start: 922| End:953| Text:'create a Notion API integration'
Sources:
1. search_internet_cwabyfc5mn8c:1


Start: 958| End:994| Text:'get your internal integration token.'
Sources:
1. search_internet_cwabyfc5mn8c:1


Start: 1015| End:1065| Text:'create a .env file and add environmental variables'
Sources:
1. search_internet_cwabyfc5mn8c:1


Start: 1067| End:1094| Text:'get your Notion database ID'
Sources:
1. search_internet_cwabyfc5mn8c:1


Start: 1099| End:1137| Text:'add your integration to your database.'
Sources:
1. search_internet_cwabyfc5mn8c:1


Start: 1223| End:1229| Text:'guide.'
Sources:
1. search_internet_cwabyfc5mn8c:3

Let’s ask the agent a final question, this time about tutorials that are relevant for enterprises.

Again, the agent uses the context of the query to decide on the most relevant tool. In this case, it selects the search_code_examples tool and provides a response based on the information found.

PYTHON
messages = run_agent(
    "Any tutorials that are relevant for enterprises?"
)

QUESTION:
Any tutorials that are relevant for enterprises?
==================================================
TOOL PLAN:
I will search for 'enterprise tutorials' in the code examples and tutorials tool.

TOOL CALLS:
Tool name: search_code_examples | Parameters: {"query":"enterprise tutorials"}
==================================================
RESPONSE:
I found a tutorial called 'Advanced Document Parsing For Enterprises'.
==================================================
CITATIONS:

Start: 26| End:69| Text:''Advanced Document Parsing For Enterprises''
Sources:
1. search_code_examples_jhh40p32wxpw:4
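To check routing decisions programmatically rather than by reading the printed output, one option is to scan the returned message list for assistant turns that carry tool calls. The sketch below assumes the message layout produced by run_agent above; the helper name tools_called is hypothetical, and a SimpleNamespace stands in for the SDK's tool call object so the example runs without an API call.

```python
from types import SimpleNamespace


def tools_called(messages):
    """Return the names of the tools called, in order, from a run_agent-style message list."""
    names = []
    for m in messages:
        if isinstance(m, dict) and m.get("role") == "assistant":
            # Assistant turns without tool calls (final answers) are skipped
            for tc in m.get("tool_calls") or []:
                names.append(tc.function.name)
    return names


# Stand-in for a tool call object (illustrative, not a real API response)
fake_call = SimpleNamespace(
    function=SimpleNamespace(
        name="search_internet", arguments='{"query": "Notion API setup"}'
    )
)
demo_messages = [
    {"role": "user", "content": "How to set up the Notion API."},
    {"role": "assistant", "tool_calls": [fake_call], "tool_plan": "..."},
    {"role": "assistant", "content": "..."},
]
print(tools_called(demo_messages))  # ['search_internet']
```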

Summary

In this tutorial, we learned:

  • How to set up tools in an agentic RAG system
  • How to run an agentic RAG workflow
  • How to automatically route queries to the most relevant data sources

However, so far we have only seen rather simple queries. In practice, we may run into a complex query that needs to be simplified, optimized, or split before we can perform the retrieval.

In Part 2, we’ll learn how to build an agentic RAG system that can expand user queries into parallel queries.
