Querying Structured Data (Tables)

Open in Colab

In the previous tutorials, we explored how to build agentic RAG applications over unstructured and semi-structured data. Now, we’ll see how to implement agentic RAG over structured data.

This tutorial focuses on querying tables, and the next tutorial will be about querying SQL databases.

Consider a scenario where you have a CSV file containing evaluation results for an LLM application.

A user might ask questions like “What’s the average score for a specific use case?” or “Which configuration has the lowest latency?”. These queries require not just retrieval, but also data analysis and interpretation.

In this tutorial, we’ll cover:

  • Creating a function to execute Python code
  • Setting up a tool to interact with tabular data
  • Building an agent for querying tabular data
  • Running the agent

Let’s get started by setting up our environment and defining the necessary tools for our agent.

Setup

To get started, we first need to install the cohere library and create a Cohere client.

PYTHON
# ! pip install cohere pandas langchain_experimental
import json
import os
import cohere

# Get your free API key: https://dashboard.cohere.com/api-keys
co = cohere.ClientV2(api_key=os.getenv("COHERE_API_KEY"))

And here’s the data we’ll be working with. evaluation_results.csv is a CSV file containing evaluation results for a set of LLM applications: name extraction, email drafting, and article summarization.

The file has the following columns:

  • usecase: The use case being evaluated (extract_names, draft_email, or summarize_article).
  • run: The run ID.
  • score: The evaluation score for a particular run.
  • temperature: The temperature setting of the model for a particular run.
  • tokens: The number of tokens generated by the model for a particular run.
  • latency: The latency of the model’s response for a particular run, in seconds.

Important: the data can be found here. Make sure to have the file in the same directory as this notebook so the code can load it correctly.

PYTHON
import pandas as pd

df = pd.read_csv('evaluation_results.csv')

df.head()
| | usecase | run | score | temperature | tokens | latency |
|---|---|---|---|---|---|---|
| 0 | extract_names | A | 0.5 | 0.3 | 103 | 1.12 |
| 1 | draft_email | A | 0.6 | 0.3 | 252 | 2.50 |
| 2 | summarize_article | A | 0.8 | 0.3 | 350 | 4.20 |
| 3 | extract_names | B | 0.2 | 0.3 | 101 | 2.85 |
| 4 | draft_email | B | 0.4 | 0.3 | 230 | 3.20 |

Creating a function to execute Python code

Here, we introduce a new tool that allows the agent to execute Python code and return the result. The agent will write pandas code to query the data and use this tool to execute it.

To create this tool, we’ll use the PythonREPL class from the langchain_experimental.utilities module. This class provides a lightweight environment for executing Python code and returning the result. Note that it is not a secure sandbox, so only run code from trusted sources.

First, we define a python_tool that uses the PythonREPL class to execute Python code and return the result.

Next, we define a ToolInput class to handle the input for the python_tool.

Finally, we create a function analyze_evaluation_results that takes a string of Python code as input, executes the code using the Python tool we created, and returns the result.

Important: the source code for tool definitions can be found here. Make sure to have the tool_def.py file in the same directory as this notebook for the imports to work correctly.
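If you’d rather not download the file, here is a minimal sketch of what tool_def.py could contain, based on the description above. This is an assumption, not the linked source, which may differ in details such as the exact description strings.

PYTHON
# tool_def.py — a minimal sketch based on the description above;
# the linked source file may differ in its details.
from langchain_core.tools import Tool
from langchain_experimental.utilities import PythonREPL
from pydantic import BaseModel, Field

python_repl = PythonREPL()

# A tool that executes Python code and returns whatever it prints
python_tool = Tool(
    name="python_repl",
    description="Executes python code and returns the result.",
    func=python_repl.run,
)

class ToolInput(BaseModel):
    code: str = Field(description="Python code to execute.")

def analyze_evaluation_results(code: str) -> dict:
    """Run the agent-generated code and return its output."""
    input_code = ToolInput(code=code)
    return {"python_answer": python_tool.func(input_code.code)}

# tool_def.py also exports analyze_evaluation_results_tool, the
# tool schema shown in the next section.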

PYTHON
from tool_def import (
    analyze_evaluation_results,
    analyze_evaluation_results_tool,
)
PYTHON
functions_map = {
    "analyze_evaluation_results": analyze_evaluation_results
}

Setting up a tool to interact with tabular data

Next, we define the analyze_evaluation_results tool. There are many ways we can set up a tool to work with CSV data, and in this example, we are using the tool description to provide the agent with the necessary context for working with the CSV file, such as:

  • the name of the CSV file to load
  • the columns of the CSV file
  • additional instructions on what libraries to use (in this case, pandas)

The tool takes a single parameter, code: a string containing the Python code the agent writes to analyze the data.

PYTHON
analyze_evaluation_results_tool = {
    "type": "function",
    "function": {
        "name": "analyze_evaluation_results",
        "description": "Generate Python code using the pandas library to analyze evaluation results from a dataframe called `evaluation_results`. The dataframe has columns 'usecase','run','score','temperature','tokens', and 'latency'. You must start with `import pandas as pd` and read a CSV file called `evaluation_results.csv` into the `evaluation_results` dataframe.",
        "parameters": {
            "type": "object",
            "properties": {
                "code": {
                    "type": "string",
                    "description": "Executable Python code"
                }
            },
            "required": ["code"]
        }
    }
}
PYTHON
tools = [analyze_evaluation_results_tool]
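Before handing the tool to the agent, you can optionally sanity-check it by calling the function directly with a hand-written snippet. This is a hypothetical example, not part of the original tutorial:

PYTHON
# Optional sanity check: call the tool function directly
result = analyze_evaluation_results(
    code=(
        "import pandas as pd\n"
        "evaluation_results = pd.read_csv('evaluation_results.csv')\n"
        "print(evaluation_results['score'].mean())"
    )
)
print(result)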

Building an agent for querying tabular data

Next, let’s create a run_agent function to run the agentic RAG workflow, the same as in Part 1.

The only change we are making here is to make the system message simpler and more specific since the agent now only has one tool.

PYTHON
system_message = """## Task and Context
You are an assistant who helps developers analyze LLM application evaluation results from a CSV file."""
PYTHON
model = "command-r-plus-08-2024"

def run_agent(query, messages=None):
    if messages is None:
        messages = []

    if "system" not in {m.get("role") for m in messages}:
        messages.append({"role": "system", "content": system_message})

    # Step 1: Get the user message
    print(f"Question:\n{query}")
    print("="*50)

    messages.append({"role": "user", "content": query})

    # Step 2: Generate tool calls (if any)
    response = co.chat(
        model=model,
        messages=messages,
        tools=tools,
        temperature=0.1
    )

    while response.message.tool_calls:

        print("TOOL PLAN:")
        print(response.message.tool_plan, "\n")
        print("TOOL CALLS:")
        for tc in response.message.tool_calls:
            if tc.function.name == "analyze_evaluation_results":
                print(f"Tool name: {tc.function.name}")
                # Pretty-print the code the agent generated
                code = json.loads(tc.function.arguments)["code"]
                print("\n".join(f"  {line}" for line in code.splitlines()))
            else:
                print(f"Tool name: {tc.function.name} | Parameters: {tc.function.arguments}")
        print("="*50)

        messages.append({"role": "assistant", "tool_calls": response.message.tool_calls, "tool_plan": response.message.tool_plan})

        # Step 3: Get tool results
        for tc in response.message.tool_calls:
            tool_result = functions_map[tc.function.name](**json.loads(tc.function.arguments))
            tool_content = [{"type": "document", "document": {"data": json.dumps(tool_result)}}]
            messages.append({"role": "tool", "tool_call_id": tc.id, "content": tool_content})

        # Step 4: Generate response and citations
        response = co.chat(
            model=model,
            messages=messages,
            tools=tools,
            temperature=0.1
        )

    messages.append({"role": "assistant", "content": response.message.content[0].text})

    # Print final response
    print("RESPONSE:")
    print(response.message.content[0].text)
    print("="*50)

    # Print citations (if any)
    verbose_source = False  # Change to True to display the contents of a source
    if response.message.citations:
        print("CITATIONS:\n")
        for citation in response.message.citations:
            print(f"Start: {citation.start}| End:{citation.end}| Text:'{citation.text}' ")
            print("Sources:")
            for idx, source in enumerate(citation.sources):
                print(f"{idx+1}. {source.id}")
                if verbose_source:
                    print(f"{source.tool_output}")
            print("\n")

    return messages
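Because run_agent returns the accumulated messages list, you can pass it back in to ask follow-up questions within the same conversation. A hypothetical example:

PYTHON
messages = run_agent("What's the average evaluation score in run A")
# Follow-up question that relies on the earlier conversation history
messages = run_agent("And what about in run B?", messages)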

Running the agent

Let’s ask the agent a few questions, starting with this query about the average evaluation score in run A.

To answer this query, the agent needs to write Python code that uses the pandas library to calculate the average evaluation score in run A. And it gets the answer right.

PYTHON
messages = run_agent("What's the average evaluation score in run A")
# Answer: 0.63
Question:
What's the average evaluation score in run A
==================================================
TOOL PLAN:
I will use the `analyze_evaluation_results` tool to write Python code to find the average evaluation score in run A.
TOOL CALLS:
Tool name: analyze_evaluation_results
import pandas as pd
df = pd.read_csv("evaluation_results.csv")
# Filter the dataframe to only include rows where the 'run' column is 'A'
filtered_df = df[df['run'] == 'A']
# Calculate the average score for the filtered dataframe
average_score = filtered_df['score'].mean()
print(f"Average score in run A: {average_score}")
==================================================
RESPONSE:
The average evaluation score in run A is **0.63**.
==================================================
CITATIONS:
Start: 43| End:47| Text:'0.63'
Sources:
1. analyze_evaluation_results_3m8j8s850pf6:0

Next, we ask a slightly more complex question, this time about the latency of the highest-scoring run for one use case. This requires the agent to filter based on the use case, find the highest-scoring run, and return the latency value.

PYTHON
messages = run_agent("What's the latency of the highest-scoring run for the summarize_article use case?")
# Answer: 4.8
Question:
What's the latency of the highest-scoring run for the summarize_article use case?
==================================================
TOOL PLAN:
I will write Python code to find the latency of the highest-scoring run for the summarize_article use case.
TOOL CALLS:
Tool name: analyze_evaluation_results
import pandas as pd
df = pd.read_csv("evaluation_results.csv")
# Filter the dataframe to only include rows where usecase is 'summarize_article'
filtered_df = df[df['usecase'] == 'summarize_article']
# Find the highest-scoring run
highest_score_run = filtered_df.loc[filtered_df['score'].idxmax()]
# Print the latency of the highest-scoring run
print(f"The latency of the highest-scoring run for the summarize_article use case is {highest_score_run['latency']} seconds.")
==================================================
RESPONSE:
The latency of the highest-scoring run for the summarize_article use case is 4.8 seconds.
==================================================
CITATIONS:
Start: 77| End:89| Text:'4.8 seconds.'
Sources:
1. analyze_evaluation_results_n98b7a3hjdp3:0

Next, we ask the agent to compare the use cases in terms of token usage and to present the comparison in a markdown table.

PYTHON
messages = run_agent("Which use case uses the least amount of tokens on average? Show the comparison of all use cases in a markdown table.")
# Answer: extract_names (106.25), draft_email (245.75), summarize_article (355.75)
Question:
Which use case uses the least amount of tokens on average? Show the comparison of all use cases in a markdown table.
==================================================
TOOL PLAN:
I will use the `analyze_evaluation_results` tool to write Python code to find the use case that uses the least amount of tokens on average. I will also create a markdown table to compare all use cases.
TOOL CALLS:
Tool name: analyze_evaluation_results
import pandas as pd
df = pd.read_csv("evaluation_results.csv")
# Calculate the average number of tokens for each use case
average_tokens = df.groupby("usecase")["tokens"].mean()
# Find the use case with the least average tokens
least_tokens_usecase = average_tokens.idxmin()
least_tokens_usecase_value = average_tokens.min()
# Create a markdown table to compare all use cases
markdown_table = average_tokens.reset_index().to_markdown(index=False)
print(f"The use case with the least average tokens is {least_tokens_usecase} with {least_tokens_usecase_value} tokens.")
print(f"\nComparison of all use cases:\n{markdown_table}")
==================================================
RESPONSE:
The use case with the least average tokens is `extract_names` with 106.25 tokens.
| usecase | tokens |
|---|---|
| draft_email | 245.75 |
| extract_names | 106.25 |
| summarize_article | 355.75 |
==================================================
CITATIONS:
Start: 46| End:61| Text:'`extract_names`'
Sources:
1. analyze_evaluation_results_5dy7vfd0zcdk:0
Start: 67| End:73| Text:'106.25'
Sources:
1. analyze_evaluation_results_5dy7vfd0zcdk:0
Start: 116| End:127| Text:'draft_email'
Sources:
1. analyze_evaluation_results_5dy7vfd0zcdk:0
Start: 130| End:136| Text:'245.75'
Sources:
1. analyze_evaluation_results_5dy7vfd0zcdk:0
Start: 141| End:154| Text:'extract_names'
Sources:
1. analyze_evaluation_results_5dy7vfd0zcdk:0
Start: 157| End:163| Text:'106.25'
Sources:
1. analyze_evaluation_results_5dy7vfd0zcdk:0
Start: 168| End:185| Text:'summarize_article'
Sources:
1. analyze_evaluation_results_5dy7vfd0zcdk:0
Start: 188| End:194| Text:'355.75'
Sources:
1. analyze_evaluation_results_5dy7vfd0zcdk:0

Summary

In this tutorial, we learned about:

  • How to create a function to execute Python code
  • How to set up a tool to interact with tabular data
  • How to build an agent for querying tabular data
  • How to run the agent

By implementing these techniques, we’ve expanded our agentic RAG system to handle structured data in the form of tables.

While this tutorial demonstrated how to work with tabular data using pandas and Python, the agentic RAG approach can be applied to other forms of structured data as well. This means we can build agents that can translate natural language queries into various data analysis tasks.

In Part 6, we’ll learn how to perform RAG over structured data (SQL).