Querying Structured Data (Tables)


In the previous tutorials, we explored how to build agentic RAG applications over unstructured and semi-structured data. In this tutorial and the next, we’ll turn our focus to structured data.

This tutorial focuses on querying tables, and the next tutorial will be about querying SQL databases.

Consider a scenario where you have a CSV file containing evaluation results for an LLM application.

A user might ask questions like “What’s the average score for a specific use case?” or “Which configuration has the lowest latency?”. These queries require not just retrieval, but also data analysis and interpretation.

In this tutorial, we’ll cover:

  • Creating a function to execute Python code
  • Setting up a tool to interact with tabular data
  • Building an agent for querying tabular data
  • Running the agent

Let’s get started by setting up our environment and defining the necessary tools for our agent.

Setup

To get started, first we need to install the cohere and pandas libraries and create a Cohere client.

PYTHON
! pip install cohere pandas -qq

PYTHON
import json
import os
import cohere

co = cohere.ClientV2(
    "COHERE_API_KEY"
)  # Get your free API key: https://dashboard.cohere.com/api-keys
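
Hard-coding the key works for a quick demo, but the client also accepts the key via the api_key argument, so you can read it from an environment variable instead (this assumes COHERE_API_KEY is set in your environment):

PYTHON
# Alternative: read the key from an environment variable
# (assumes COHERE_API_KEY is set; `os` is imported above)
co = cohere.ClientV2(api_key=os.environ["COHERE_API_KEY"])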

And here’s the data we’ll be working with. evaluation_results.csv is a CSV file containing evaluation results for a set of LLM applications - name extraction, email drafting, and article summarization.

The file has the following columns:

  • usecase: The use case.
  • run: The run ID.
  • score: The evaluation score for a particular run.
  • temperature: The temperature setting of the model for a particular run.
  • tokens: The number of tokens generated by the model for a particular run.
  • latency: The latency of the model’s response for a particular run.

IMPORTANT:

The data can be found here. Make sure to have the file in the same directory as this notebook for the code to run correctly.
PYTHON
import pandas as pd

df = pd.read_csv("evaluation_results.csv")

df.head()

|   | usecase           | run | score | temperature | tokens | latency |
|---|-------------------|-----|-------|-------------|--------|---------|
| 0 | extract_names     | A   | 0.5   | 0.3         | 103    | 1.12    |
| 1 | draft_email       | A   | 0.6   | 0.3         | 252    | 2.50    |
| 2 | summarize_article | A   | 0.8   | 0.3         | 350    | 4.20    |
| 3 | extract_names     | B   | 0.2   | 0.3         | 101    | 2.85    |
| 4 | draft_email       | B   | 0.4   | 0.3         | 230    | 3.20    |

Creating a function to execute Python code

Here, we introduce a new tool that allows the agent to execute Python code and return the result. The agent will use this tool to generate pandas code for querying the data.

To create this tool, we’ll use the PythonREPL class from the langchain_experimental.utilities module. This class executes Python code and returns the result. Note that it is not a true sandbox: as its own warning states, it can execute arbitrary code, so use it with caution.

First, we define a python_tool that uses the PythonREPL class to execute Python code and return the result.

Next, we define a ToolInput class to handle the input for the python_tool.

Finally, we create a function analyze_evaluation_results that takes a string of Python code as input, executes the code using the Python tool we created, and returns the result.

IMPORTANT:

The source code for tool definitions can be found here. Make sure to have the tool_def.py file in the same directory as this notebook for the imports to work correctly.
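
For reference, here’s a minimal sketch of what tool_def.py might look like, following the three steps described above. The linked file is the source of truth, so treat this as an illustration (the return key name, for instance, is our assumption):

PYTHON
# tool_def.py (illustrative sketch; see the linked source for the real file)
from langchain_experimental.utilities import PythonREPL
from pydantic import BaseModel, Field

# The REPL that will execute the agent-generated code
python_tool = PythonREPL()


class ToolInput(BaseModel):
    code: str = Field(description="Python code to execute.")


def analyze_evaluation_results(code: str) -> dict:
    # Validate the input, run the code, and return the captured stdout
    input_code = ToolInput(code=code)
    return {"python_answer": python_tool.run(input_code.code)}

# tool_def.py also defines `analyze_evaluation_results_tool`,
# the schema shown in the next section.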

PYTHON
from tool_def import (
    analyze_evaluation_results,
    analyze_evaluation_results_tool,
)

PYTHON
functions_map = {
    "analyze_evaluation_results": analyze_evaluation_results
}

Setting up a tool to interact with tabular data

Next, we define the analyze_evaluation_results tool. There are many ways we can set up a tool to work with CSV data, and in this example, we are using the tool description to provide the agent with the necessary context for working with the CSV file, such as:

  • the name of the CSV file to load
  • the columns of the CSV file
  • additional instructions on what libraries to use (in this case, pandas)

The parameter of this tool is the code string containing the Python code that the agent writes to analyze the data.

PYTHON
analyze_evaluation_results_tool = {
    "type": "function",
    "function": {
        "name": "analyze_evaluation_results",
        "description": "Generate Python code using the pandas library to analyze evaluation results from a dataframe called `evaluation_results`. The dataframe has columns 'usecase', 'run', 'score', 'temperature', 'tokens', and 'latency'. You must start with `import pandas as pd` and read a CSV file called `evaluation_results.csv` into the `evaluation_results` dataframe.",
        "parameters": {
            "type": "object",
            "properties": {
                "code": {
                    "type": "string",
                    "description": "Executable Python code",
                }
            },
            "required": ["code"],
        },
    },
}

PYTHON
tools = [analyze_evaluation_results_tool]
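
Before handing the tool to the agent, you can optionally sanity-check analyze_evaluation_results by calling it directly with a hand-written snippet. (The exact shape of the return value depends on how tool_def.py wraps the REPL output.)

PYTHON
# Optional sanity check: run a hand-written snippet through the tool.
# The return shape depends on how tool_def.py wraps the REPL output.
result = analyze_evaluation_results(
    code=(
        "import pandas as pd\n"
        "evaluation_results = pd.read_csv('evaluation_results.csv')\n"
        "print(evaluation_results['score'].mean())"
    )
)
print(result)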

Building an agent for querying tabular data

Next, let’s create a run_agent function to run the agentic RAG workflow, the same as in Part 1.

The only change we are making here is a simpler and more specific system message, since the agent now has only one tool.

PYTHON
system_message = """## Task and Context
You are an assistant who helps developers analyze LLM application evaluation results from a CSV file."""
PYTHON
model = "command-a-03-2025"


def run_agent(query, messages=None):
    if messages is None:
        messages = []

    if "system" not in {m.get("role") for m in messages}:
        messages.append({"role": "system", "content": system_message})

    # Step 1: Get the user message
    print(f"Question:\n{query}")
    print("=" * 50)

    messages.append({"role": "user", "content": query})

    # Step 2: Generate tool calls (if any)
    response = co.chat(
        model=model, messages=messages, tools=tools, temperature=0.3
    )

    while response.message.tool_calls:

        print("TOOL PLAN:")
        print(response.message.tool_plan, "\n")
        print("TOOL CALLS:")
        for tc in response.message.tool_calls:
            if tc.function.name == "analyze_evaluation_results":
                print(f"Tool name: {tc.function.name}")
                # Pretty-print the generated code, indented for readability
                print(
                    "\n".join(
                        f"  {line}"
                        for line in json.loads(tc.function.arguments)[
                            "code"
                        ].splitlines()
                    )
                )
            else:
                print(
                    f"Tool name: {tc.function.name} | Parameters: {tc.function.arguments}"
                )
        print("=" * 50)

        messages.append(
            {
                "role": "assistant",
                "tool_calls": response.message.tool_calls,
                "tool_plan": response.message.tool_plan,
            }
        )

        # Step 3: Get tool results
        for tc in response.message.tool_calls:
            tool_result = functions_map[tc.function.name](
                **json.loads(tc.function.arguments)
            )
            tool_content = [
                {
                    "type": "document",
                    "document": {"data": json.dumps(tool_result)},
                }
            ]

            messages.append(
                {
                    "role": "tool",
                    "tool_call_id": tc.id,
                    "content": tool_content,
                }
            )

        # Step 4: Generate response and citations
        response = co.chat(
            model=model,
            messages=messages,
            tools=tools,
            temperature=0.3,
        )

    messages.append(
        {
            "role": "assistant",
            "content": response.message.content[0].text,
        }
    )

    # Print final response
    print("RESPONSE:")
    print(response.message.content[0].text)
    print("=" * 50)

    # Print citations (if any)
    verbose_source = (
        False  # Change to True to display the contents of a source
    )
    if response.message.citations:
        print("CITATIONS:\n")
        for citation in response.message.citations:
            print(
                f"Start: {citation.start}| End:{citation.end}| Text:'{citation.text}' "
            )
            print("Sources:")
            for idx, source in enumerate(citation.sources):
                print(f"{idx+1}. {source.id}")
                if verbose_source:
                    print(f"{source.tool_output}")
            print("\n")

    return messages

Running the agent

Let’s ask the agent a few questions, starting with this query about the average evaluation score in run A.

To answer this query, the agent needs to write Python code that uses the pandas library to calculate the average evaluation score in run A. And it gets the answer right.

PYTHON
messages = run_agent("What's the average evaluation score in run A")
# Answer: 0.63

Question:
What's the average evaluation score in run A
==================================================


Python REPL can execute arbitrary code. Use with caution.


TOOL PLAN:
I will write and execute Python code to calculate the average evaluation score in run A.

TOOL CALLS:
Tool name: analyze_evaluation_results
  import pandas as pd

  df = pd.read_csv("evaluation_results.csv")

  # Calculate the average evaluation score in run A
  average_score_run_A = df[df["run"] == "A"]["score"].mean()

  print(f"Average evaluation score in run A: {average_score_run_A}")
==================================================
RESPONSE:
The average evaluation score in run A is 0.63.
==================================================
CITATIONS:

Start: 41| End:46| Text:'0.63.'
Sources:
1. analyze_evaluation_results_phqpwwat2hgf:0
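
If you’d like to double-check the agent’s arithmetic, the same computation is a one-liner on the df we loaded in the Setup section:

PYTHON
# Manual verification of the agent's answer (uses `df` from the Setup section)
print(round(df[df["run"] == "A"]["score"].mean(), 2))  # expected: 0.63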

Next, we ask a slightly more complex question, this time about the latency of the highest-scoring run for one use case. This requires the agent to filter based on the use case, find the highest-scoring run, and return the latency value.

PYTHON
messages = run_agent(
    "What's the latency of the highest-scoring run for the summarize_article use case?"
)
# Answer: 4.8

Question:
What's the latency of the highest-scoring run for the summarize_article use case?
==================================================
TOOL PLAN:
I will write Python code to find the latency of the highest-scoring run for the summarize_article use case.

TOOL CALLS:
Tool name: analyze_evaluation_results
  import pandas as pd

  df = pd.read_csv("evaluation_results.csv")

  # Filter for the summarize_article use case
  use_case_df = df[df["usecase"] == "summarize_article"]

  # Find the highest-scoring run
  highest_score_run = use_case_df.loc[use_case_df["score"].idxmax()]

  # Get the latency of the highest-scoring run
  latency = highest_score_run["latency"]

  print(f"Latency of the highest-scoring run: {latency}")
==================================================
RESPONSE:
The latency of the highest-scoring run for the summarize_article use case is 4.8.
==================================================
CITATIONS:

Start: 77| End:81| Text:'4.8.'
Sources:
1. analyze_evaluation_results_es3hnnnp5pey:0

Next, we ask the agent to compare the use cases in terms of token usage and to present the comparison in a markdown table.

PYTHON
messages = run_agent(
    "Which use case uses the least amount of tokens on average? Show the comparison of all use cases in a markdown table."
)
# Answer: extract_names (106.25), draft_email (245.75), summarize_article (355.75)

Question:
Which use case uses the least amount of tokens on average? Show the comparison of all use cases in a markdown table.
==================================================
TOOL PLAN:
I will use the analyze_evaluation_results tool to generate Python code to find the use case that uses the least amount of tokens on average. I will also generate code to create a markdown table to compare all use cases.

TOOL CALLS:
Tool name: analyze_evaluation_results
  import pandas as pd

  evaluation_results = pd.read_csv("evaluation_results.csv")

  # Group by 'usecase' and calculate the average tokens
  avg_tokens_by_usecase = evaluation_results.groupby('usecase')['tokens'].mean()

  # Find the use case with the least average tokens
  least_avg_tokens_usecase = avg_tokens_by_usecase.idxmin()

  print(f"Use case with the least average tokens: {least_avg_tokens_usecase}")

  # Create a markdown table comparing average tokens for all use cases
  markdown_table = avg_tokens_by_usecase.reset_index()
  markdown_table.columns = ["Use Case", "Average Tokens"]
  print(markdown_table.to_markdown(index=False))
==================================================
RESPONSE:
The use case that uses the least amount of tokens on average is extract_names.

Here is a markdown table comparing the average tokens for all use cases:

| Use Case          | Average Tokens |
|:------------------|---------------:|
| draft_email       |         245.75 |
| extract_names     |         106.25 |
| summarize_article |         355.75 |
==================================================
CITATIONS:

Start: 64| End:78| Text:'extract_names.'
Sources:
1. analyze_evaluation_results_zp68h5304e3v:0

Start: 156| End:164| Text:'Use Case'
Sources:
1. analyze_evaluation_results_zp68h5304e3v:0

Start: 167| End:181| Text:'Average Tokens'
Sources:
1. analyze_evaluation_results_zp68h5304e3v:0

Start: 248| End:259| Text:'draft_email'
Sources:
1. analyze_evaluation_results_zp68h5304e3v:0

Start: 262| End:268| Text:'245.75'
Sources:
1. analyze_evaluation_results_zp68h5304e3v:0

Start: 273| End:286| Text:'extract_names'
Sources:
1. analyze_evaluation_results_zp68h5304e3v:0

Start: 289| End:295| Text:'106.25'
Sources:
1. analyze_evaluation_results_zp68h5304e3v:0

Start: 300| End:317| Text:'summarize_article'
Sources:
1. analyze_evaluation_results_zp68h5304e3v:0

Start: 320| End:326| Text:'355.75'
Sources:
1. analyze_evaluation_results_zp68h5304e3v:0
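
Because run_agent returns the accumulated messages list, you can pass it back in to ask a follow-up question that builds on the conversation so far. For example (the agent’s exact output will vary between runs, and the question below is just an illustration):

PYTHON
# Follow-up question reusing the conversation state returned above
messages = run_agent(
    "And which use case has the highest average score?", messages
)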

Summary

In this tutorial, we learned about:

  • How to create a function to execute Python code
  • How to set up a tool to interact with tabular data
  • How to build an agent for querying tabular data
  • How to run the agent

By implementing these techniques, we’ve expanded our agentic RAG system to handle structured data in the form of tables.

While this tutorial demonstrated how to work with tabular data using pandas and Python, the agentic RAG approach can be applied to other forms of structured data as well. This means we can build agents that can translate natural language queries into various types of data analysis tasks.

In Part 6, we’ll learn how to do structured query generation for SQL databases.
