Querying Structured Data (SQL)
In the previous tutorial, we explored how agentic RAG can handle complex queries on structured data in the form of tables using pandas. Now, we’ll see how we can do the same for SQL databases.
Consider a scenario similar to the previous tutorial where we have evaluation results for an LLM application. However, instead of a CSV file, this data is now stored in a SQLite database. Users might still ask questions like “What’s the average score for a specific use case?” or “Which configuration has the lowest latency?”, but now we’ll answer these using SQL queries instead of pandas operations.
In this tutorial, we’ll cover:
- Setting up a SQLite database
- Creating a function to execute SQL queries
- Building an agent for querying SQL databases
- Running the agent with various types of queries
By implementing these techniques, we’ll expand our agentic RAG system to handle structured data in SQL databases, complementing our previous work with tabular data in pandas.
Setup
To get started, first we need to install the cohere
library and create a Cohere client.
Creating a SQLite database
Next, we’ll create a SQLite database to store our evaluation results. SQLite is a lightweight, serverless database engine that’s perfect for small to medium-sized applications. Here’s what we’re going to do:
- Create a new SQLite database file named
evaluation_results.db
. - Create a table called
evaluation_results
with columns forusecase
,run
,score
,temperature
,tokens
, andlatency
. - Insert sample data into the table to simulate our evaluation results.
Creating a function to query a SQL database
Next, we’ll define a function called sql_table_query
that allows us to execute SQL queries on our evaluation_results database.
This function will enable us to retrieve and analyze data from our evaluation_results table, allowing for dynamic querying based on our specific needs.
We can test the function by running a simple query:
Setting up a tool to interact with the database
Next, we’ll create a tool that will allow the agent to interact with the SQLite database containing our evaluation results.
Building an agent for querying SQL data
Next, let’s create a run_agent
function to run the agentic RAG workflow, the same as in Part 1.
The only change we are making here is to make the system message more specific and describe the database schema to the agent.
Running the agent
Let’s now ask the agent the same set of questions we asked in the previous chapter. While the previous chapter translates the questions into pandas Python code, this time the agent will be using SQL queries.
Summary
In this tutorial, we learned about:
- How to set up a SQLite database for structured data
- How to create a function to execute SQL queries
- How to build an agent for querying the database
- How to run the agent
By implementing these techniques, we’ve further expanded our agentic RAG system to handle structured data in the form of SQL databases. This allows for more powerful and flexible querying capabilities, especially when dealing with large datasets or complex relationships between data.
This tutorial completes our exploration of structured data handling in the agentic RAG system, covering both tabular data (using pandas) and relational databases (using SQL). These capabilities significantly enhance the system’s ability to work with diverse data formats and structures.