Agentic RAG for PDFs with mixed data

Shaan Desai

Motivation

Retrieval-augmented generation (RAG) allows language models to generate grounded answers to questions about documents. However, the complexity of the documents can significantly influence overall RAG performance. For instance, the documents may be PDFs that contain a mix of text and tables.

More broadly, the implementation of a RAG pipeline - including parsing and chunking of documents, along with the embedding and retrieval of the chunks - is critical to the accuracy of grounded answers. Additionally, it is sometimes not sufficient to merely retrieve the answers; a user may want further postprocessing performed on the output. This use case would benefit from giving the model access to tools.

Objective

In this notebook, we will guide you through best practices for setting up a RAG pipeline to process documents that contain both tables and text. We will also demonstrate how to create a ReAct agent with a Cohere model, and then give the agent access to a RAG pipeline tool to improve accuracy. The general structure of the notebook is as follows:

  • individual components around parsing, retrieval and generation are covered for documents with mixed tabular and textual data
  • a class object is created that can be used to instantiate the pipeline with parametric input
  • the RAG pipeline is then used as a tool for a Cohere ReAct agent

Reference Documents

We recommend the following notebook as a guide to semi-structured RAG.

We also recommend the following notebook to explore various parsing techniques for PDFs.

Various LangChain-supported parsers can be found here.

Install Dependencies

PYTHON
# there may be other dependencies that will need installation
# !pip install --quiet langchain langchain_cohere langchain_experimental
# !pip install --quiet faiss-cpu tiktoken
# !pip install pypdf
# !pip install pytesseract
# !pip install opencv-python --upgrade
# !pip install "unstructured[all-docs]"
# !pip install chromadb
PYTHON
# Imports
import os
import json
import uuid
from typing import Any

import cohere
import pandas as pd
from datasets import load_dataset
from joblib import Parallel, delayed
from pydantic import BaseModel
from unstructured.partition.pdf import partition_pdf

from langchain.retrievers.multi_vector import MultiVectorRetriever
from langchain.storage import InMemoryStore
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_cohere import CohereEmbeddings
from langchain_community.document_loaders import PyPDFLoader, WebBaseLoader
from langchain_community.vectorstores import FAISS, Chroma
from langchain_core.documents import Document

# Set your Cohere API key here
os.environ['COHERE_API_KEY'] = ""

Parsing

To improve RAG performance on PDFs with mixed types (text and tables), we investigated a number of parsing and chunking strategies from various libraries.

We have found that the best option for parsing is unstructured.io since the parser can:

  • separate tables from text
  • automatically chunk the tables and text by title during the parsing step so that similar elements are grouped
PYTHON
# UNSTRUCTURED pdf loader
# Get elements
raw_pdf_elements = partition_pdf(
    filename="city_ny_popular_fin_report.pdf",
    # Unstructured first finds embedded image blocks
    extract_images_in_pdf=False,
    # Use layout model (YOLOX) to get bounding boxes (for tables) and find titles
    # Titles are any sub-section of the document
    infer_table_structure=True,
    # Post processing to aggregate text once we have the title
    chunking_strategy="by_title",
    # Chunking params to aggregate text blocks
    # Hard limit of 4000 chars per chunk
    # Attempt to start a new chunk after 3800 chars
    # Attempt to keep chunks > 2000 chars by combining small sections
    max_characters=4000,
    new_after_n_chars=3800,
    combine_text_under_n_chars=2000,
    image_output_dir_path='.',
)
PYTHON
# extract table and textual objects from parser
class Element(BaseModel):
    type: str
    text: Any

# Categorize by type
categorized_elements = []
for element in raw_pdf_elements:
    if "unstructured.documents.elements.Table" in str(type(element)):
        categorized_elements.append(Element(type="table", text=str(element)))
    elif "unstructured.documents.elements.CompositeElement" in str(type(element)):
        categorized_elements.append(Element(type="text", text=str(element)))

# Tables
table_elements = [e for e in categorized_elements if e.type == "table"]
print(len(table_elements))

# Text
text_elements = [e for e in categorized_elements if e.type == "text"]
print(len(text_elements))
Output
14
24

Vector Store Setup

There are many options for setting up a vector store. Here, we show how to do so using Chroma and LangChain's multi-vector retrieval. As the name implies, multi-vector retrieval allows us to store multiple vectors per document; for instance, for a single document chunk, one could keep embeddings for both the chunk itself and a summary of that chunk. A summary may distill more accurately what a chunk is about, leading to better retrieval.

You can read more about this here: https://python.langchain.com/docs/modules/data_connection/retrievers/multi_vector/

Below, we demonstrate the following process:

  • summaries of each chunk are embedded
  • during inference, the multi-vector retrieval returns the full context document related to the summary
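
The summary-to-parent lookup can be illustrated without any external service. The following is a minimal sketch, not LangChain's implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, `ToyMultiVectorRetriever` is a hypothetical class, and the second chunk is placeholder text.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyMultiVectorRetriever:
    """Hypothetical illustration: search over summaries, return full chunks."""

    def __init__(self):
        self.summary_index = []  # list of (summary_vector, doc_id)
        self.docstore = {}       # doc_id -> full parent chunk

    def add(self, doc_id, summary, full_chunk):
        self.summary_index.append((embed(summary), doc_id))
        self.docstore[doc_id] = full_chunk

    def invoke(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.summary_index, key=lambda item: cosine(q, item[0]), reverse=True)
        return [self.docstore[doc_id] for _, doc_id in ranked[:k]]


toy_retriever = ToyMultiVectorRetriever()
toy_retriever.add(
    "table_0",
    "Table of program and general revenues by fiscal year",
    "Charges for Services (CS) $5,769 $5,266 $5,669",
)
toy_retriever.add(
    "text_0",
    "Narrative section about an unrelated topic",
    "Placeholder narrative text.",
)

# The query matches the summary, but the full parent chunk is returned.
print(toy_retriever.invoke("revenues by fiscal year"))
```

The query never touches the parent chunk's embedding; only the summary is searched, and the shared `doc_id` links the hit back to the full chunk.
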
PYTHON
# Cohere client used for chat, query generation, and rerank calls
co = cohere.Client()

def get_chat_output(message, preamble, chat_history, model, temp, documents=None):
    """Call the chat endpoint once and return only the generated text."""
    return co.chat(
        message=message,
        preamble=preamble,
        chat_history=chat_history,
        documents=documents,
        model=model,
        temperature=temp,
    ).text

def parallel_proc_chat(prompts, preamble, chat_history=None, model='command-r-plus', temp=0.1, n_jobs=10):
    """Parallel processing of chat endpoint calls."""
    responses = Parallel(n_jobs=n_jobs, prefer="threads")(
        delayed(get_chat_output)(prompt, preamble, chat_history, model, temp) for prompt in prompts
    )
    return responses

def rerank_cohere(query, returned_documents, model: str = "rerank-multilingual-v3.0", top_n: int = 3):
    """Rerank the retrieved chunks against the query and return the text of the top_n chunks."""
    response = co.rerank(
        query=query,
        documents=returned_documents,
        top_n=top_n,
        model=model,
        return_documents=True,
    )
    top_chunks_after_rerank = [result.document.text for result in response.results]
    return top_chunks_after_rerank
PYTHON
# generate table and text summaries
prompt_text = """You are an assistant tasked with summarizing tables and text. \
Give a concise summary of the table or text. Table or text chunk: {element}. Only provide the summary and no other text."""

table_prompts = [prompt_text.format(element=i.text) for i in table_elements]
table_summaries = parallel_proc_chat(table_prompts, None)
text_prompts = [prompt_text.format(element=i.text) for i in text_elements]
text_summaries = parallel_proc_chat(text_prompts, None)
tables = [i.text for i in table_elements]
texts = [i.text for i in text_elements]
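
Because each summary is later matched to its chunk by list position, the parallel calls must return results in input order. The sketch below illustrates that property with the standard library's `ThreadPoolExecutor` rather than joblib; `fake_summarize` and `parallel_map` are hypothetical stand-ins, and no chat endpoint is called.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_summarize(prompt):
    """Hypothetical stand-in for a chat call; returns a truncated prompt."""
    return prompt[:12]


def parallel_map(fn, items, n_jobs=10):
    """Run fn over items in a thread pool, returning results in input order."""
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        return list(pool.map(fn, items))


toy_chunks = ["first chunk of text", "second chunk of text", "third chunk of text"]
toy_summaries = parallel_map(fake_summarize, toy_chunks)

# Each summary lines up with the chunk at the same index.
for chunk, summary in zip(toy_chunks, toy_summaries):
    print(f"{summary!r} <- {chunk!r}")
```

Threads suit this workload because each task spends most of its time waiting on a network response.
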
PYTHON
# The vectorstore to use to index the child chunks
vectorstore = Chroma(collection_name="summaries", embedding_function=CohereEmbeddings())
# The storage layer for the parent documents
store = InMemoryStore()
id_key = "doc_id"
# The retriever (empty to start)
retriever = MultiVectorRetriever(
    vectorstore=vectorstore,
    docstore=store,
    id_key=id_key,
)
# Add texts
doc_ids = [str(uuid.uuid4()) for _ in texts]
summary_texts = [
    Document(page_content=s, metadata={id_key: doc_ids[i]})
    for i, s in enumerate(text_summaries)
]
retriever.vectorstore.add_documents(summary_texts)
retriever.docstore.mset(list(zip(doc_ids, texts)))
# Add tables
table_ids = [str(uuid.uuid4()) for _ in tables]
summary_tables = [
    Document(page_content=s, metadata={id_key: table_ids[i]})
    for i, s in enumerate(table_summaries)
]
retriever.vectorstore.add_documents(summary_tables)
retriever.docstore.mset(list(zip(table_ids, tables)))

RAG Pipeline

With our database in place, we can run queries against it. The query process can be broken down into the following steps:

  • augment the query into one or more search queries, which helps retrieve all the relevant information
  • use each augmented query to retrieve the top k docs and then rerank them
  • concatenate all the shortlisted/reranked docs and pass them to the generation model
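
The steps above can be sketched with toy stand-ins for the retriever and reranker. This is an illustration under stated assumptions, not the notebook's implementation: `retrieve_and_rerank`, `toy_retrieve`, and `toy_rerank` are hypothetical names, the corpus strings are made up, and the de-duplication step is an optional refinement that the notebook code does not perform.

```python
def retrieve_and_rerank(search_queries, retrieve, rerank, top_n=3):
    """Retrieve per query, keep the top_n reranked chunks, and build the documents payload."""
    shortlisted = []
    for q in search_queries:
        candidates = retrieve(q)
        shortlisted.extend(rerank(q, candidates)[:top_n])
    # Optional refinement (an assumption, not in the notebook):
    # drop exact duplicates while preserving order.
    unique = list(dict.fromkeys(shortlisted))
    return [{"title": f"chunk {i}", "snippet": text} for i, text in enumerate(unique)]


# Toy stand-ins: the "retriever" returns the whole corpus and the
# "reranker" sorts by word overlap with the query.
toy_corpus = ["charges for services table", "pension narrative", "sales tax table"]


def toy_retrieve(query):
    return list(toy_corpus)


def toy_rerank(query, docs):
    words = set(query.split())
    return sorted(docs, key=lambda d: len(words & set(d.split())), reverse=True)


toy_documents = retrieve_and_rerank(
    ["charges for services", "services charges 2022"],
    toy_retrieve,
    toy_rerank,
    top_n=1,
)
print(toy_documents)
```

Both augmented queries shortlist the same chunk here, so the payload passed to generation contains it only once.
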
PYTHON
def process_query(query, retriever):
    """Runs query augmentation, retrieval, rerank and final generation in one call."""
    # augment queries
    augmented_queries = co.chat(message=query, model='command-r-plus', temperature=0.2, search_queries_only=True)
    if augmented_queries.search_queries:
        reranked_docs = []
        for itm in augmented_queries.search_queries:
            docs = retriever.invoke(itm.text)
            temp_rerank = rerank_cohere(itm.text, docs)
            reranked_docs.extend(temp_rerank)
        documents = [{"title": f"chunk {i}", "snippet": doc} for i, doc in enumerate(reranked_docs)]
    else:
        # no queries will be run through RAG
        documents = None

    preamble = """
## Task & Context
You help people answer their questions and other requests interactively. You will be asked a very wide array of requests on all kinds of topics. You will be equipped with a wide range of search engines or similar tools to help you, which you use to research your answer. You should focus on serving the user's needs as best you can, which will be wide-ranging.

## Style Guide
Unless the user asks for a different style of answer, you should answer in full sentences, using proper grammar and spelling.
"""
    model = 'command-r-plus'
    temp = 0.2

    response = co.chat(
        message=query,
        documents=documents,
        preamble=preamble,
        model=model,
        temperature=temp,
    )

    final_answer_docs = """The final answer is from the documents below:

    {docs}""".format(docs=str(response.documents))

    final_answer = response.text
    return final_answer, final_answer_docs

Example

We can now test out a query. In this example, the final answer can be found on page 12 of the PDF, which aligns with the response provided by the model:

PYTHON
query = "what are the charges for services in 2022"
final_answer, final_answer_docs = process_query(query, retriever)
print(final_answer)
print(final_answer_docs)

# Save this exchange so that follow-up questions can refer to it
chat_history = [
    {'role': "USER", 'message': query},
    {'role': "CHATBOT", 'message': f'The final answer is: {final_answer}.' + final_answer_docs},
]
Output
The charges for services in 2022 were $5,266 million.
The final answer is from the documents below:
[{'id': 'doc_0', 'snippet': 'Program and General Revenues FY 2023 FY 2022 FY 2021 Category (in millions) Charges for Services (CS) $5,769 $5,266 $5,669 Operating Grants and Contributions (OGC) 27,935 31,757 28,109 Capital Grants and Contributions (CGC) 657 656 675 Real Estate Taxes (RET) 31,502 29,507 31,421 Sales and Use Taxes (SUT) 10,577 10,106 7,614 Personal Income Taxes (PIT) 15,313 15,520 15,795 Income Taxes, Other (ITO) 13,181 9,521 9,499 Other Taxes* (OT) 3,680 3,777 2,755 Investment Income* (II) 694 151 226 Unrestricted Federal and State Aid (UFSA) 234 549 108 Other* (O) Total Program and General Revenues - Primary Government 2,305 $110,250 $107,535 $104,176 708 725', 'title': 'chunk 0'}]

Chat History Management

In the example below, we ask a follow up question that relies on the chat history, but does not require a rerun of the RAG pipeline.

We detect questions that do not require RAG by examining the search_queries field returned when co.chat is called with search_queries_only=True. If this field is empty, then the model has determined that a document query is not needed to answer the question.

In the example below, the else statement is invoked based on query2. We still pass in the chat history, allowing the question to be answered with only the prior context.
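
The routing decision and the chat history format can be isolated into two small helpers. This is a minimal sketch: `route` and `history_turn` are hypothetical names, and the example values are taken from the output shown earlier, with a placeholder string standing in for the retrieved documents.

```python
def route(search_queries):
    """Return 'rag' when the model proposed search queries, otherwise 'chat'."""
    return "rag" if search_queries else "chat"


def history_turn(query, final_answer, final_answer_docs):
    """Build the two chat_history entries that record one question-answer exchange."""
    return [
        {"role": "USER", "message": query},
        {"role": "CHATBOT", "message": f"The final answer is: {final_answer}." + final_answer_docs},
    ]


toy_history = history_turn(
    "what are the charges for services in 2022",
    "The charges for services in 2022 were $5,266 million.",
    " (documents omitted in this sketch)",
)

print(route([]))                      # chat
print(route(["charges for services"]))  # rag
print(toy_history[0]["role"], toy_history[1]["role"])
```

An empty list of search queries is falsy in Python, so the same truthiness check used in the notebook works for both branches.
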

PYTHON
query2 = 'divide this by two'
augmented_queries = co.chat(message=query2, model='command-r-plus', temperature=0.2, search_queries_only=True)
if augmented_queries.search_queries:
    print('RAG is needed')
    # run the new question (query2) through the RAG pipeline
    final_answer, final_answer_docs = process_query(query2, retriever)
    print(final_answer)
else:
    print('RAG is not needed')
    # answer from the prior conversation only
    response = co.chat(
        message=query2,
        model='command-r-plus',
        chat_history=chat_history,
        temperature=0.3,
    )

    print("Final answer:")
    print(response.text)
Output
RAG is not needed
Final answer:
The result of dividing the charges for services in 2022 by two is $2,633.

RAG Pipeline Class

Here, we connect all of the pieces discussed above into one class object, which is then used as a tool for a Cohere ReAct agent. This class definition consolidates and clarifies the key parameters used to define the RAG pipeline.

PYTHON
co = cohere.Client()
PYTHON
class Element(BaseModel):
    type: str
    text: Any

class RAG_pipeline():
    def __init__(self, paths):
        self.embedding_model = "embed-english-v3.0"
        self.generation_model = "command-r-plus"
        self.summary_model = "command-r-plus"
        self.rerank_model = "rerank-multilingual-v3.0"
        self.num_docs_to_retrieve = 10
        self.top_k_rerank = 3
        self.temperature = 0.2
        self.preamble = """
## Task & Context
You help people answer their questions and other requests interactively. You will be asked a very wide array of requests on all kinds of topics. You will be equipped with a wide range of search engines or similar tools to help you, which you use to research your answer. You should focus on serving the user's needs as best you can, which will be wide-ranging.

## Style Guide
Unless the user asks for a different style of answer, you should answer in full sentences, using proper grammar and spelling.
"""
        self.n_jobs = 10  # number of parallel threads used to summarize chunks
        self.extract_images_in_pdf = False
        self.infer_table_structure = True
        self.chunking_strategy = "by_title"
        self.max_characters = 4000
        self.new_after_n_chars = 3800
        self.combine_text_under_n_chars = 2000
        self.image_output_dir_path = '.'
        self.paths = paths
        self.parse_and_build_retriever()

    def parse_and_build_retriever(self):
        # step 1: parse pdfs
        parsed_pdf_list = self.parse_pdfs(self.paths)
        # separate tables and text
        extracted_tables, extracted_text = self.extract_text_and_tables(parsed_pdf_list)
        # generate summaries for everything
        tables, table_summaries, texts, text_summaries = self.generate_summaries(extracted_tables, extracted_text)
        self.tables = tables
        self.table_summaries = table_summaries
        self.texts = texts
        self.text_summaries = text_summaries
        # set up the multi-vector retriever
        self.make_retriever(tables, table_summaries, texts, text_summaries)

    def extract_text_and_tables(self, parsed_pdf_list):
        # extract table and textual objects from parser
        # Categorize by type
        all_table_elements = []
        all_text_elements = []
        for raw_pdf_elements in parsed_pdf_list:
            categorized_elements = []
            for element in raw_pdf_elements:
                if "unstructured.documents.elements.Table" in str(type(element)):
                    categorized_elements.append(Element(type="table", text=str(element)))
                elif "unstructured.documents.elements.CompositeElement" in str(type(element)):
                    categorized_elements.append(Element(type="text", text=str(element)))

            # Tables
            table_elements = [e for e in categorized_elements if e.type == "table"]
            print(len(table_elements))

            # Text
            text_elements = [e for e in categorized_elements if e.type == "text"]
            print(len(text_elements))
            all_table_elements.extend(table_elements)
            all_text_elements.extend(text_elements)

        return all_table_elements, all_text_elements

    def parse_pdfs(self, paths):
        path_raw_elements = []
        for path in paths:
            raw_pdf_elements = partition_pdf(
                filename=path,
                # Unstructured first finds embedded image blocks
                extract_images_in_pdf=self.extract_images_in_pdf,
                # Use layout model (YOLOX) to get bounding boxes (for tables) and find titles
                # Titles are any sub-section of the document
                infer_table_structure=self.infer_table_structure,
                # Post processing to aggregate text once we have the title
                chunking_strategy=self.chunking_strategy,
                # Chunking params to aggregate text blocks
                # Attempt to start a new chunk after 3800 chars
                # Attempt to keep chunks > 2000 chars by combining small sections
                max_characters=self.max_characters,
                new_after_n_chars=self.new_after_n_chars,
                combine_text_under_n_chars=self.combine_text_under_n_chars,
                image_output_dir_path=self.image_output_dir_path,
            )
            path_raw_elements.append(raw_pdf_elements)
        print('PDFs parsed')
        return path_raw_elements

    def get_chat_output(self, message, preamble, model, temp):
        response = co.chat(
            message=message,
            preamble=preamble,
            model=model,
            temperature=temp,
        ).text
        return response

    def parallel_proc_chat(self, prompts, preamble, model, temp, n_jobs):
        """Parallel processing of chat endpoint calls."""
        responses = Parallel(n_jobs=n_jobs, prefer="threads")(
            delayed(self.get_chat_output)(prompt, preamble, model, temp) for prompt in prompts
        )
        return responses

    def rerank_cohere(self, query, returned_documents, model, top_n):
        response = co.rerank(
            query=query,
            documents=returned_documents,
            top_n=top_n,
            model=model,
            return_documents=True,
        )
        top_chunks_after_rerank = [result.document.text for result in response.results]
        return top_chunks_after_rerank

    def generate_summaries(self, table_elements, text_elements):
        # generate table and text summaries
        summarize_prompt = (
            "You are an assistant tasked with summarizing tables and text. "
            "Give a concise summary of the table or text. Table or text chunk: {element}. "
            "Only provide the summary and no other text."
        )

        table_prompts = [summarize_prompt.format(element=i.text) for i in table_elements]
        table_summaries = self.parallel_proc_chat(table_prompts, self.preamble, self.summary_model, self.temperature, self.n_jobs)
        text_prompts = [summarize_prompt.format(element=i.text) for i in text_elements]
        text_summaries = self.parallel_proc_chat(text_prompts, self.preamble, self.summary_model, self.temperature, self.n_jobs)
        tables = [i.text for i in table_elements]
        texts = [i.text for i in text_elements]
        print('summaries generated')
        return tables, table_summaries, texts, text_summaries

    def make_retriever(self, tables, table_summaries, texts, text_summaries):
        # The vectorstore to use to index the child chunks
        # (uses the embedding model configured in __init__)
        vectorstore = Chroma(
            collection_name="summaries",
            embedding_function=CohereEmbeddings(model=self.embedding_model),
        )
        # The storage layer for the parent documents
        store = InMemoryStore()
        id_key = "doc_id"
        # The retriever (empty to start)
        retriever = MultiVectorRetriever(
            vectorstore=vectorstore,
            docstore=store,
            id_key=id_key,
            search_kwargs={"k": self.num_docs_to_retrieve},
        )
        # Add texts
        doc_ids = [f'text_{i}' for i in range(len(texts))]
        summary_texts = [
            Document(page_content=s, metadata={id_key: doc_ids[i]})
            for i, s in enumerate(text_summaries)
        ]
        retriever.vectorstore.add_documents(summary_texts, ids=doc_ids)
        retriever.docstore.mset(list(zip(doc_ids, texts)))
        # Add tables (one id per table, so the ids line up with the table summaries)
        table_ids = [f'table_{i}' for i in range(len(tables))]
        summary_tables = [
            Document(page_content=s, metadata={id_key: table_ids[i]})
            for i, s in enumerate(table_summaries)
        ]
        retriever.vectorstore.add_documents(summary_tables, ids=table_ids)
        retriever.docstore.mset(list(zip(table_ids, tables)))
        self.retriever = retriever
        print('retriever built')

    def process_query(self, query):
        """Runs query augmentation, retrieval, rerank and generation in one call."""
        # augment queries
        augmented_queries = co.chat(message=query, model=self.generation_model, temperature=self.temperature, search_queries_only=True)
        if augmented_queries.search_queries:
            reranked_docs = []
            for itm in augmented_queries.search_queries:
                docs = self.retriever.invoke(itm.text)
                temp_rerank = self.rerank_cohere(itm.text, docs, model=self.rerank_model, top_n=self.top_k_rerank)
                reranked_docs.extend(temp_rerank)
            documents = [{"title": f"chunk {i}", "snippet": doc} for i, doc in enumerate(reranked_docs)]
        else:
            documents = None

        response = co.chat(
            message=query,
            documents=documents,
            preamble=self.preamble,
            model=self.generation_model,
            temperature=self.temperature,
        )

        final_answer_docs = """The final answer is from the documents below:

        {docs}""".format(docs=str(response.documents))

        final_answer = response.text
        return final_answer, final_answer_docs
PYTHON
rag_object = RAG_pipeline(paths=["city_ny_popular_fin_report.pdf"])

This function will be deprecated in a future release and unstructured will simply use the DEFAULT_MODEL from unstructured_inference.model.base to set default model name

Output
PDFs parsed
14
24
summaries generated
retriever built

Cohere ReAct Agent with RAG Tool

Finally, we build a simple agent that utilizes the RAG pipeline defined above. We do this by granting the agent access to two tools:

  • the end-to-end RAG pipeline
  • a Python interpreter

The intention behind coupling these tools is to enable the model to perform mathematical and other postprocessing operations on RAG outputs using Python.
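
A Python interpreter tool runs whatever code the model writes, so it should be used with caution. Where only arithmetic postprocessing is needed, a restricted evaluator is one possible alternative. The sketch below is an assumption-laden illustration, not part of the notebook or of LangChain: `safe_arithmetic` is a hypothetical helper that accepts only numbers and the `+`, `-`, `*`, `/` operators.

```python
import ast
import operator

# Operators the evaluator is willing to apply; everything else is rejected.
_ALLOWED_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}


def safe_arithmetic(expression: str):
    """Evaluate a plain arithmetic expression without executing arbitrary code."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _ALLOWED_OPS:
            return _ALLOWED_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _ALLOWED_OPS:
            return _ALLOWED_OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval"))


print(safe_arithmetic("(5266 + 5769) / 2"))

# Anything other than arithmetic, such as a function call, is rejected.
try:
    safe_arithmetic("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

The trade-off is flexibility: the full interpreter can import libraries and run multi-step code, while this evaluator handles only single expressions.
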

PYTHON
from langchain.agents import Tool
from langchain_experimental.utilities import PythonREPL
from langchain.agents import AgentExecutor
from langchain_cohere.react_multi_hop.agent import create_cohere_react_agent
from langchain_core.prompts import ChatPromptTemplate
from langchain_cohere.chat_models import ChatCohere
from langchain.tools.retriever import create_retriever_tool
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.tools import tool

class react_agent():
    def __init__(self, rag_retriever, model="command-r-plus", temperature=0.2):
        self.llm = ChatCohere(model=model, temperature=temperature)
        self.preamble = """
## Task & Context
You help people answer their questions and other requests interactively. You will be asked a very wide array of requests on all kinds of topics. You will be equipped with a wide range of search engines or similar tools to help you, which you use to research your answer. You should focus on serving the user's needs as best you can, which will be wide-ranging.

## Style Guide
Unless the user asks for a different style of answer, you should answer in full sentences, using proper grammar and spelling.

## Guidelines
You are an expert who answers the user's question.
You have access to a vectorsearch tool that will use your query to search through documents and find the relevant answer.
You also have access to a python interpreter tool which you can use to run code for mathematical operations.
"""
        self.get_tools(rag_retriever)
        self.build_agent()

    def get_tools(self, rag_retriever):
        # Tool 1: the end-to-end RAG pipeline
        @tool
        def vectorsearch(query: str):
            """Uses the query to search through a list of documents and return the most relevant documents as well as the answer."""
            final_answer, final_answer_docs = rag_retriever.process_query(query)
            return final_answer + final_answer_docs
        vectorsearch.name = "vectorsearch"  # use python case
        vectorsearch.description = "Uses the query to search through a list of documents and return the most relevant documents as well as the answer."
        class vectorsearch_inputs(BaseModel):
            query: str = Field(description="the users query")
        vectorsearch.args_schema = vectorsearch_inputs

        # Tool 2: a Python interpreter for postprocessing
        python_repl = PythonREPL()
        python_tool = Tool(
            name="python_interpreter",
            description="Executes python code and returns the result. The code runs in a static sandbox without interactive mode, so print output or save output to a file.",
            func=python_repl.run,
        )
        class ToolInput(BaseModel):
            code: str = Field(description="Python code to execute.")
        python_tool.args_schema = ToolInput

        self.alltools = [vectorsearch, python_tool]

    def build_agent(self):
        # Prompt template
        prompt = ChatPromptTemplate.from_template("{input}")
        # Create the ReAct agent
        agent = create_cohere_react_agent(
            llm=self.llm,
            tools=self.alltools,
            prompt=prompt,
        )
        self.agent_executor = AgentExecutor(agent=agent, tools=self.alltools, verbose=True, return_intermediate_steps=True)

    def run_agent(self, query, history=None):
        # Only pass chat_history when there is a prior conversation
        if history:
            response = self.agent_executor.invoke({
                "input": query,
                "preamble": self.preamble,
                "chat_history": history,
            })
        else:
            response = self.agent_executor.invoke({
                "input": query,
                "preamble": self.preamble,
            })
        return response
PYTHON
agent_object = react_agent(rag_retriever=rag_object)
PYTHON
step1_response = agent_object.run_agent("what are the charges for services in 2022 and 2023")
Output
> Entering new AgentExecutor chain...

I will search for the charges for services in 2022 and 2023.
{'tool_name': 'vectorsearch', 'parameters': {'query': 'charges for services in 2022 and 2023'}}
The charges for services in 2022 were $5,266 million and in 2023 were $5,769 million.The final answer is from the documents below:
[{'id': 'doc_0', 'snippet': 'Program and General Revenues FY 2023 FY 2022 FY 2021 Category (in millions) Charges for Services (CS) $5,769 $5,266 $5,669 Operating Grants and Contributions (OGC) 27,935 31,757 28,109 Capital Grants and Contributions (CGC) 657 656 675 Real Estate Taxes (RET) 31,502 29,507 31,421 Sales and Use Taxes (SUT) 10,577 10,106 7,614 Personal Income Taxes (PIT) 15,313 15,520 15,795 Income Taxes, Other (ITO) 13,181 9,521 9,499 Other Taxes* (OT) 3,680 3,777 2,755 Investment Income* (II) 694 151 226 Unrestricted Federal and State Aid (UFSA) 234 549 108 Other* (O) Total Program and General Revenues - Primary Government 2,305 $110,250 $107,535 $104,176 708 725', 'title': 'chunk 0'}]Relevant Documents: 0
Cited Documents: 0
Answer: The charges for services in 2022 were $5,266 million and in 2023 were $5,769 million.
Grounded answer: The charges for services in <co: 0>2022</co: 0> were <co: 0>$5,266 million</co: 0> and in <co: 0>2023</co: 0> were <co: 0>$5,769 million</co: 0>.
> Finished chain.

Just like earlier, we can also pass chat history to the LangChain agent to refer to for any other queries.

PYTHON
from langchain_core.messages import HumanMessage, AIMessage
PYTHON
chat_history = [
    HumanMessage(content=step1_response['input']),
    AIMessage(content=step1_response['output']),
]
PYTHON
agent_object.run_agent("what is the mean of the two values", history=chat_history)
Output
> Entering new AgentExecutor chain...
Python REPL can execute arbitrary code. Use with caution.

I will use the Python Interpreter tool to calculate the mean of the two values.
{'tool_name': 'python_interpreter', 'parameters': {'code': 'import numpy as np\n\n# Data\nvalues = [5266, 5769]\n\n# Calculate the mean\nmean_value = np.mean(values)\n\nprint(f"The mean of the two values is: {mean_value:.0f} million")'}}
The mean of the two values is: 5518 million
Relevant Documents: 0
Cited Documents: 0
Answer: The mean of the two values is 5518 million.
Grounded answer: The mean of the two values is <co: 0>5518 million</co: 0>.
> Finished chain.
Output
{'input': 'what is the mean of the two values',
 'preamble': "\n## Task & Context\nYou help people answer their questions and other requests interactively. You will be asked a very wide array of requests on all kinds of topics. You will be equipped with a wide range of search engines or similar tools to help you, which you use to research your answer. You should focus on serving the user's needs as best you can, which will be wide-ranging.\n\n## Style Guide\nUnless the user asks for a different style of answer, you should answer in full sentences, using proper grammar and spelling.\n\n## Guidelines\nYou are an expert who answers the user's question. \nYou have access to a vectorsearch tool that will use your query to search through documents and find the relevant answer.\nYou also have access to a python interpreter tool which you can use to run code for mathematical operations.\n",
 'chat_history': [HumanMessage(content='what are the charges for services in 2022 and 2023'),
  AIMessage(content='The charges for services in 2022 were $5,266 million and in 2023 were $5,769 million.')],
 'output': 'The mean of the two values is 5518 million.',
 'citations': [CohereCitation(start=30, end=42, text='5518 million', documents=[{'output': 'The mean of the two values is: 5518 million\n'}])],
 'intermediate_steps': [(AgentActionMessageLog(tool='python_interpreter', tool_input={'code': 'import numpy as np\n\n# Data\nvalues = [5266, 5769]\n\n# Calculate the mean\nmean_value = np.mean(values)\n\nprint(f"The mean of the two values is: {mean_value:.0f} million")'}, log='\nI will use the Python Interpreter tool to calculate the mean of the two values.\n{\'tool_name\': \'python_interpreter\', \'parameters\': {\'code\': \'import numpy as np\\n\\n# Data\\nvalues = [5266, 5769]\\n\\n# Calculate the mean\\nmean_value = np.mean(values)\\n\\nprint(f"The mean of the two values is: {mean_value:.0f} million")\'}}\n', message_log=[AIMessage(content='\nPlan: I will use the Python Interpreter tool to calculate the mean of the two values.\nAction: ```json\n[\n {\n "tool_name": "python_interpreter",\n "parameters": {\n "code": "import numpy as np\\n\\n# Data\\nvalues = [5266, 5769]\\n\\n# Calculate the mean\\nmean_value = np.mean(values)\\n\\nprint(f\\"The mean of the two values is: {mean_value:.0f} million\\")"\n }\n }\n]\n```')]),
   'The mean of the two values is: 5518 million\n')]}
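
The arithmetic in the two follow-up answers can be checked directly. The values are the charges for services in millions of dollars, taken from the table retrieved above; the variable names are illustrative only.

```python
# Charges for services in millions of dollars (FY 2022, FY 2023)
values = [5266, 5769]

half_of_2022 = values[0] / 2
exact_mean = sum(values) / len(values)

print(half_of_2022)         # 2633.0
print(exact_mean)           # 5517.5
print(f"{exact_mean:.0f}")  # 5518
```

The exact mean is 5,517.5 million; the agent's code formats it with `.0f`, which is why the reported value is 5518 million.
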

Conclusion

As you can see, the RAG pipeline can be used as a tool for a Cohere ReAct agent. This allows the agent to access the RAG pipeline for document retrieval and generation, as well as a Python interpreter for postprocessing mathematical operations to improve accuracy. This setup can be used to improve the accuracy of grounded answers to questions about documents that contain both tables and text.
