Retrieval Augmented Generation (RAG)
Retrieval Augmented Generation (RAG) enables an LLM to ground its responses in external documents, improving response accuracy and reducing hallucinations.
The Chat endpoint comes with built-in RAG capabilities such as document grounding and citation generation.
This quickstart guide shows you how to perform RAG with the Chat endpoint.
Setup
First, install the Cohere Python SDK with the following command.
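The install command referenced above, assuming a pip-based environment:

```shell
pip install -U cohere
```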
Next, import the library and create a client.
The client can be created for any supported environment: the Cohere Platform, a private deployment, Bedrock, SageMaker, or Azure AI.
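A minimal sketch of client creation for the Cohere Platform; the API key string is a placeholder, and the other environments (private deployments, Bedrock, SageMaker, Azure AI) use their own client configuration:

```python
import cohere

# Create a client for the Cohere Platform.
# "YOUR_API_KEY" is a placeholder; use your own key in practice.
co = cohere.ClientV2(api_key="YOUR_API_KEY")
```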
Documents
First, define the documents that will be passed as the context for RAG. These documents are typically retrieved via semantic search from sources such as vector databases, or from any system that can retrieve unstructured data given a user query.
Each document can contain any number of fields, e.g. title, url, text, etc.
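As a sketch, the documents can be represented as a list of plain dictionaries. The field names here (title, url, text) follow the examples above; the titles, URLs, and snippet contents are placeholders, since in practice these would come from a retrieval system:

```python
# Placeholder documents; in a real application these would be
# retrieved from a source such as a vector database.
documents = [
    {
        "title": "Tall penguins",
        "url": "https://example.com/tall-penguins",
        "text": "Emperor penguins are the tallest penguin species.",
    },
    {
        "title": "Penguin habitats",
        "url": "https://example.com/penguin-habitats",
        "text": "Emperor penguins only live in Antarctica.",
    },
]
```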