Cohere Rerank on LangChain (Integration Guide)

Cohere supports various integrations with LangChain, a large language model (LLM) framework that lets you quickly build applications on top of Cohere’s models. This doc walks through how to use Rerank with LangChain.

Prerequisites

Running Cohere Rerank with LangChain doesn’t require many prerequisites; consult the top-level document for more information.

Cohere Rerank with LangChain

To use Cohere’s rerank functionality with LangChain, start by instantiating a CohereRerank object as follows: cohere_rerank = CohereRerank(cohere_api_key="{API_KEY}").
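For intuition about what a reranker contributes, here is a minimal, self-contained sketch of the rerank step: take a query and a list of candidate documents, score each candidate against the query, and keep the best few. The word-overlap scorer, the Doc class, and the toy_rerank function are hypothetical stand-ins invented for this illustration. They are not Cohere's model or the CohereRerank API, which scores relevance with a hosted model.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical minimal document type used only in this sketch."""
    text: str


def toy_relevance(query: str, doc: Doc) -> float:
    """Toy stand-in for a rerank model: fraction of query words found in the doc."""
    query_words = set(query.lower().split())
    doc_words = set(doc.text.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & doc_words) / len(query_words)


def toy_rerank(query: str, candidates: list[Doc], top_n: int = 3) -> list[tuple[Doc, float]]:
    """Score every candidate, sort by descending score, and keep the top_n."""
    scored = [(doc, toy_relevance(query, doc)) for doc in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Candidates as they might come back from a first-stage retriever, in arbitrary order
candidates = [
    Doc("LangChain is a framework for LLM applications"),
    Doc("Cohere Toolkit is a collection of prebuilt components"),
    Doc("Chroma is a vector database"),
]
top = toy_rerank("what is cohere toolkit", candidates, top_n=2)
for doc, score in top:
    print(f"{score:.2f}  {doc.text}")
```

In a real pipeline the first-stage retriever (for example, a vector store) supplies the candidates and the reranker only reorders and trims them, which is why the two stages are usually chained.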

You can then use this reranker with LangChain retrievers, embeddings, and RAG. The example below uses the vector DB Chroma, which you can install with pip install chromadb. Other vector DBs from this list can also be used.

PYTHON
from langchain.retrievers import ContextualCompressionRetriever
from langchain_cohere import CohereEmbeddings
from langchain_cohere import ChatCohere
from langchain_cohere import CohereRerank, CohereRagRetriever
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_community.document_loaders import WebBaseLoader

user_query = "what is Cohere Toolkit?"

# Define the Cohere LLM
llm = ChatCohere(cohere_api_key="COHERE_API_KEY",
                 model="command-r-plus-08-2024")

# Define the Cohere embedding model
embeddings = CohereEmbeddings(cohere_api_key="COHERE_API_KEY",
                              model="embed-english-light-v3.0")

# Load a web page and split it into chunks; you can also use data gathered elsewhere in your application
raw_documents = WebBaseLoader("https://docs.cohere.com/docs/cohere-toolkit").load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
documents = text_splitter.split_documents(raw_documents)

# Create a vector store from the documents
db = Chroma.from_documents(documents, embeddings)

# Create Cohere's reranker; the vector DB (built with Cohere's embeddings) serves as the base retriever
reranker = CohereRerank(cohere_api_key="COHERE_API_KEY",
                        model="rerank-english-v3.0")

compression_retriever = ContextualCompressionRetriever(
    base_compressor=reranker,
    base_retriever=db.as_retriever()
)
compressed_docs = compression_retriever.get_relevant_documents(user_query)
# Print the relevant documents found by the embeddings and reordered by the reranker
print(compressed_docs)

# Create the Cohere RAG retriever using the chat model
rag = CohereRagRetriever(llm=llm, connectors=[])
docs = rag.get_relevant_documents(
    user_query,
    documents=compressed_docs,
)
# Print the source documents (every element except the last)
print("Documents:")
for doc in docs[:-1]:
    print(doc.metadata)
    print("\n\n" + doc.page_content)
    print("\n\n" + "-" * 30 + "\n\n")
# Print the final generation (the last element)
answer = docs[-1].page_content
print("Answer:")
print(answer)
# Print the final citations
citations = docs[-1].metadata['citations']
print("Citations:")
print(citations)
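The example above relies on a convention in the returned list: every element except the last is a source document, and the last element carries the generated answer in page_content and the citations in metadata['citations']. A small helper can make that unpacking explicit. This is a sketch under that assumption only. FakeDocument and split_rag_output are hypothetical names introduced here so the snippet runs without LangChain or an API key; they are not part of langchain_cohere.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Hypothetical stand-in exposing the same two attributes the example reads."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def split_rag_output(docs):
    """Split a result list into (source_docs, answer, citations).

    Assumes the layout used in the example above: the last element holds the
    answer and its citations, and everything before it is a source document.
    """
    if not docs:
        raise ValueError("expected at least one element containing the answer")
    *sources, final = docs
    citations = final.metadata.get("citations", [])
    return sources, final.page_content, citations


# Usage with stand-in data shaped like the example's output
fake_docs = [
    FakeDocument("Toolkit overview text", {"source": "page-1"}),
    FakeDocument("Deployment notes text", {"source": "page-2"}),
    FakeDocument("Cohere Toolkit is ...", {"citations": ["citation-a"]}),
]
sources, answer, citations = split_rag_output(fake_docs)
print(len(sources), answer, citations)
```

Using .get with an empty-list default means a result without a citations entry is handled gracefully instead of raising a KeyError.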

Using LangChain on Private Deployments

You can use LangChain with privately deployed Cohere models. To use it, specify your model deployment URL in the base_url parameter.

PYTHON
from langchain_cohere import CohereRerank

reranker = CohereRerank(base_url="<YOUR_DEPLOYMENT_URL>",
                        cohere_api_key="COHERE_API_KEY",
                        model="MODEL_NAME")
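If the same code needs to run against both the hosted API and a private deployment, one option is to assemble the constructor arguments from the environment and add base_url only when a deployment URL is configured. The sketch below is illustrative: the helper name rerank_kwargs and the COHERE_BASE_URL variable name are assumptions made for this example, not settings read by LangChain or Cohere. Only the cohere_api_key, model, and base_url parameter names come from the examples in this guide.

```python
import os


def rerank_kwargs(env=None, model="rerank-english-v3.0"):
    """Build keyword arguments for CohereRerank from environment-style settings.

    env defaults to os.environ; pass a plain dict to test without touching
    the real environment. COHERE_BASE_URL is a hypothetical variable name.
    """
    env = os.environ if env is None else env
    kwargs = {"cohere_api_key": env["COHERE_API_KEY"], "model": model}
    base_url = env.get("COHERE_BASE_URL")
    if base_url:
        # Only set base_url for private deployments; omit it for the hosted API
        kwargs["base_url"] = base_url
    return kwargs


# Intended usage (requires langchain_cohere and real credentials):
#   from langchain_cohere import CohereRerank
#   reranker = CohereRerank(**rerank_kwargs(model="MODEL_NAME"))
```

Keeping the key and URL out of source code also avoids committing credentials alongside the application.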