LangChain

Cohere supports various integrations with LangChain, a large language model (LLM) framework that lets you quickly build applications based on Cohere's models. This doc will guide you through how to use different Cohere features with LangChain.

Prerequisites

To use LangChain and Cohere you will need:

  • The LangChain package. To install it, run pip install langchain. If you run into any issues or want more details, see this doc.
  • Cohere's SDK. To install it, run pip install cohere. If you run into any issues or want more details on Cohere's SDK, see this wiki.
  • A Cohere API key. For details on pricing, see this page. When you create a Cohere account, a trial API key is generated for you automatically; you can copy it from the "API Keys" section of the dashboard.
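
Once these are installed, you can run a quick sanity check like the sketch below. It assumes your API key is stored in the COHERE_API_KEY environment variable (an assumption for this example; you can also pass the key directly, as the later snippets do).

import os

from langchain_community.chat_models import ChatCohere

# Assumes COHERE_API_KEY is set in your environment for this check
chat = ChatCohere(cohere_api_key=os.environ["COHERE_API_KEY"])
print(chat.invoke("Say hello!").content)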

Cohere Chat with LangChain

To use Cohere's basic chat functionality with LangChain, simply create a ChatCohere object and pass in the message or message history. In the example below, you will need to add your Cohere API key.

from langchain_community.chat_models import ChatCohere
from langchain_core.messages import AIMessage, HumanMessage

cohere_chat_model = ChatCohere(cohere_api_key="{API_KEY}")

# Send a chat message without chat history
current_message = [HumanMessage(content="knock knock")]
print(cohere_chat_model.invoke(current_message))

# Send a chat message with chat history, note the last message is the current user message
current_message_and_history = [
    HumanMessage(content="knock knock"),
    AIMessage(content="Who's there?"),
    HumanMessage(content="Tank"),
]
print(cohere_chat_model.invoke(current_message_and_history))
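
ChatCohere also supports LangChain's standard streaming interface, so you can print tokens as they are generated rather than waiting for the full response; here is a minimal sketch:

# Stream the response chunk by chunk via LangChain's standard streaming interface
for chunk in cohere_chat_model.stream([HumanMessage(content="Tell me a joke")]):
    print(chunk.content, end="", flush=True)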

Cohere Chat and RAG with LangChain

To use Cohere's retrieval-augmented generation (RAG) functionality with LangChain, create a CohereRagRetriever object. The next few sections cover several ways to use it. In every example, the retriever returns the relevant source documents first, followed by a final document whose page_content is the generated answer and whose metadata holds the citations, which is why the code below treats docs[:-1] and docs[-1] differently.

Using LangChain's Retrievers

In this example, we use the Wikipedia retriever, but any retriever supported by LangChain can be used here. To set up the Wikipedia retriever, you need to install the wikipedia Python package using pip install --upgrade --quiet wikipedia. With that done, you can execute this code to see how a retriever works:

from langchain.retrievers import CohereRagRetriever
from langchain.retrievers import WikipediaRetriever
from langchain_community.chat_models import ChatCohere

# User query we will use for the generation
user_query = "What is cohere?"
# Load the cohere chat model
cohere_chat_model = ChatCohere(cohere_api_key="{API_KEY}")
# Create the cohere rag retriever using the chat model
rag = CohereRagRetriever(llm=cohere_chat_model, connectors=[])
# Create the wikipedia retriever
wiki_retriever = WikipediaRetriever()
# Get the relevant documents from wikipedia
wiki_docs = wiki_retriever.get_relevant_documents(user_query)
# Get the cohere generation from the cohere rag retriever
docs = rag.get_relevant_documents(user_query, source_documents=wiki_docs)
# Print the documents
for doc in docs[:-1]:
    print(doc.metadata)
    print("\n\n" + doc.page_content)
    print("\n\n" + "-" * 30 + "\n\n")
# Print the final generation 
answer = docs[-1].page_content
print(answer)
# Print the final citations 
citations = docs[-1].metadata['citations']
print(citations)
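
To see which parts of the answer each citation supports, you can map the citation spans back onto the generated text. This is a minimal sketch that assumes each citation carries start, end, and document_ids fields, as Cohere's citation payloads typically do; depending on your SDK version the citations may be dicts or objects, so both are handled:

# Map each citation's character span back onto the generated answer
for citation in citations:
    start = citation["start"] if isinstance(citation, dict) else citation.start
    end = citation["end"] if isinstance(citation, dict) else citation.end
    doc_ids = citation["document_ids"] if isinstance(citation, dict) else citation.document_ids
    print(f'"{answer[start:end]}" is supported by documents {doc_ids}')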

Using Documents

In this example, we take documents (which might be generated in other parts of your application) and pass them into the CohereRagRetriever object:

from langchain.retrievers import CohereRagRetriever
from langchain_community.chat_models import ChatCohere
from langchain_core.documents import Document

# Load the cohere chat model
cohere_chat_model = ChatCohere(cohere_api_key="{API_KEY}")
# Create the cohere rag retriever using the chat model
rag = CohereRagRetriever(llm=cohere_chat_model, connectors=[])
docs = rag.get_relevant_documents(
    "Does LangChain support cohere RAG?",
    source_documents=[
        Document(page_content="LangChain supports cohere RAG!", metadata={"id": "id-1"}),
        Document(page_content="The sky is blue!", metadata={"id": "id-2"}),
    ],
)
# Print the documents
for doc in docs[:-1]:
    print(doc.metadata)
    print("\n\n" + doc.page_content)
    print("\n\n" + "-" * 30 + "\n\n")
# Print the final generation 
answer = docs[-1].page_content
print(answer)
# Print the final citations 
citations = docs[-1].metadata['citations']
print(citations)

Using a Connector

In this example, we generate a response using a connector, which grounds the generation and attaches citations pointing at the connector's results. We use the "web-search" connector, which is available to everyone. If you have created your own connector in your org, you can pass in its id instead, like so: rag = CohereRagRetriever(llm=cohere_chat_model, connectors=[{"id": "example-connector-id"}])

Here's a code sample illustrating how to use a connector:

from langchain.retrievers import CohereRagRetriever
from langchain_community.chat_models import ChatCohere

# Load the cohere chat model
cohere_chat_model = ChatCohere(cohere_api_key="{API_KEY}")
# Create the cohere rag retriever using the chat model with the web search connector
rag = CohereRagRetriever(llm=cohere_chat_model, connectors=[{"id": "web-search"}])
docs = rag.get_relevant_documents("Who founded Cohere?")
# Print the documents
for doc in docs[:-1]:
    print(doc.metadata)
    print("\n\n" + doc.page_content)
    print("\n\n" + "-" * 30 + "\n\n")
# Print the final generation 
answer = docs[-1].page_content
print(answer)
# Print the final citations 
citations = docs[-1].metadata['citations']
print(citations)

Cohere Embeddings with LangChain

To use Cohere's embeddings with LangChain, create a CohereEmbeddings object as follows (the available Cohere embedding models are listed here):

from langchain_community.embeddings import CohereEmbeddings
cohere_embeddings = CohereEmbeddings(cohere_api_key="{API_KEY}", model="embed-english-light-v3.0")
text = "This is a test document."
query_result = cohere_embeddings.embed_query(text)
print(query_result)
doc_result = cohere_embeddings.embed_documents([text])
print(doc_result)
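
Since embed_query and embed_documents return plain lists of floats, you can compare the vectors directly, for example with cosine similarity. Continuing from the snippet above, here is a minimal sketch that assumes numpy is installed as an extra dependency:

import numpy as np

# Cosine similarity between the query embedding and the first document embedding
query_vec = np.array(query_result)
doc_vec = np.array(doc_result[0])
print(np.dot(query_vec, doc_vec) / (np.linalg.norm(query_vec) * np.linalg.norm(doc_vec)))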

To use these embeddings with Cohere's RAG functionality, you will need to use one of the vector DBs from this list. In this example we use Chroma; to run the code, you will need to install it using pip install chromadb.

from langchain.retrievers import CohereRagRetriever
from langchain_community.embeddings import CohereEmbeddings
from langchain_community.chat_models import ChatCohere
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import Chroma

user_query = "When was Cohere started?"
# Create cohere's chat model and embeddings objects
cohere_chat_model = ChatCohere(cohere_api_key="{API-KEY}")
cohere_embeddings = CohereEmbeddings(cohere_api_key="{API-KEY}")
# Load text files and split into chunks; you can also use data gathered elsewhere in your application
raw_documents = TextLoader('test.txt').load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
documents = text_splitter.split_documents(raw_documents)
# Create a vector store from the documents
db = Chroma.from_documents(documents, cohere_embeddings)
input_docs = db.as_retriever().get_relevant_documents(user_query)

# Create the cohere rag retriever using the chat model; connectors=[] ensures only the provided documents are used
rag = CohereRagRetriever(llm=cohere_chat_model, connectors=[])
docs = rag.get_relevant_documents(
    user_query,
    source_documents=input_docs,
)
# Print the documents
for doc in docs[:-1]:
    print(doc.metadata)
    print("\n\n" + doc.page_content)
    print("\n\n" + "-" * 30 + "\n\n")
# Print the final generation 
answer = docs[-1].page_content
print(answer)
# Print the final citations 
citations = docs[-1].metadata['citations']
print(citations)

Cohere with LangChain and Bedrock

Prerequisites

In addition to the prerequisites above, integrating Cohere with LangChain on Bedrock also requires:

  • The boto3 package. To install it, run pip install boto3.
  • AWS credentials with access to Bedrock, for example an AWS profile configured through the AWS CLI. The profile name is passed to the Bedrock classes as credentials_profile_name, as shown below.

Cohere Embeddings with LangChain and Bedrock

In this example, we create embeddings for a query using Bedrock and LangChain:

from langchain_community.embeddings import BedrockEmbeddings

# Replace the profile name with the one created in the setup. 
embeddings = BedrockEmbeddings(
    credentials_profile_name="{PROFILE-NAME}",
    region_name="us-east-1",
    model_id="cohere.embed-english-v3"
)
print(embeddings.embed_query("This is the content of the document"))
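
BedrockEmbeddings implements the same LangChain embeddings interface as CohereEmbeddings above, so embedding multiple documents works the same way:

# Embed several documents at once; returns one vector per document
doc_vectors = embeddings.embed_documents([
    "First document.",
    "Second document.",
])
print(len(doc_vectors), len(doc_vectors[0]))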

Cohere Rerank with LangChain

To use Cohere's rerank functionality with LangChain, start by instantiating a CohereRerank object as follows: cohere_rerank = CohereRerank(cohere_api_key="{API_KEY}").

You can then use it with LangChain retrievers, embeddings, and RAG. The example below uses the Chroma vector DB; to run it, install Chroma with pip install chromadb. Other vector DBs from this list can also be used.

from langchain.retrievers import ContextualCompressionRetriever, CohereRagRetriever
from langchain.retrievers.document_compressors import CohereRerank
from langchain_community.embeddings import CohereEmbeddings
from langchain_community.chat_models import ChatCohere
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import Chroma

user_query =  "When was Cohere started?"
# Create cohere's chat model and embeddings objects
cohere_chat_model = ChatCohere(cohere_api_key="{API_KEY}")
cohere_embeddings = CohereEmbeddings(cohere_api_key="{API_KEY}")
# Load text files and split into chunks; you can also use data gathered elsewhere in your application
raw_documents = TextLoader('demofile.txt').load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
documents = text_splitter.split_documents(raw_documents)
# Create a vector store from the documents
db = Chroma.from_documents(documents, cohere_embeddings)

# Create Cohere's reranker and wrap the vector store (built on Cohere's embeddings) as the base retriever
cohere_rerank = CohereRerank(cohere_api_key="{API_KEY}")
compression_retriever = ContextualCompressionRetriever(
    base_compressor=cohere_rerank, 
    base_retriever=db.as_retriever()
)
compressed_docs = compression_retriever.get_relevant_documents(user_query)
# Print the relevant documents from using the embeddings and reranker
print(compressed_docs)

# Create the cohere rag retriever using the chat model; connectors=[] ensures only the provided documents are used
rag = CohereRagRetriever(llm=cohere_chat_model, connectors=[])
docs = rag.get_relevant_documents(
    user_query,
    source_documents=compressed_docs,
)
# Print the documents
for doc in docs[:-1]:
    print(doc.metadata)
    print("\n\n" + doc.page_content)
    print("\n\n" + "-" * 30 + "\n\n")
# Print the final generation 
answer = docs[-1].page_content
print(answer)
# Print the final citations 
citations = docs[-1].metadata['citations']
print(citations)