LangChain

Cohere has first-class support for LangChain, a framework that enables you to quickly create LLM-powered applications. This doc will guide you through how to leverage different Cohere features with LangChain.

Prerequisites

To use LangChain and Cohere with Python you will need:

  • The LangChain package - see the LangChain installation guide for more details.
  • Cohere's LangChain partner package, langchain-cohere (installed with pip install langchain-cohere) - see the wiki page for more details.
  • A Cohere API key. When you create an account with Cohere, we automatically create a trial API key for you. You can pass it to each object directly or set it once in the COHERE_API_KEY environment variable, as in the snippet below.
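
If you prefer the environment variable route, here is a minimal snippet for setting the key interactively at the top of a script:

import getpass
import os

# Prompt for the Cohere API key if it is not already set in the environment.
if "COHERE_API_KEY" not in os.environ:
    os.environ["COHERE_API_KEY"] = getpass.getpass("Enter your Cohere API key: ")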

Cohere Chat with LangChain

To use Chat with LangChain, create a ChatCohere object and pass in a message or the message history.

from langchain_cohere import ChatCohere
from langchain_core.messages import HumanMessage

llm = ChatCohere(cohere_api_key="{API KEY}")

message = [HumanMessage(content="Hello, can you introduce yourself?")]

print(llm.invoke(message).content)
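
ChatCohere also accepts a full message history, so the model can use earlier turns as context. A minimal sketch of a multi-turn conversation (the messages themselves are made up for illustration):

from langchain_cohere import ChatCohere
from langchain_core.messages import AIMessage, HumanMessage

llm = ChatCohere(cohere_api_key="{API KEY}")

# Pass the whole conversation so far; the model answers the last human turn.
history = [
    HumanMessage(content="Hello, my name is Alice."),
    AIMessage(content="Hi Alice! How can I help you today?"),
    HumanMessage(content="Can you remind me what my name is?"),
]

print(llm.invoke(history).content)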

Cohere Agents with LangChain

LangChain Agents use a language model to choose a sequence of actions to take.

To use Cohere's multi-hop agent, create an agent with create_cohere_react_agent and pass in the LangChain tools you would like to use.

For example, using an internet search tool to get essay writing advice from Cohere with citations:

from langchain.agents import AgentExecutor
from langchain_cohere.chat_models import ChatCohere
from langchain_cohere.react_multi_hop.agent import create_cohere_react_agent
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_core.prompts import ChatPromptTemplate

# Internet search tool - you can use any tool, and there are lots of community tools in LangChain.
# To use the Tavily tool you will need to set an API key in the TAVILY_API_KEY environment variable.
internet_search = TavilySearchResults()

# Create and run the Cohere agent
# Set a Cohere API key in the COHERE_API_KEY environment variable.
llm = ChatCohere()
agent = create_cohere_react_agent(
    llm=llm,
    tools=[internet_search],
    prompt=ChatPromptTemplate.from_template("{question}"),
)
agent_executor = AgentExecutor(agent=agent, tools=[internet_search], verbose=True)

response = agent_executor.invoke({
    "question": "I want to write an essay. Any tips?",
})
# See Cohere's response
print(response.get("output"))
# Cohere provides exact citations for the sources it used
print(response.get("citations"))

Cohere RAG with LangChain

To use Cohere's retrieval-augmented generation (RAG) capabilities with LangChain, use CohereRagRetriever. There are a few different ways to use RAG, discussed in the following sections.

Using Cohere Connectors

In this example we use Cohere Connectors to connect a data store and generate citations.

We use the "web-search" connector, which is available to everyone, but you can create a private connector to connect to your organisation's data stores.

from pprint import pprint
from langchain_cohere import ChatCohere, CohereRagRetriever

# User query we will use for the generation
user_query = "Who are Cohere?"


# Use Cohere's RAG retriever with Cohere Connectors to generate an answer.
# Cohere provides exact citations for the sources it used.
llm = ChatCohere()
rag = CohereRagRetriever(llm=llm, connectors=[{"id": "web-search"}])
docs = rag.get_relevant_documents(user_query)
answer = docs.pop()

pprint("Relevant documents:")
pprint(docs)

pprint(f"Question: {user_query}")
pprint("Answer:")
pprint(answer.page_content)
pprint(answer.metadata["citations"])

Using Documents

In this example, we take documents (which might be generated in other parts of your application) and pass them into Cohere to generate citations.

from pprint import pprint

from langchain_cohere import ChatCohere, CohereRagRetriever
from langchain_core.documents import Document


# User query we will use for the generation
user_query = "Does LangChain support Cohere RAG?"

# Use Cohere's RAG retriever in document mode to generate an answer.
# Cohere provides exact citations for the sources it used.
llm = ChatCohere()
rag = CohereRagRetriever(llm=llm, connectors=[])
docs = rag.get_relevant_documents(
    user_query,
    documents=[
        Document(page_content="LangChain supports Cohere RAG!"),
        Document(page_content="The sky is blue!"),
    ],
)
answer = docs.pop()

pprint("Relevant documents:")
pprint(docs)

pprint(f"Question: {user_query}")
pprint("Answer:")
pprint(answer.page_content)
pprint(answer.metadata["citations"])

Cohere Embeddings with LangChain

To use Cohere's embeddings with LangChain, create a CohereEmbeddings object as follows (the available Cohere embedding models are listed here):

from langchain_cohere import CohereEmbeddings

cohere_embeddings = CohereEmbeddings(model="embed-english-light-v3.0")
text = "This is a test document."

query_result = cohere_embeddings.embed_query(text)
print(query_result)

doc_result = cohere_embeddings.embed_documents([text])
print(doc_result)
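
The vectors returned by embed_query and embed_documents can be compared directly, for example with cosine similarity. A minimal sketch using numpy (an extra dependency assumed here purely for the arithmetic):

import numpy as np

from langchain_cohere import CohereEmbeddings

cohere_embeddings = CohereEmbeddings(model="embed-english-light-v3.0")

docs = ["Cohere builds large language models.", "The sky is blue."]
doc_vectors = np.array(cohere_embeddings.embed_documents(docs))
query_vector = np.array(cohere_embeddings.embed_query("Who makes LLMs?"))

# Cosine similarity between the query and each document.
scores = doc_vectors @ query_vector / (
    np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(query_vector)
)
for doc, score in sorted(zip(docs, scores), key=lambda pair: -pair[1]):
    print(f"{score:.3f}  {doc}")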

To use these embeddings with Cohere's RAG functionality, you will need to use one of the vector DBs from this list. In this example we use Chroma; to run it, install Chroma with pip install chromadb.

from pprint import pprint

from langchain_cohere import ChatCohere, CohereEmbeddings, CohereRagRetriever
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import Chroma
from langchain.text_splitter import CharacterTextSplitter

user_query = "When was Cohere started?"

# Create Cohere's chat model and embeddings objects
llm = ChatCohere()
cohere_embeddings = CohereEmbeddings()

# Load text files and split into chunks; you can also use data gathered elsewhere in your application
raw_documents = TextLoader('test.txt').load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
documents = text_splitter.split_documents(raw_documents)
# Create a vector store from the documents
db = Chroma.from_documents(documents, cohere_embeddings)
input_docs = db.as_retriever().get_relevant_documents(user_query)

rag = CohereRagRetriever(llm=llm)
docs = rag.get_relevant_documents(
    user_query,
    documents=input_docs,
)

answer = docs.pop()

pprint("Relevant documents:")
pprint(docs)

pprint("Answer:")
pprint(answer.page_content)
pprint(answer.metadata["citations"])

Cohere ReRank with LangChain

To use Cohere's rerank functionality with LangChain, start by instantiating a CohereRerank object as follows: cohere_rerank = CohereRerank(cohere_api_key="{API_KEY}").

You can then use it with LangChain retrievers, embeddings, and RAG. The example below uses the vector DB Chroma; to run it, install Chroma with pip install chromadb. Other vector DBs from this list can also be used.

from pprint import pprint

from langchain_cohere import ChatCohere, CohereEmbeddings, CohereRagRetriever, CohereRerank
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import Chroma
from langchain.retrievers import ContextualCompressionRetriever
from langchain.text_splitter import CharacterTextSplitter

user_query = "When was Cohere started?"

# Create Cohere's chat model, embeddings and rerank objects.
llm = ChatCohere()
cohere_embeddings = CohereEmbeddings()
cohere_rerank = CohereRerank()

# Load text files and split into chunks; you can also use data gathered elsewhere in your application.
raw_documents = TextLoader('demofile.txt').load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
documents = text_splitter.split_documents(raw_documents)
# Create a vector store from the documents
db = Chroma.from_documents(documents, cohere_embeddings)

# Create Cohere's reranker with the vector DB using Cohere's embeddings as the base retriever.
compression_retriever = ContextualCompressionRetriever(
    base_compressor=cohere_rerank,
    base_retriever=db.as_retriever()
)
compressed_docs = compression_retriever.get_relevant_documents(user_query)
# Print the relevant documents from using the embeddings and reranker
print(compressed_docs)

# Create the cohere rag retriever using the chat model
rag = CohereRagRetriever(llm=llm)
docs = rag.get_relevant_documents(
    user_query,
    documents=compressed_docs,
)
answer = docs.pop()

pprint("Relevant documents:")
pprint(docs)

pprint("Answer:")
pprint(answer.page_content)
pprint(answer.metadata["citations"])
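
You can also call the reranker directly on a list of texts, without wrapping it in a retriever. A minimal sketch, assuming the rerank helper method on CohereRerank (check the method's availability and signature in your installed version):

from langchain_cohere import CohereRerank

cohere_rerank = CohereRerank()

docs = [
    "Cohere was founded in 2019.",
    "The weather is sunny today.",
    "Cohere builds natural language processing models.",
]
# Returns the documents ordered by relevance, each with an index and a score.
results = cohere_rerank.rerank(documents=docs, query="When was Cohere started?")
print(results)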

Cohere with LangChain and Bedrock

Prerequisite

In addition to the prerequisites above, integrating Cohere with LangChain on Bedrock also requires:

  • The LangChain community package, which provides the Bedrock integrations.
  • The AWS Python SDK, boto3, which the Bedrock integrations use to call the service.
  • AWS authentication credentials configured with access to Cohere models on Amazon Bedrock - the example below references a named credentials profile.

Cohere Embeddings with LangChain and Bedrock

In this example, we create embeddings for a query using Bedrock and LangChain:

from langchain_community.embeddings import BedrockEmbeddings

# Replace the profile name with the one created in the setup. 
embeddings = BedrockEmbeddings(
    credentials_profile_name="{PROFILE-NAME}",
    region_name="us-east-1",
    model_id="cohere.embed-english-v3"
)
print(embeddings.embed_query("This is the content of the document"))
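
The same object can also embed several documents in one call; a minimal sketch (the profile name is a placeholder, as above):

from langchain_community.embeddings import BedrockEmbeddings

embeddings = BedrockEmbeddings(
    credentials_profile_name="{PROFILE-NAME}",
    region_name="us-east-1",
    model_id="cohere.embed-english-v3"
)

# embed_documents returns one embedding vector per input document.
doc_vectors = embeddings.embed_documents([
    "This is the first document.",
    "This is the second document.",
])
print(len(doc_vectors), len(doc_vectors[0]))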