Cohere Chat on LangChain (Integration Guide)

Cohere supports various integrations with LangChain, a large language model (LLM) framework that lets you quickly build applications on Cohere’s models. This guide shows how to use Cohere Chat with LangChain.

Prerequisites

Running Cohere Chat with LangChain doesn’t require many prerequisites; consult the top-level document for more information.

Cohere Chat with LangChain

To use Cohere Chat with LangChain, create a ChatCohere object and pass in either a single message or a message history. In the example below, replace the COHERE_API_KEY placeholder with your Cohere API key.

PYTHON
from langchain_cohere import ChatCohere
from langchain_core.messages import AIMessage, HumanMessage

# Define the Cohere LLM
llm = ChatCohere(
    cohere_api_key="COHERE_API_KEY", model="command-r-plus-08-2024"
)

# Send a chat message without chat history
current_message = [HumanMessage(content="knock knock")]
print(llm.invoke(current_message))

# Send a chat message with chat history; note that the last message is the current user message
current_message_and_history = [
    HumanMessage(content="knock knock"),
    AIMessage(content="Who's there?"),
    HumanMessage(content="Tank"),
]
print(llm.invoke(current_message_and_history))
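
The history convention described above (the list ends with the current user message) can be checked before the model is called. The following is a minimal sketch that uses plain (role, content) tuples rather than LangChain message objects. `validate_history` and the "human"/"ai" role labels are hypothetical names chosen for illustration and are not part of langchain_cohere.

```python
from typing import List, Tuple

# (role, content); the "human" and "ai" labels are illustrative, not a library contract
Turn = Tuple[str, str]


def validate_history(turns: List[Turn]) -> List[Turn]:
    """Return the turns unchanged if they form a usable chat history, else raise ValueError."""
    if not turns:
        raise ValueError("history must contain at least one message")
    for role, content in turns:
        if role not in ("human", "ai"):
            raise ValueError(f"unknown role: {role!r}")
        if not content.strip():
            raise ValueError("message content must not be empty")
    # The last entry is the message the model should answer.
    if turns[-1][0] != "human":
        raise ValueError("the last message must be the current user message")
    return turns


history = validate_history(
    [("human", "knock knock"), ("ai", "Who's there?"), ("human", "Tank")]
)
print(len(history), "messages; last role:", history[-1][0])
```

Once validated, each tuple can be converted to a HumanMessage or AIMessage and the resulting list passed to llm.invoke.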

Cohere Agents with LangChain

LangChain Agents use a language model to choose a sequence of actions to take.

To use Cohere’s multi-hop agent, call create_cohere_react_agent and pass in the LangChain tools you would like to use.

For example, the following code uses an internet search tool to get essay-writing advice from Cohere, with citations:

PYTHON
import os

from langchain_cohere import ChatCohere
from langchain_cohere.react_multi_hop.agent import (
    create_cohere_react_agent,
)
from langchain.agents import AgentExecutor
from langchain_community.tools.tavily_search import (
    TavilySearchResults,
)
from langchain_core.prompts import ChatPromptTemplate

# Internet search tool - you can use any tool, and there are lots of community tools in LangChain.
# To use the Tavily tool you will need to set an API key in the TAVILY_API_KEY environment variable.
os.environ["TAVILY_API_KEY"] = "TAVILY_API_KEY"
internet_search = TavilySearchResults()

# Define the Cohere LLM
llm = ChatCohere(
    cohere_api_key="COHERE_API_KEY", model="command-r-plus-08-2024"
)

# Create an agent
agent = create_cohere_react_agent(
    llm=llm,
    tools=[internet_search],
    prompt=ChatPromptTemplate.from_template("{question}"),
)

# Create an agent executor
agent_executor = AgentExecutor(
    agent=agent, tools=[internet_search], verbose=True
)

# Generate a response
response = agent_executor.invoke(
    {
        "question": "I want to write an essay. Any tips?",
    }
)

# See Cohere's response
print(response.get("output"))
# Cohere provides exact citations for the sources it used
print(response.get("citations"))
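
The example above reads "output" and "citations" from the response with .get, which returns None when a key is absent. A small helper can normalize both values before they are used downstream. This is a sketch only: `unpack_agent_response` is a hypothetical name, and the sample dictionary is made-up data standing in for a real agent_executor.invoke result.

```python
from typing import Any, Dict, List, Tuple


def unpack_agent_response(response: Dict[str, Any]) -> Tuple[str, List[Any]]:
    """Return (output_text, citations), tolerating a missing or None entry for either key."""
    output = response.get("output") or ""
    citations = response.get("citations") or []
    return str(output), list(citations)


# Made-up stand-in for a real agent response; note that it has no "citations" key.
sample = {
    "question": "I want to write an essay. Any tips?",
    "output": "Start with an outline.",
}
text, cites = unpack_agent_response(sample)
print(text)
print(f"{len(cites)} citation(s)")
```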

Cohere Chat and RAG with LangChain

To use Cohere’s retrieval augmented generation (RAG) functionality with LangChain, create a CohereRagRetriever object. The next few sections cover several ways to use it.

Using LangChain’s Retrievers

In this example, we use the Wikipedia retriever, but any retriever supported by LangChain can be used here. To set up the Wikipedia retriever, install the wikipedia Python package using %pip install --upgrade --quiet wikipedia. With that done, you can execute this code to see how a retriever works:

PYTHON
from langchain_cohere import CohereRagRetriever
from langchain.retrievers import WikipediaRetriever
from langchain_cohere import ChatCohere

# User query we will use for the generation
user_query = "What is cohere?"
# Define the Cohere LLM
llm = ChatCohere(
    cohere_api_key="COHERE_API_KEY", model="command-r-plus-08-2024"
)
# Create the Cohere RAG retriever using the chat model
rag = CohereRagRetriever(llm=llm, connectors=[])
# Create the Wikipedia retriever
wiki_retriever = WikipediaRetriever()
# Get the relevant documents from Wikipedia
wiki_docs = wiki_retriever.invoke(user_query)
# Get the Cohere generation from the Cohere RAG retriever
docs = rag.invoke(user_query, documents=wiki_docs)
# Print the documents
print("Documents:")
for doc in docs[:-1]:
    print(doc.metadata)
    print("\n\n" + doc.page_content)
    print("\n\n" + "-" * 30 + "\n\n")
# Print the final generation
answer = docs[-1].page_content
print("Answer:")
print(answer)
# Print the final citations
citations = docs[-1].metadata["citations"]
print("Citations:")
print(citations)

Using Documents

In this example, we take documents (which might be generated in other parts of your application) and pass them into the CohereRagRetriever object:

PYTHON
from langchain_cohere import CohereRagRetriever
from langchain_cohere import ChatCohere
from langchain_core.documents import Document

# Define the Cohere LLM
llm = ChatCohere(
    cohere_api_key="COHERE_API_KEY", model="command-r-plus-08-2024"
)

# Create the Cohere RAG retriever using the chat model
rag = CohereRagRetriever(llm=llm, connectors=[])
docs = rag.invoke(
    "Does LangChain support cohere RAG?",
    documents=[
        Document(
            page_content="LangChain supports cohere RAG!",
            metadata={"id": "id-1"},
        ),
        Document(
            page_content="The sky is blue!", metadata={"id": "id-2"}
        ),
    ],
)

# Print the documents
print("Documents:")
for doc in docs[:-1]:
    print(doc.metadata)
    print("\n\n" + doc.page_content)
    print("\n\n" + "-" * 30 + "\n\n")
# Print the final generation
answer = docs[-1].page_content
print("Answer:")
print(answer)
# Print the final citations
citations = docs[-1].metadata["citations"]
print("Citations:")
print(citations)
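
The RAG examples in this guide all rely on the same result layout: every element except the last is a source document, and the last element carries the generated answer in page_content and the citations in its metadata. A small helper can make that layout explicit. This is a sketch under that assumption. `Doc` is a stand-in dataclass with the same two attributes as a LangChain Document so the snippet runs without any dependencies, and `split_rag_result` is a hypothetical name.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class Doc:
    """Stand-in for langchain_core.documents.Document (same two attributes)."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def split_rag_result(docs: List[Doc]) -> Tuple[List[Doc], str, List[Any]]:
    """Split retriever output into (source documents, answer text, citations)."""
    if not docs:
        raise ValueError("expected at least the final generation document")
    *sources, final = docs
    # Fall back to an empty list if the final document has no citations entry.
    return sources, final.page_content, final.metadata.get("citations", [])


# Made-up data shaped like the retriever output described above.
result = [
    Doc("LangChain supports cohere RAG!", {"id": "id-1"}),
    Doc("The sky is blue!", {"id": "id-2"}),
    Doc("Yes, it does.", {"citations": ["example-citation"]}),
]
sources, answer, citations = split_rag_result(result)
print(len(sources), "source document(s)")
print("Answer:", answer)
print("Citations:", citations)
```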

Using a Connector

In this example, we use a connector to produce a generation with citations to the results that the connector returns. We use the “web-search” connector, which is available to everyone. If you have created your own connector in your org, you can pass in its id instead, like so: rag = CohereRagRetriever(llm=llm, connectors=[{"id": "example-connector-id"}])

Here’s a code sample illustrating how to use a connector:

PYTHON
from langchain_cohere import CohereRagRetriever
from langchain_cohere import ChatCohere

# Define the Cohere LLM
llm = ChatCohere(
    cohere_api_key="COHERE_API_KEY", model="command-r-plus-08-2024"
)

# Create the Cohere RAG retriever using the chat model with the web search connector
rag = CohereRagRetriever(llm=llm, connectors=[{"id": "web-search"}])
docs = rag.invoke("Who founded Cohere?")
# Print the documents
print("Documents:")
for doc in docs[:-1]:
    print(doc.metadata)
    print("\n\n" + doc.page_content)
    print("\n\n" + "-" * 30 + "\n\n")
# Print the final generation
answer = docs[-1].page_content
print("Answer:")
print(answer)
# Print the final citations
citations = docs[-1].metadata["citations"]
print("Citations:")
print(citations)

Using the create_stuff_documents_chain Chain

This chain takes a list of documents, formats them all into a single prompt, and passes that prompt to an LLM. It passes ALL documents, so make sure the combined prompt fits within the context window of the LLM you are using.

Note: this feature is currently in beta.

PYTHON
from langchain_cohere import ChatCohere
from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate
from langchain.chains.combine_documents import (
    create_stuff_documents_chain,
)

prompt = ChatPromptTemplate.from_messages(
    [("human", "What are everyone's favorite colors:\n\n{context}")]
)

# Define the Cohere LLM
llm = ChatCohere(
    cohere_api_key="COHERE_API_KEY", model="command-r-plus-08-2024"
)

chain = create_stuff_documents_chain(llm, prompt)

docs = [
    Document(page_content="Jesse loves red but not yellow"),
    Document(
        page_content="Jamal loves green but not as much as he loves orange"
    ),
]

chain.invoke({"context": docs})
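
Because the chain passes every document into one prompt, it can help to estimate the prompt size before invoking it. The following is a rough sketch, not a tokenizer: it assumes about four characters per token, which is only a heuristic, and the 1000-token budget is an illustrative number rather than the limit of any particular model. `estimate_tokens` and `fits_context` are hypothetical helper names.

```python
from typing import List

CHARS_PER_TOKEN = 4  # rough heuristic (an assumption); real tokenizers vary


def estimate_tokens(text: str) -> int:
    """Estimate the token count of text, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_context(
    texts: List[str], max_tokens: int, reserved_for_output: int = 0
) -> bool:
    """Return True if the texts are estimated to fit within the token budget."""
    total = sum(estimate_tokens(t) for t in texts)
    return total <= max_tokens - reserved_for_output


page_contents = [
    "Jesse loves red but not yellow",
    "Jamal loves green but not as much as he loves orange",
]
# Illustrative budget only; check the context window of the model you use.
if fits_context(page_contents, max_tokens=1000, reserved_for_output=200):
    print("Documents are estimated to fit")
else:
    print("Documents are estimated to exceed the budget")
```

If the estimate exceeds the budget, pass fewer or shorter documents to the chain.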

Structured Output Generation

Cohere supports generating JSON objects to structure and organize the model’s responses in a way that can be used in downstream applications.

You can specify the response_format parameter to indicate that you want the response in a JSON object format.

PYTHON
from langchain_cohere import ChatCohere

# Define the Cohere LLM
llm = ChatCohere(
    cohere_api_key="COHERE_API_KEY", model="command-r-plus-08-2024"
)

res = llm.invoke(
    "John is five years old",
    response_format={
        "type": "json_object",
        "schema": {
            "title": "Person",
            "description": "Identifies the age and name of a person",
            "type": "object",
            "properties": {
                "name": {
                    "type": "string",
                    "description": "Name of the person",
                },
                "age": {
                    "type": "number",
                    "description": "Age of the person",
                },
            },
            "required": [
                "name",
                "age",
            ],
        },
    },
)

print(res)
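
Downstream code usually needs the JSON as a Python dictionary rather than as text. The following sketch parses and checks a response against the required fields from the schema above. It assumes the JSON text is available as a string (for example, from the content attribute of the returned message). `parse_person` is a hypothetical helper, and the sample string is made up rather than real model output.

```python
import json
from typing import Any, Dict

REQUIRED_FIELDS = ("name", "age")  # mirrors "required" in the schema above


def parse_person(content: str) -> Dict[str, Any]:
    """Parse JSON text and check it against the Person schema's required fields."""
    data = json.loads(content)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in REQUIRED_FIELDS if key not in data]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    if not isinstance(data["name"], str):
        raise ValueError("name must be a string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(data["age"], bool) or not isinstance(data["age"], (int, float)):
        raise ValueError("age must be a number")
    return data


sample_content = '{"name": "John", "age": 5}'  # made-up example, not real output
person = parse_person(sample_content)
print(person["name"], person["age"])
```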

Text Summarization

You can use the load_summarize_chain chain to perform text summarization.

PYTHON
from langchain_cohere import ChatCohere
from langchain.chains.summarize import load_summarize_chain
from langchain_community.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://docs.cohere.com/docs/cohere-toolkit")
docs = loader.load()

# Define the Cohere LLM
llm = ChatCohere(
    cohere_api_key="COHERE_API_KEY",
    model="command-r-plus-08-2024",
    temperature=0,
)

chain = load_summarize_chain(llm, chain_type="stuff")

chain.invoke({"input_documents": docs})

Using LangChain on Private Deployments

You can use LangChain with privately deployed Cohere models. To do so, specify your model deployment URL in the base_url parameter.

PYTHON
from langchain_cohere import ChatCohere

# Point the client at your private deployment
llm = ChatCohere(
    base_url="<YOUR_DEPLOYMENT_URL>",
    cohere_api_key="COHERE_API_KEY",
    model="MODEL_NAME",
)
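
Deployment URLs and API keys are often kept out of source code. As one possible approach, the keyword arguments shown above can be assembled from environment variables. This is a sketch: the variable names COHERE_BASE_URL and COHERE_MODEL, and the helper `chat_cohere_kwargs`, are hypothetical and chosen for illustration. Only the parameter names base_url, cohere_api_key, and model come from the example above.

```python
import os
from typing import Dict, Mapping, Optional


def chat_cohere_kwargs(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build ChatCohere keyword arguments from environment-style settings."""
    env = os.environ if env is None else env
    api_key = env.get("COHERE_API_KEY")
    if not api_key:
        raise RuntimeError("COHERE_API_KEY is not set")
    kwargs = {"cohere_api_key": api_key}
    # Optional settings; the variable names are assumptions for this sketch.
    if env.get("COHERE_BASE_URL"):
        kwargs["base_url"] = env["COHERE_BASE_URL"]
    if env.get("COHERE_MODEL"):
        kwargs["model"] = env["COHERE_MODEL"]
    return kwargs


# Example with an explicit mapping instead of the real environment.
settings = chat_cohere_kwargs(
    {"COHERE_API_KEY": "key", "COHERE_BASE_URL": "https://example.invalid"}
)
print(sorted(settings))
```

With the variables set in the real environment, the client could then be created with ChatCohere(**chat_cohere_kwargs()).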