LlamaIndex and Cohere’s Models

Prerequisites

To use LlamaIndex and Cohere, you will need:

  • LlamaIndex Package. To install it, run:
    • pip install llama-index
    • pip install llama-index-llms-cohere (to use the Command models)
    • pip install llama-index-embeddings-cohere (to use the Embed models)
    • pip install llama-index-postprocessor-cohere-rerank (to use the Rerank models)
  • Cohere’s SDK. To install it, run pip install cohere. If you run into any issues or want more details on Cohere’s SDK, see this wiki.
  • A Cohere API key. For pricing details, see this page. When you create a Cohere account, a trial API key is created for you automatically. You can copy it from the “API Keys” section of the dashboard.
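
The examples on this page pass the literal string "COHERE_API_KEY" as a placeholder. One common way to avoid hardcoding the real key is to read it from an environment variable. The following is a minimal sketch, not part of either SDK: the helper name get_cohere_api_key and the environment variable name COHERE_API_KEY are assumptions made for illustration.

```python
import os


def get_cohere_api_key(env_var: str = "COHERE_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Both this helper and the default variable name are hypothetical;
    use whatever name your own setup defines.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Cohere API key."
        )
    return key
```

The returned value can then be passed wherever the examples use api_key="COHERE_API_KEY".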

Cohere Chat with LlamaIndex

To use Cohere’s chat functionality with LlamaIndex, create a Cohere model object and call its chat method.

PYTHON
from llama_index.llms.cohere import Cohere
from llama_index.core.llms import ChatMessage

cohere_model = Cohere(
    api_key="COHERE_API_KEY", model="command-r-plus"
)

message = ChatMessage(role="user", content="What is 2 + 3?")

response = cohere_model.chat([message])
print(response)

Cohere Embeddings with LlamaIndex

To use Cohere’s embeddings with LlamaIndex, create a CohereEmbedding object with an embedding model from this list and call get_text_embedding.

PYTHON
from llama_index.embeddings.cohere import CohereEmbedding

embed_model = CohereEmbedding(
    api_key="COHERE_API_KEY",
    model_name="embed-english-v3.0",
    input_type="search_document",  # Use search_query for queries, search_document for documents
)

# Generate embeddings
embeddings = embed_model.get_text_embedding("Welcome to Cohere!")

# Print the embedding length and its first five values
print(len(embeddings))
print(embeddings[:5])
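
The example above prints the embedding as a list of floats. Such vectors are typically compared with cosine similarity. The sketch below shows that computation in plain Python. The helper cosine_similarity and the three-element vectors are illustrative assumptions, not real Cohere output and not part of either library.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length.")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("Cosine similarity is undefined for zero vectors.")
    return dot / (norm_a * norm_b)


# Toy placeholder vectors; real embeddings are much longer.
doc_vec = [1.0, 0.0, 1.0]
query_vec = [1.0, 0.0, 0.0]
print(cosine_similarity(doc_vec, query_vec))
```

In practice, a vector index such as LlamaIndex's VectorStoreIndex handles this comparison for you.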

Cohere Rerank with LlamaIndex

To use Cohere’s rerank functionality with LlamaIndex, create a CohereRerank object and pass it to the query engine as a node postprocessor.

PYTHON
from llama_index.core import VectorStoreIndex
from llama_index.postprocessor.cohere_rerank import CohereRerank
from llama_index.readers.web import (
    SimpleWebPageReader,
)  # first, run `pip install llama-index-readers-web`

# create index (we are using an example page from Cohere's docs)
documents = SimpleWebPageReader(html_to_text=True).load_data(
    ["https://docs.cohere.com/v2/docs/prompt-tuner"]
)  # you can replace this with any other reader or documents
index = VectorStoreIndex.from_documents(documents=documents)

# create reranker
cohere_rerank = CohereRerank(
    api_key="COHERE_API_KEY", model="rerank-english-v3.0", top_n=2
)

# query the index: retrieve 10 candidate nodes, then rerank and keep the top 2
query_engine = index.as_query_engine(
    similarity_top_k=10,
    node_postprocessors=[cohere_rerank],
)

print(query_engine)

# generate a response
response = query_engine.query(
    "What is Cohere Prompt Tuner?",
)

print(response)

# To view the source documents
from llama_index.core.response.pprint_utils import pprint_response

pprint_response(response, show_source=True)
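
The query engine above works in two stages: it first retrieves similarity_top_k candidate nodes, then the reranker rescores them and keeps top_n. The sketch below illustrates only that control flow in plain Python. The function retrieve_then_rerank, the keyword-overlap scorer, and the sample texts and scores are all hypothetical stand-ins; they are not Cohere's Rerank model or LlamaIndex's implementation.

```python
from typing import Callable, List, Tuple

ScoredNode = Tuple[str, float]  # (text, score)


def retrieve_then_rerank(
    candidates: List[ScoredNode],
    rerank_score: Callable[[str], float],
    similarity_top_k: int,
    top_n: int,
) -> List[ScoredNode]:
    """Shortlist by retrieval score, then rescore and keep the best top_n."""
    # Stage 1: keep the similarity_top_k candidates with the highest retrieval score.
    shortlisted = sorted(candidates, key=lambda n: n[1], reverse=True)[
        :similarity_top_k
    ]
    # Stage 2: rescore the shortlist with the reranker and keep top_n.
    rescored = [(text, rerank_score(text)) for text, _ in shortlisted]
    return sorted(rescored, key=lambda n: n[1], reverse=True)[:top_n]


def toy_rerank_score(text: str) -> float:
    """Hypothetical scorer: counts query terms present in the text."""
    query_terms = {"prompt", "tuner"}
    return float(len(query_terms & set(text.lower().split())))


candidates = [
    ("Prompt Tuner optimizes prompts", 0.4),
    ("Billing and pricing", 0.9),
    ("Tuner settings for prompts", 0.5),
    ("Unrelated text", 0.1),
]
top = retrieve_then_rerank(
    candidates, toy_rerank_score, similarity_top_k=3, top_n=2
)
print(top)
```

Note how the candidate with the highest retrieval score can drop out after reranking, which is the reason for retrieving more candidates than you ultimately keep.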

Cohere RAG with LlamaIndex

The following example uses Cohere’s chat model, embeddings and rerank functionality to generate a response based on your data.

PYTHON
from llama_index.llms.cohere import Cohere
from llama_index.embeddings.cohere import CohereEmbedding
from llama_index.postprocessor.cohere_rerank import CohereRerank
from llama_index.core import Settings
from llama_index.core import VectorStoreIndex
from llama_index.readers.web import (
    SimpleWebPageReader,
)  # first, run `pip install llama-index-readers-web`

# Create the embedding model
embed_model = CohereEmbedding(
    api_key="COHERE_API_KEY",
    model_name="embed-english-v3.0",
    input_type="search_query",
)

# Configure the global Settings with the Cohere model for generation and the embedding model
Settings.llm = Cohere(
    api_key="COHERE_API_KEY", model="command-r-plus"
)
Settings.embed_model = embed_model

# Create the index (we are using an example page from Cohere's docs)
documents = SimpleWebPageReader(html_to_text=True).load_data(
    ["https://docs.cohere.com/v2/docs/prompt-tuner"]
)  # you can replace this with any other reader or documents
index = VectorStoreIndex.from_documents(documents=documents)

# Create a Cohere reranker
cohere_rerank = CohereRerank(
    api_key="COHERE_API_KEY", model="rerank-english-v3.0", top_n=2
)

# Create the query engine
query_engine = index.as_query_engine(
    node_postprocessors=[cohere_rerank]
)

# Generate the response
response = query_engine.query("What is Cohere Prompt Tuner?")
print(response)

Cohere Tool Use (Function Calling) with LlamaIndex

To use Cohere’s tool use functionality with LlamaIndex, wrap your Python functions with the FunctionTool class and pass the resulting tools, together with a Cohere model, to an agent that decides when to call them.

PYTHON
from llama_index.llms.cohere import Cohere
from llama_index.core.tools import FunctionTool
from llama_index.core.agent import FunctionCallingAgent


# Define tools
def multiply(a: int, b: int) -> int:
    """Multiply two integers and return the result."""
    return a * b


multiply_tool = FunctionTool.from_defaults(fn=multiply)


def add(a: int, b: int) -> int:
    """Add two integers and return the result."""
    return a + b


add_tool = FunctionTool.from_defaults(fn=add)

# Define LLM
llm = Cohere(api_key="COHERE_API_KEY", model="command-r-plus")

# Create agent
agent = FunctionCallingAgent.from_tools(
    [multiply_tool, add_tool],
    llm=llm,
    verbose=True,
    allow_parallel_tool_calls=True,
)

# Run agent (top-level `await` works in a notebook; in a script,
# call this from an async function run with `asyncio.run`)
response = await agent.achat("What is (121 * 3) + (5 * 8)?")
print(response)
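
To check the agent's answer to (121 * 3) + (5 * 8), it helps to work the arithmetic by hand using the same two tool functions. The sketch below calls them directly, without any model or API call. The order of calls is an assumption about how an agent would likely decompose the question, not a trace of actual agent output.

```python
def multiply(a: int, b: int) -> int:
    """Multiply two integers and return the result."""
    return a * b


def add(a: int, b: int) -> int:
    """Add two integers and return the result."""
    return a + b


# The two products are independent, so an agent could request them in parallel.
left = multiply(121, 3)   # 363
right = multiply(5, 8)    # 40

# The sum depends on both products, so it has to come last.
expected = add(left, right)
print(expected)  # 403
```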