Haystack and Cohere (Integration Guide)
Haystack is an open-source LLM framework in Python by deepset for building customizable, production-ready LLM applications. You can use Cohere’s /embed, /generate, /chat, and /rerank models with Haystack.
Cohere’s Haystack integration provides four components that can be used in various Haystack pipelines, including retrieval augmented generation, chat, indexing, and so forth:
- The CohereDocumentEmbedder: To use Cohere embedding models to index documents into vector databases.
- The CohereTextEmbedder: To use Cohere embedding models to do embedding retrieval.
- The CohereGenerator: To use Cohere’s text generation models.
- The CohereChatGenerator: To use Cohere’s chat completion endpoints.
Prerequisites
To use Cohere and Haystack you will need:
- The cohere-haystack integration installed. To install it, run pip install cohere-haystack. If you run into any issues or want more details, see these docs.
- A Cohere API Key. For more details on pricing see this page. When you create an account with Cohere, we automatically create a trial API key for you. You can copy this key from the dashboard, where it also appears in the section called “API Keys”.
Cohere Chat with Haystack
Haystack’s CohereChatGenerator component enables chat completion using Cohere’s large language models (LLMs). For the latest information on Cohere Chat see these docs.
In the example below, you will need to add your Cohere API key. We suggest using an environment variable, COHERE_API_KEY. Don’t commit API keys to source control!
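A minimal sketch of such a call, assuming the import paths used by the Haystack 2.x Cohere integration. The helper names require_api_key and chat_once are illustrative and belong to neither library.

```python
import os
from typing import Mapping


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the Cohere API key from the environment, or fail with a clear message."""
    key = env.get("COHERE_API_KEY")
    if not key:
        raise RuntimeError("Set the COHERE_API_KEY environment variable first.")
    return key


def chat_once(question: str):
    """Send one user message to Cohere Chat and return the first reply (a ChatMessage)."""
    require_api_key()  # fail early instead of at request time

    # Import paths are assumed from the Haystack 2.x Cohere integration.
    from haystack.dataclasses import ChatMessage
    from haystack.utils import Secret
    from haystack_integrations.components.generators.cohere import CohereChatGenerator

    # The key is read from the environment, never hard-coded.
    generator = CohereChatGenerator(api_key=Secret.from_env_var("COHERE_API_KEY"))
    result = generator.run(messages=[ChatMessage.from_user(question)])
    return result["replies"][0]
```

Calling chat_once("What's Natural Language Processing?") needs network access and a valid key; the returned object is the first entry of the generator's "replies" list.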
You can pass additional dynamic variables to the LLM, like so:
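One way this might look, sketched under the assumption that a recent Haystack 2.x release with ChatPromptBuilder is installed. The template text, the location variable, and the two helper functions are hypothetical names chosen for illustration.

```python
# Plain-text templates; {{ location }} is filled in at run time.
SYSTEM_TEXT = "You are a helpful, respectful and honest assistant."
USER_TEMPLATE = "Tell me about {{ location }}."


def build_chat_pipeline():
    """Wire a prompt builder to the Cohere chat generator."""
    # Import paths are assumed from Haystack 2.x and its Cohere integration.
    from haystack import Pipeline
    from haystack.components.builders import ChatPromptBuilder
    from haystack.utils import Secret
    from haystack_integrations.components.generators.cohere import CohereChatGenerator

    pipe = Pipeline()
    pipe.add_component("prompt_builder", ChatPromptBuilder())
    pipe.add_component("llm", CohereChatGenerator(api_key=Secret.from_env_var("COHERE_API_KEY")))
    # The rendered chat messages become the generator's input.
    pipe.connect("prompt_builder.prompt", "llm.messages")
    return pipe


def ask_about(pipe, location: str):
    """Run the pipeline with the template and its variable supplied at run time."""
    if not location.strip():
        raise ValueError("location must not be empty")

    from haystack.dataclasses import ChatMessage

    messages = [ChatMessage.from_system(SYSTEM_TEXT), ChatMessage.from_user(USER_TEMPLATE)]
    return pipe.run(
        {"prompt_builder": {"template": messages, "template_variables": {"location": location}}}
    )
```

Because both the template and its variables are passed to run, the same pipeline object can be reused with different prompts, for example ask_about(pipe, "Berlin").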
Cohere Chat with Retrieval Augmentation
This Haystack retrieval augmented generation (RAG) pipeline passes Cohere’s documentation to a Cohere model, so it can better explain Cohere’s capabilities. In the example below, you can see the LinkContentFetcher replacing a classic retriever. The contents of the URL are passed to our generator.
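A rough sketch of such a pipeline, assuming Haystack 2.x component names (LinkContentFetcher, HTMLToDocument, ChatPromptBuilder). The prompt wording and the helper functions are illustrative, and the URL is left as a parameter for the caller to supply.

```python
# The fetched article and the user's question are both filled in at run time.
ARTICLE_TEMPLATE = (
    "Answer the following question based on the contents of the article: {{ query }}\n"
    "Article: {{ documents[0].content }}\n"
)


def build_url_rag_pipeline():
    """Fetch a web page, convert it to documents, and pass it to Cohere Chat."""
    # Import paths are assumed from Haystack 2.x and its Cohere integration.
    from haystack import Pipeline
    from haystack.components.builders import ChatPromptBuilder
    from haystack.components.converters import HTMLToDocument
    from haystack.components.fetchers import LinkContentFetcher
    from haystack.utils import Secret
    from haystack_integrations.components.generators.cohere import CohereChatGenerator

    pipe = Pipeline()
    pipe.add_component("fetcher", LinkContentFetcher())
    pipe.add_component("converter", HTMLToDocument())
    # Declaring "documents" lets the converter feed the prompt builder directly.
    pipe.add_component("prompt_builder", ChatPromptBuilder(variables=["documents"]))
    pipe.add_component("llm", CohereChatGenerator(api_key=Secret.from_env_var("COHERE_API_KEY")))
    pipe.connect("fetcher.streams", "converter.sources")
    pipe.connect("converter.documents", "prompt_builder.documents")
    pipe.connect("prompt_builder.prompt", "llm.messages")
    return pipe


def ask_about_url(pipe, url: str, question: str):
    """Answer a question using the page at `url` as the only context."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must start with http:// or https://")

    from haystack.dataclasses import ChatMessage

    return pipe.run(
        {
            "fetcher": {"urls": [url]},
            "prompt_builder": {
                "template": [ChatMessage.from_user(ARTICLE_TEMPLATE)],
                "template_variables": {"query": question},
            },
        }
    )
```

The fetcher takes the place of a retriever here: there is no document store, and only the first fetched document is placed in the prompt.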
Use Cohere Models in Haystack RAG Pipelines
RAG provides an LLM with context, allowing it to generate better answers. You can use any of Cohere’s models in a Haystack RAG pipeline with the CohereGenerator.
The code sample below adds a set of documents to an InMemoryDocumentStore, then uses those documents to answer a question. You’ll need your Cohere API key to run it.
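A minimal sketch of that flow, assuming Haystack 2.x import paths and a BM25 retriever over the in-memory store. The sample documents, the prompt template, and the answer_from_store helper are illustrative choices.

```python
# Two toy documents to retrieve from.
SAMPLE_TEXTS = ["Rome is the capital of Italy.", "Paris is the capital of France."]

# Jinja template: retrieved documents first, then the question.
PROMPT_TEMPLATE = """Given the following information, answer the question.
Context:
{% for document in documents %}
  {{ document.content }}
{% endfor %}
Question: {{ query }}
"""


def answer_from_store(query: str):
    """Retrieve matching documents and let a Cohere model answer from them."""
    if not query.strip():
        raise ValueError("query must not be empty")

    # Import paths are assumed from Haystack 2.x and its Cohere integration.
    from haystack import Document, Pipeline
    from haystack.components.builders import PromptBuilder
    from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
    from haystack.document_stores.in_memory import InMemoryDocumentStore
    from haystack.utils import Secret
    from haystack_integrations.components.generators.cohere import CohereGenerator

    store = InMemoryDocumentStore()
    store.write_documents([Document(content=text) for text in SAMPLE_TEXTS])

    pipe = Pipeline()
    pipe.add_component("retriever", InMemoryBM25Retriever(document_store=store))
    pipe.add_component("prompt_builder", PromptBuilder(template=PROMPT_TEMPLATE))
    pipe.add_component("llm", CohereGenerator(api_key=Secret.from_env_var("COHERE_API_KEY")))
    pipe.connect("retriever.documents", "prompt_builder.documents")
    pipe.connect("prompt_builder.prompt", "llm.prompt")
    # The query goes to the retriever and into the prompt.
    return pipe.run({"retriever": {"query": query}, "prompt_builder": {"query": query}})
```

A call such as answer_from_store("What is the capital of France?") returns the pipeline's result dictionary; the generated text is expected under the "llm" key.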
Although these examples use an InMemoryDocumentStore to keep things simple, Haystack supports a variety of vector database and document store options. You can use any of them in combination with Cohere models.
Cohere Embeddings with Haystack
You can use Cohere’s embedding models within your Haystack RAG pipelines. The list of all supported models can be found in Cohere’s model documentation. Set an environment variable for your COHERE_API_KEY before running the code samples below.
Although these examples use an InMemoryDocumentStore to keep things simple, Haystack supports a variety of vector database and document store options.
Index Documents with Haystack and Cohere Embeddings
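An indexing pipeline could be sketched as follows, assuming Haystack 2.x import paths and the embedder's default model. The sample texts and the index_documents helper are illustrative.

```python
# Toy texts to embed and store.
SAMPLE_TEXTS = [
    "My name is Wolfgang and I live in Berlin.",
    "I saw a black horse running.",
    "Germany has many big cities.",
]


def index_documents(document_store, texts=SAMPLE_TEXTS):
    """Embed each text with Cohere and write the result to the document store."""
    if not texts:
        raise ValueError("texts must contain at least one entry")

    # Import paths are assumed from Haystack 2.x and its Cohere integration.
    from haystack import Document, Pipeline
    from haystack.components.writers import DocumentWriter
    from haystack.utils import Secret
    from haystack_integrations.components.embedders.cohere import CohereDocumentEmbedder

    pipe = Pipeline()
    pipe.add_component(
        "embedder", CohereDocumentEmbedder(api_key=Secret.from_env_var("COHERE_API_KEY"))
    )
    pipe.add_component("writer", DocumentWriter(document_store=document_store))
    # Embedded documents flow straight from the embedder into the writer.
    pipe.connect("embedder.documents", "writer.documents")
    return pipe.run({"embedder": {"documents": [Document(content=t) for t in texts]}})
```

To try it, create a store with InMemoryDocumentStore() from haystack.document_stores.in_memory and pass it as document_store; the same store object can then be handed to a retrieval pipeline.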
Retrieving Documents with Haystack and Cohere Embeddings
After the indexing pipeline has added the embeddings to the document store, you can build a retrieval pipeline that gets the most relevant documents from your database. This can also form the basis of RAG pipelines, where a generator component can be added at the end.
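A retrieval pipeline along these lines might look like the sketch below, assuming Haystack 2.x import paths and a document store that already holds Cohere embeddings. The retrieve helper and DEFAULT_TOP_K are hypothetical names.

```python
DEFAULT_TOP_K = 3  # illustrative number of documents to return


def retrieve(document_store, query: str, top_k: int = DEFAULT_TOP_K):
    """Embed the query with Cohere and return the closest stored documents."""
    if not query.strip():
        raise ValueError("query must not be empty")

    # Import paths are assumed from Haystack 2.x and its Cohere integration.
    from haystack import Pipeline
    from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
    from haystack.utils import Secret
    from haystack_integrations.components.embedders.cohere import CohereTextEmbedder

    pipe = Pipeline()
    pipe.add_component(
        "text_embedder", CohereTextEmbedder(api_key=Secret.from_env_var("COHERE_API_KEY"))
    )
    pipe.add_component(
        "retriever", InMemoryEmbeddingRetriever(document_store=document_store, top_k=top_k)
    )
    # The query embedding is compared against the stored document embeddings.
    pipe.connect("text_embedder.embedding", "retriever.query_embedding")
    result = pipe.run({"text_embedder": {"text": query}})
    return result["retriever"]["documents"]
```

The text embedder should use the same embedding model as the document embedder used at indexing time, otherwise the similarity scores are not meaningful. To extend this into a RAG pipeline, the retriever's documents output can be connected to a prompt builder and generator.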