Introduction to Embeddings at Cohere

Embeddings are a way to represent the meaning of texts, images, or information as a list of numbers. Using a simple comparison function, we can then calculate a similarity score for two embeddings to figure out whether two pieces of information are about similar things. Common use-cases for embeddings include semantic search, clustering, and classification.

In the example below we use the embed-v4.0 model to generate embeddings for 3 phrases and compare them using a similarity function. The two similar phrases have a high similarity score, and the embeddings for two unrelated phrases have a low similarity score:

PYTHON
import cohere
import numpy as np

co = cohere.ClientV2(api_key="YOUR_API_KEY")

# get the embeddings
phrases = ["i love soup", "soup is my favorite", "london is far away"]

model = "embed-v4.0"
input_type = "search_query"

res = co.embed(
    texts=phrases,
    model=model,
    input_type=input_type,
    output_dimension=1024,
    embedding_types=["float"],
)

soup1, soup2, london = res.embeddings.float


# compare them using cosine similarity
def calculate_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))


print(
    f"For the following sentences:\n1: {phrases[0]}\n2: {phrases[1]}\nThe similarity score is: {calculate_similarity(soup1, soup2):.2f}\n"
)
print(
    f"For the following sentences:\n1: {phrases[0]}\n2: {phrases[2]}\nThe similarity score is: {calculate_similarity(soup1, london):.2f}"
)
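
The similarity function above is cosine similarity, which depends on the direction of two vectors and not on their length. As a minimal sketch that runs without an API key, the following uses hand-written three-dimensional vectors (stand-ins chosen for illustration, not model output) and ranks a few hypothetical documents against a query:

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (assumption for illustration).
query = [1.0, 0.0, 1.0]
documents = {
    "doc_a": [2.0, 0.0, 2.0],  # same direction as the query, different length
    "doc_b": [1.0, 1.0, 0.0],  # partially aligned with the query
    "doc_c": [0.0, 1.0, 0.0],  # orthogonal to the query
}

scores = {
    name: cosine_similarity(query, vec) for name, vec in documents.items()
}

# Sort document names from most to least similar.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

The same ranking step applies to real embeddings: embed the query and the documents, score each pair, and sort.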

The input_type parameter

Cohere embeddings are optimized for different types of inputs.

  • When using embeddings for semantic search, the search query should be embedded with input_type="search_query".
  • When using embeddings for semantic search, the text passages that are being searched over should be embedded with input_type="search_document".
  • When using embeddings for classification and clustering tasks, you can set input_type to either "classification" or "clustering" to optimize the embeddings appropriately.
  • When input_type="image" is used with embed-v3.0, the expected input is an image instead of text. If you use input_type="image" with embed-v4.0, it defaults to search_document. We recommend using search_document when working with embed-v4.0.
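
One way to keep these choices consistent across a codebase is a small lookup that builds the arguments for co.embed. This is only a sketch: the task names and the build_embed_kwargs helper are hypothetical, and only the input_type values themselves come from the list above.

```python
# Hypothetical task names mapped to the input_type values listed above.
INPUT_TYPE_BY_TASK = {
    "query": "search_query",
    "document": "search_document",
    "classification": "classification",
    "clustering": "clustering",
}


def build_embed_kwargs(texts, task, model="embed-v4.0"):
    """Return keyword arguments for co.embed() for the given task."""
    if task not in INPUT_TYPE_BY_TASK:
        raise ValueError(
            f"Unknown task {task!r}; expected one of {sorted(INPUT_TYPE_BY_TASK)}"
        )
    return {
        "texts": list(texts),
        "model": model,
        "input_type": INPUT_TYPE_BY_TASK[task],
        "embedding_types": ["float"],
    }


query_kwargs = build_embed_kwargs(["best soup recipes"], task="query")
doc_kwargs = build_embed_kwargs(
    ["i love soup", "london is far away"], task="document"
)
print(query_kwargs["input_type"], doc_kwargs["input_type"])
```

With a client created as in the first example, the result would be passed as co.embed(**query_kwargs).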

Multilingual Support

embed-v4.0 is a best-in-class multilingual model with support for over 100 languages, including Korean, Japanese, Arabic, Chinese, Spanish, and French.

PYTHON
import cohere

co = cohere.ClientV2(api_key="YOUR_API_KEY")

texts = [
    "Hello from Cohere!",
    "مرحبًا من كوهير!",
    "Hallo von Cohere!",
    "Bonjour de Cohere!",
    "¡Hola desde Cohere!",
    "Olá do Cohere!",
    "Ciao da Cohere!",
    "您好,来自 Cohere!",
    "कोहेरे से नमस्ते!",
]

response = co.embed(
    model="embed-v4.0",
    texts=texts,
    input_type="classification",
    output_dimension=1024,
    embedding_types=["float"],
)

embeddings = response.embeddings.float  # All text embeddings
print(embeddings[0][:5])  # Print the first five values of the first embedding

Image Embeddings

The Cohere Embedding platform supports image embeddings for embed-v4.0 and the embed-v3.0 family. There are two ways to access this functionality:

  • Use the images parameter. Here are the steps:
    • Pass image to the input_type parameter
    • Pass your image (as a Data URL) to the images parameter
  • Use the new inputs parameter. Here are the steps:
    • Pass in an inputs list of dicts with the key content
    • content contains a list of dicts with the keys type and image_url

When using the images parameter, the following restrictions apply:

  • If input_type='image', the texts field must be empty.
  • The original image file must be in png, jpeg, webp, or gif format and can be up to 5 MB in size.
  • The image must be base64 encoded and sent as a Data URL to the images parameter.
  • Our API currently does not support batch image embeddings for embed-v3.0 models. For embed-v4.0, however, you can submit up to 96 images.
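
The format and size restrictions above can be checked client-side before a request is sent. The helper below is a hypothetical sketch using only the standard library; it reads "5 MB" as 5 × 1024 × 1024 bytes, which is an assumption, since the exact byte limit is not stated here.

```python
import base64

ALLOWED_FORMATS = {"png", "jpeg", "webp", "gif"}
MAX_IMAGE_BYTES = 5 * 1024 * 1024  # assumption: "5 MB" interpreted as 5 MiB


def to_data_url(image_bytes, image_format):
    """Validate raw image bytes and return them as a base64 Data URL."""
    fmt = image_format.lower()
    if fmt == "jpg":  # common file extension for the jpeg format
        fmt = "jpeg"
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"Unsupported image format: {image_format!r}")
    if len(image_bytes) > MAX_IMAGE_BYTES:
        raise ValueError("Image exceeds the assumed 5 MB limit")
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:image/{fmt};base64,{encoded}"


# Placeholder bytes, not a real image; used only to show the output shape.
example_url = to_data_url(b"abc", "PNG")
print(example_url)
```

The helper only validates the declared format and the byte count; it does not inspect the bytes to confirm they are a valid image.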

When using the inputs parameter, the following restrictions apply (note that these restrictions apply to embed-v4.0):

  • The maximum payload size is 20 MB
  • All images larger than 2,458,624 pixels will be downsampled to 2,458,624 pixels
  • All images smaller than 3,136 (56x56) pixels will be upsampled to 3,136 pixels
  • input_type must be set to one of the following:
    • search_query
    • search_document
    • classification
    • clustering
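
To see which of the pixel limits above would apply to a given image, a small check on width × height is enough. This is a sketch with a hypothetical helper name; it only classifies the image and does not attempt to reproduce how the service resizes it.

```python
MAX_PIXELS = 2_458_624  # images above this pixel count are downsampled
MIN_PIXELS = 3_136  # 56 x 56; images below this pixel count are upsampled


def resampling_action(width, height):
    """Return which resampling rule applies to an image of this size."""
    pixels = width * height
    if pixels > MAX_PIXELS:
        return "downsample"
    if pixels < MIN_PIXELS:
        return "upsample"
    return "unchanged"


for size in [(1920, 1440), (1568, 1568), (56, 56), (32, 32)]:
    print(size, resampling_action(*size))
```

Note that 1568 × 1568 is exactly 2,458,624 pixels, so a square image of that size sits right at the upper limit.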

Here’s a code sample using the inputs parameter:

PYTHON
import cohere
from PIL import Image
from io import BytesIO
import base64

co = cohere.ClientV2(api_key="YOUR_API_KEY")


# The model accepts input in base64 as a Data URL
def image_to_base64_data_url(image_path):
    # Open the image file
    with Image.open(image_path) as img:
        image_format = img.format.lower()
        buffered = BytesIO()
        img.save(buffered, format=img.format)
        # Encode the image data in base64
        img_base64 = base64.b64encode(buffered.getvalue()).decode(
            "utf-8"
        )

    # Create the Data URL with the inferred image type
    data_url = f"data:image/{image_format};base64,{img_base64}"
    return data_url


base64_url = image_to_base64_data_url("<PATH_TO_IMAGE>")

# Named image_input to avoid shadowing the built-in input()
image_input = {
    "content": [
        {"type": "image_url", "image_url": {"url": base64_url}}
    ]
}

res = co.embed(
    model="embed-v4.0",
    embedding_types=["float"],
    input_type="search_document",
    inputs=[image_input],
    output_dimension=1024,
)

res.embeddings.float

Here’s a code sample using the images parameter:

PYTHON
import cohere
from PIL import Image
from io import BytesIO
import base64

co = cohere.ClientV2(api_key="YOUR_API_KEY")


# The model accepts input in base64 as a Data URL
def image_to_base64_data_url(image_path):
    # Open the image file
    with Image.open(image_path) as img:
        # Create a BytesIO object to hold the image data in memory
        buffered = BytesIO()
        # Save the image as PNG to the BytesIO object
        img.save(buffered, format="PNG")
        # Encode the image data in base64
        img_base64 = base64.b64encode(buffered.getvalue()).decode(
            "utf-8"
        )

    # Create the Data URL; the image was re-encoded as PNG above
    data_url = f"data:image/png;base64,{img_base64}"
    return data_url


processed_image = image_to_base64_data_url("<PATH_TO_IMAGE>")

res = co.embed(
    images=[processed_image],
    model="embed-v4.0",
    embedding_types=["float"],
    input_type="image",
)

res.embeddings.float

Support for Mixed Content Embeddings

embed-v4.0 supports text and content-rich images such as figures, slide decks, and document screenshots (e.g., screenshots of PDF pages). This eliminates the need for complex text extraction or ETL pipelines. Unlike our previous embed-v3.0 model family, embed-v4.0 is capable of processing both images and texts together; the inputs can either be an image that contains both text and visual content, or text and images that you’d like to compress into a single vector representation.

Here’s a code sample illustrating how embed-v4.0 can be used to work with fused images and texts:

PYTHON
import cohere
import base64

co = cohere.ClientV2(api_key="YOUR_API_KEY")

# Read the image file and encode it in base64
with open("./content/finn.jpeg", "rb") as image_file:
    encoded_string = base64.b64encode(image_file.read()).decode(
        "utf-8"
    )

# Format as a Data URL
data_url = f"data:image/jpeg;base64,{encoded_string}"

example_doc = [
    {"type": "text", "text": "This is a Scottish Fold Cat"},
    {"type": "image_url", "image_url": {"url": data_url}},
]  # This is where we're fusing text and images.

res = co.embed(
    model="embed-v4.0",
    inputs=[{"content": example_doc}],
    input_type="search_document",
    embedding_types=["float"],
    output_dimension=1024,
).embeddings.float_

# This will return a list of length 1 with the texts and image in a combined embedding

res
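
When many documents need the same fused structure, the content list can be built by a small helper. The sketch below is hypothetical (build_fused_input is not part of the SDK) and uses placeholder bytes rather than a real image; it only reproduces the dict layout used in the sample above.

```python
import base64


def build_fused_input(text, image_bytes, image_format="jpeg"):
    """Build one inputs entry that fuses a text and an image."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    data_url = f"data:image/{image_format};base64,{encoded}"
    return {
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": data_url}},
        ]
    }


# Placeholder bytes stand in for real image files (assumption for illustration).
fused_inputs = [
    build_fused_input("This is a Scottish Fold Cat", b"\x00\x01\x02"),
    build_fused_input("Quarterly revenue chart", b"\x03\x04", "png"),
]
print(len(fused_inputs), fused_inputs[0]["content"][0]["text"])
```

The resulting list would be passed as inputs=fused_inputs to co.embed. The sample above shows one fused input producing one combined embedding; that each additional entry yields its own embedding is an assumption here.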

Matryoshka Embeddings

Matryoshka learning creates embeddings with a coarse-to-fine representation within a single vector. embed-v4.0 supports the following output dimensions: 256, 512, 1024, and 1536. To select one, specify the output_dimension parameter when creating the embeddings.

PYTHON
texts = ["hello"]

response = co.embed(
    model="embed-v4.0",
    texts=texts,
    output_dimension=1024,
    input_type="classification",
    embedding_types=["float"],
).embeddings

# print out the embeddings
response.float  # returns a vector that is 1024 dimensions
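
In Matryoshka-style training generally, "coarse-to-fine" means that the leading values of a vector form a usable, lower-dimensional embedding on their own. The sketch below illustrates that idea on a toy vector by keeping a prefix and re-normalizing it. Whether client-side truncation reproduces what the API returns for a smaller output_dimension is not stated here, so treat this as an illustration of the concept and request the dimension you need through output_dimension.

```python
import math


def truncate_and_normalize(vector, dim):
    """Keep the first `dim` values of a vector and rescale to unit length."""
    if not 0 < dim <= len(vector):
        raise ValueError("dim must be between 1 and len(vector)")
    prefix = vector[:dim]
    norm = math.sqrt(sum(x * x for x in prefix))
    if norm == 0:
        raise ValueError("cannot normalize an all-zero prefix")
    return [x / norm for x in prefix]


# Toy 8-dimensional vector standing in for a real embedding (assumption).
full = [3.0, 4.0, 1.0, 2.0, 0.5, 0.5, 0.1, 0.1]

coarse = truncate_and_normalize(full, 2)
fine = truncate_and_normalize(full, 8)
print(len(coarse), len(fine))
```

Re-normalizing matters when vectors are compared with a dot product; cosine similarity, as used in the first example, already divides by the vector norms.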

Compression Levels

The Cohere embeddings platform supports compression. The Embed API features an embedding_types parameter which allows the user to specify various ways of compressing the output.

The following embedding types are supported:

  • float
  • int8
  • uint8
  • binary
  • ubinary

We recommend being explicit about the embedding type(s). To specify an embedding type, pass one of the types from the list above as a string inside a list:

PYTHON
res = co.embed(
    texts=["hello_world"],
    model="embed-v4.0",
    input_type="search_document",
    embedding_types=["int8"],
)

You can specify multiple embedding types in a single call. For example, the following call will return both int8 and float embeddings:

PYTHON
res = co.embed(
    texts=phrases,
    model="embed-v4.0",
    input_type=input_type,
    embedding_types=["int8", "float"],
)

res.embeddings.int8  # This contains your int8 embeddings
res.embeddings.float  # This contains your float embeddings
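
To get a rough sense of what each embedding type saves, the sketch below estimates the storage needed per vector. The bit widths are assumptions: 32 bits per value for float (the storage width is not stated here), 8 bits for int8 and uint8, and 1 bit per dimension for binary and ubinary, which are packed into bytes.

```python
# Assumed storage width per dimension, in bits.
BITS_PER_DIMENSION = {
    "float": 32,  # assumption: stored as 32-bit floats
    "int8": 8,
    "uint8": 8,
    "binary": 1,  # packed, 8 dimensions per byte
    "ubinary": 1,  # packed, 8 dimensions per byte
}


def bytes_per_vector(dimension, embedding_type):
    """Estimated storage for one embedding, in bytes."""
    return dimension * BITS_PER_DIMENSION[embedding_type] // 8


for dim in (256, 512, 1024, 1536):
    sizes = {t: bytes_per_vector(dim, t) for t in BITS_PER_DIMENSION}
    print(dim, sizes)
```

Under these assumptions, a 1024-dimensional vector takes 4,096 bytes as float, 1,024 bytes as int8, and 128 bytes as binary.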

A Note on Bits and Bytes

When using binary compression, there’s a subtlety worth pointing out: because Cohere packs bits into bytes under the hood, the length of the returned vector changes. A 1024-dimensional binary embedding is returned as 1024/8 = 128 bytes, which can be confusing if you call len() on it. This code shows how to unpack the bytes back into 1024 values, which is useful if your vector database does not accept packed bytes for binary vectors:

PYTHON
import numpy as np

res = co.embed(
    model="embed-v4.0",
    texts=["hello"],
    input_type="search_document",
    embedding_types=["ubinary"],
    output_dimension=1024,
)
print(
    f"Embed v4 Binary at 1024 dimensions results in length {len(res.embeddings.ubinary[0])}"
)

# Unpack each byte into 8 bits, then map {0, 1} to {-1, +1}
query_emb_bin = np.asarray(res.embeddings.ubinary[0], dtype="uint8")
query_emb_unpacked = np.unpackbits(query_emb_bin, axis=-1).astype(
    "int"
)
query_emb_unpacked = 2 * query_emb_unpacked - 1
print(
    f"Embed v4 Binary at 1024 unpacked will have dimensions:{len(query_emb_unpacked)}"
)
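
To make the packing concrete without calling the API, here is a pure-Python sketch that packs 16 bits into 2 bytes and unpacks them again. It uses most-significant-bit-first order, which matches the default behaviour of np.unpackbits used above; the helper names are hypothetical.

```python
def pack_bits(bits):
    """Pack a list of 0/1 values into bytes, most significant bit first."""
    if len(bits) % 8 != 0:
        raise ValueError("number of bits must be a multiple of 8")
    packed = []
    for start in range(0, len(bits), 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | bit
        packed.append(byte)
    return packed


def unpack_bits(packed):
    """Expand each byte back into 8 bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in packed for shift in range(7, -1, -1)]


bits = [1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1]
packed = pack_bits(bits)  # 16 bits become 2 bytes
unpacked = unpack_bits(packed)  # back to 16 values
signed = [2 * b - 1 for b in unpacked]  # map {0, 1} to {-1, +1}
print(len(bits), len(packed), len(unpacked))
```

The same ratio explains the lengths in the example above: 1024 bits pack into 128 bytes, and unpacking restores 1024 values.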