Embeddings are a way to represent the meaning of texts, images, or information as a list of numbers. Using a simple comparison function, we can then calculate a similarity score for two embeddings to figure out whether two pieces of information are about similar things. Common use-cases for embeddings include semantic search, clustering, and classification.
In the example below we use the embed-v4.0 model to generate embeddings for 3 phrases and compare them using a similarity function. The two similar phrases have a high similarity score, and the embeddings for two unrelated phrases have a low similarity score:
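Calling the model requires an API key, so the sketch below only illustrates the comparison step. It assumes cosine similarity as the similarity function (the text does not name one) and uses small made-up vectors in place of real embed-v4.0 output; the phrase descriptions in the comments are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 4-dimensional vectors standing in for real embeddings (illustrative values only).
phrase_a = [0.9, 0.1, 0.0, 0.2]  # a phrase about staying at home
phrase_b = [0.8, 0.2, 0.1, 0.3]  # a second phrase on the same topic
phrase_c = [0.0, 0.9, 0.8, 0.1]  # a phrase on an unrelated topic

similar_score = cosine_similarity(phrase_a, phrase_b)
unrelated_score = cosine_similarity(phrase_a, phrase_c)
print(f"similar: {similar_score:.2f}, unrelated: {unrelated_score:.2f}")
```

With real embeddings the same function applies unchanged; only the vectors come from the Embed API instead of being written by hand.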
The input_type parameter
Cohere embeddings are optimized for different types of inputs.
- For semantic search, embed the search query with input_type="search_query".
- For semantic search, embed the text passages being searched over with input_type="search_document".
- For classification and clustering tasks, you can set input_type to either 'classification' or 'clustering' to optimize the embeddings appropriately.
- When input_type='image' is used with embed-v3.0, the expected input to be embedded is an image instead of text. If you use input_type='image' with embed-v4.0, it will default to search_document. We recommend using search_document when working with embed-v4.0.
embed-v4.0 is a best-in-class multilingual model with support for over 100 languages, including Korean, Japanese, Arabic, Chinese, Spanish, and French.
The Cohere Embedding platform supports image embeddings for embed-v4.0 and the embed-v3.0 family. There are two ways to access this functionality:
- Pass image to the input_type parameter. Here are the steps:
  - Set the input_type parameter to image.
- Pass your image to the images parameter. Here are the steps:
  - Pass a list of dicts with the key content.
  - Each content value is a list of dicts with the keys type and image.
When using the images parameter the following restrictions exist:
- Pass image to the input_type parameter (as discussed above).
- Pass your image to the images parameter.
Be aware that image embedding has the following restrictions:
- If input_type='image', the texts field must be empty.
- The image must be in png, jpeg, webp, or gif format and can be up to 5 MB in size.
- The image must be base64-encoded and sent as a Data URL to the images parameter.
- Batch image embedding is not supported for embed-v3.0 models. For embed-v4.0, however, you can submit up to 96 images.
When using the inputs parameter the following restrictions exist (note these restrictions apply to embed-v4.0):
- input_type must be set to one of the following:
  - search_query
  - search_document
  - classification
  - clustering
Here’s a code sample using the inputs parameter:
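A minimal sketch of such a request follows. It only builds the keyword arguments and does not contact the API. Several details are assumptions rather than facts taken from this page: the item keys "image_url" and "url", the Data URL encoding of the image, and the commented-out SDK call. The helper names are hypothetical.

```python
import base64

ALLOWED_INPUT_TYPES = {"search_query", "search_document", "classification", "clustering"}


def image_to_data_url(image_bytes, mime_type="image/png"):
    """Base64-encode raw image bytes as a Data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_inputs_request(image_bytes, input_type="search_document"):
    """Build keyword arguments for an embed-v4.0 call that uses `inputs`."""
    if input_type not in ALLOWED_INPUT_TYPES:
        raise ValueError(f"unsupported input_type: {input_type}")
    return {
        "model": "embed-v4.0",
        "input_type": input_type,
        "embedding_types": ["float"],
        "inputs": [
            {
                "content": [
                    # Key names here are assumed, not confirmed by this page.
                    {"type": "image_url", "image_url": {"url": image_to_data_url(image_bytes)}}
                ]
            }
        ],
    }


# Placeholder bytes standing in for a real PNG file read from disk.
request = build_inputs_request(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16)

# Assumed usage with the Cohere Python SDK (not executed here):
#   co = cohere.ClientV2(api_key="...")
#   response = co.embed(**request)
```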
Here’s a code sample using the images parameter:
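The sketch below prepares a request for the images parameter and checks the format and size restrictions listed earlier before anything is sent. It does not call the API. The Data URL encoding, the interpretation of 5 MB as 5 * 1024 * 1024 bytes, and the commented-out SDK call are assumptions; `build_images_request` is a hypothetical helper.

```python
import base64

SUPPORTED_FORMATS = {"png", "jpeg", "webp", "gif"}
MAX_IMAGE_BYTES = 5 * 1024 * 1024  # assumes "5 MB" means 5 MiB


def build_images_request(image_bytes, image_format, model="embed-v4.0"):
    """Validate one image and build keyword arguments that use `images`."""
    if image_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported image format: {image_format}")
    if len(image_bytes) > MAX_IMAGE_BYTES:
        raise ValueError("image exceeds the 5 MB limit")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = f"data:image/{image_format};base64,{encoded}"
    # No "texts" key: the texts field must be empty when input_type is "image".
    return {
        "model": model,
        "input_type": "image",
        "embedding_types": ["float"],
        "images": [data_url],
    }


# Placeholder bytes standing in for a real PNG file read from disk.
sample_bytes = b"\x89PNG\r\n\x1a\n" + b"\x00" * 16
request = build_images_request(sample_bytes, "png")

# Assumed usage with the Cohere Python SDK (not executed here):
#   co = cohere.ClientV2(api_key="...")
#   response = co.embed(**request)
```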
embed-v4.0 supports text and content-rich images such as figures, slide decks, and document screenshots (i.e. screenshots of PDF pages). This eliminates the need for complex text extraction or ETL pipelines. Unlike our previous embed-v3.0 model family, embed-v4.0 is capable of processing both images and texts together; the inputs can either be an image that contains both text and visual content, or text and images that you'd like to compress into a single vector representation.
Here’s a code sample illustrating how embed-v4.0 could be used to work with fused images and texts like the following:
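As a rough sketch, a fused input can be expressed as one inputs entry whose content list holds a text item next to an image item, so that both are compressed into a single vector. The item keys ("text", "image_url", "url"), the Data URL encoding, and the example caption are assumptions made for illustration; `fused_input` is a hypothetical helper and the API is not called.

```python
import base64


def fused_input(text, image_bytes, mime_type="image/png"):
    """Combine a text snippet and an image into one `inputs` entry."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "content": [
            {"type": "text", "text": text},
            # Key names for the image item are assumed.
            {"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{encoded}"}},
        ]
    }


# Placeholder bytes standing in for a real slide or page screenshot.
slide_bytes = b"\x89PNG\r\n\x1a\n" + b"\x00" * 16

request = {
    "model": "embed-v4.0",
    "input_type": "search_document",
    "embedding_types": ["float"],
    "inputs": [fused_input("Quarterly revenue by region", slide_bytes)],
}

# Assumed usage with the Cohere Python SDK (not executed here):
#   co = cohere.ClientV2(api_key="...")
#   response = co.embed(**request)  # one vector per entry in `inputs`
```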

Matryoshka learning creates embeddings with a coarse-to-fine representation within a single vector. embed-v4.0 supports the following output dimensions: 256, 512, 1024, and 1536. To choose one, specify the output_dimension parameter when creating the embeddings.
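To make the coarse-to-fine idea concrete, the sketch below keeps the leading components of a full-length vector and re-normalizes them. This is only an illustration of the concept on a synthetic vector: in practice you would pass output_dimension to the API, and this page does not state that the API result equals a locally truncated vector. `truncate_embedding` is a hypothetical helper.

```python
import math

SUPPORTED_DIMENSIONS = (256, 512, 1024, 1536)


def truncate_embedding(vector, output_dimension):
    """Keep the first `output_dimension` values and rescale to unit length."""
    if output_dimension not in SUPPORTED_DIMENSIONS:
        raise ValueError(f"output_dimension must be one of {SUPPORTED_DIMENSIONS}")
    if output_dimension > len(vector):
        raise ValueError("vector is shorter than the requested dimension")
    prefix = vector[:output_dimension]
    norm = math.sqrt(sum(x * x for x in prefix))
    return [x / norm for x in prefix]


# Synthetic 1536-dimensional vector standing in for a real embedding.
full = [math.sin(i) / (1 + i / 100) for i in range(1536)]
coarse = truncate_embedding(full, 256)
print(len(full), len(coarse))
```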
The Cohere embeddings platform supports compression. The Embed API features an embedding_types parameter, which lets the user specify various ways of compressing the output.
The following embedding types are supported:
- float
- int8
- uint8
- binary
- ubinary
We recommend being explicit about the embedding type(s). To specify an embedding type, pass one of the types from the list above as a list containing a single string:
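A minimal sketch of how such a request could be put together, with a guard against the easy mistake of passing a bare string instead of a list. The helper `build_embed_request` is hypothetical, the API is not called, and the commented-out SDK call is an assumption.

```python
SUPPORTED_EMBEDDING_TYPES = ("float", "int8", "uint8", "binary", "ubinary")


def build_embed_request(texts, embedding_types):
    """Build keyword arguments for an embed call with explicit embedding types."""
    if isinstance(embedding_types, str):
        raise TypeError("embedding_types must be a list of strings, e.g. ['float']")
    unknown = [t for t in embedding_types if t not in SUPPORTED_EMBEDDING_TYPES]
    if unknown:
        raise ValueError(f"unsupported embedding types: {unknown}")
    return {
        "model": "embed-v4.0",
        "input_type": "search_document",
        "texts": list(texts),
        "embedding_types": list(embedding_types),
    }


request = build_embed_request(["hello world"], ["int8"])

# Assumed usage with the Cohere Python SDK (not executed here):
#   co = cohere.ClientV2(api_key="...")
#   response = co.embed(**request)
```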
You can specify multiple embedding types in a single call. For example, the following call will return both int8 and float embeddings:
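The arguments for such a call might look like the sketch below, followed by a rough per-vector storage comparison of the two requested types. The storage figures assume 32-bit floats and 1024 dimensions, neither of which is stated on this page, and the commented-out SDK call is likewise an assumption.

```python
request = {
    "model": "embed-v4.0",
    "input_type": "search_document",
    "texts": ["hello", "goodbye"],
    "embedding_types": ["int8", "float"],  # both types requested in one call
}

# Assumed usage with the Cohere Python SDK (not executed here):
#   co = cohere.ClientV2(api_key="...")
#   response = co.embed(**request)

# Rough storage per vector, to show why the compressed type is worth requesting.
BYTES_PER_VALUE = {"float": 4, "int8": 1}  # assumes 32-bit floats
dimension = 1024  # assumed output dimension
storage_bytes = {t: BYTES_PER_VALUE[t] * dimension for t in request["embedding_types"]}
print(storage_bytes)
```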
When doing binary compression, there’s a subtlety worth pointing out: because Cohere packs bits into bytes under the hood, the actual length of the vector changes. A binary embedding with 1024 dimensions is returned as 1024 / 8 = 128 bytes, which can be confusing if you run len(embeddings). This code shows how to unpack it so it works if you’re using a vector database that does not take bytes for binary:
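The following sketch unpacks a packed vector back into one value per dimension using only the standard library. It runs on synthetic bytes rather than a real API response, covers the unsigned (ubinary) case only, and assumes the most significant bit of each byte comes first, which is also the default bit order of numpy.unpackbits. The final mapping to -1/+1 is optional and depends on what your vector database expects.

```python
def unpack_bits(packed_bytes):
    """Expand each byte (0..255) into 8 bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in packed_bytes for shift in range(7, -1, -1)]


# Synthetic stand-in for a 1024-dimension ubinary embedding: 128 byte values.
packed = [(i * 37) % 256 for i in range(128)]

bits = unpack_bits(packed)

# Optional: map {0, 1} to {-1, +1} for stores that expect signed values.
signed = [2 * bit - 1 for bit in bits]

print(len(packed), len(bits))
```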