Introduction to Embeddings at Cohere


Embeddings are a way to represent the meaning of text as a list of numbers. Using a simple comparison function, we can then calculate a similarity score for two embeddings to figure out whether two texts are talking about similar things. Common use cases for embeddings include semantic search, clustering, and classification.

In the example below we use the embed-english-v3.0 model to generate embeddings for 3 phrases and compare them using a similarity function. The two similar phrases have a high similarity score, and the embeddings for two unrelated phrases have a low similarity score:

PYTHON
import cohere
import numpy as np

co = cohere.ClientV2(api_key="YOUR_API_KEY")

# get the embeddings
phrases = ["i love soup", "soup is my favorite", "london is far away"]

model = "embed-english-v3.0"
input_type = "search_query"

res = co.embed(
    texts=phrases,
    model=model,
    input_type=input_type,
    embedding_types=["float"],
)

(soup1, soup2, london) = res.embeddings.float

# compare them using cosine similarity
def calculate_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

calculate_similarity(soup1, soup2)  # 0.85 - very similar!
calculate_similarity(soup1, london)  # 0.16 - not similar!

The input_type parameter

Cohere embeddings are optimized for different types of inputs. For example, when using embeddings for semantic search, the search query should be embedded by setting input_type="search_query", whereas the text passages that are being searched over should be embedded with input_type="search_document". You can find more details and a code snippet in the Semantic Search guide. Similarly, the input type can be set to classification or clustering to optimize the embeddings for those use cases.
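
As a rough sketch of how the two input types fit together in a search flow, the helper below ranks document vectors against a query vector by cosine similarity. The function name rank_documents and the toy vectors are illustrative assumptions, not part of the Cohere SDK. In practice the query vector would come from a co.embed call with input_type="search_query" and the document vectors from a call with input_type="search_document".

```python
import numpy as np


def rank_documents(query_embedding, document_embeddings):
    """Return (index, score) pairs sorted from most to least similar."""
    query = np.asarray(query_embedding, dtype=float)
    docs = np.asarray(document_embeddings, dtype=float)

    # Cosine similarity between the query and every document row.
    scores = docs @ query / (np.linalg.norm(docs, axis=1) * np.linalg.norm(query))

    # argsort is ascending, so reverse it to put the best match first.
    order = np.argsort(scores)[::-1]
    return [(int(i), float(scores[i])) for i in order]


# Toy 3-dimensional vectors standing in for real embeddings (hypothetical data).
toy_query = [1.0, 0.0, 0.0]
toy_documents = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.9, 0.1, 0.0],  # nearly parallel to the query
    [0.5, 0.5, 0.0],  # somewhere in between
]

ranking = rank_documents(toy_query, toy_documents)
print([index for index, _ in ranking])  # [1, 2, 0]
```

Sorting by score keeps the example short. For large document collections a vector index is usually a better fit than a full scan.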

Multilingual Support

In addition to embed-english-v3.0 we offer a best-in-class multilingual model embed-multilingual-v3.0 with support for over 100 languages, including Chinese, Spanish, and French. This model can be used with the Embed API, just like its English counterpart:

PYTHON
import cohere

co = cohere.ClientV2(api_key="<YOUR API KEY>")

texts = [
    'Hello from Cohere!', 'مرحبًا من كوهير!', 'Hallo von Cohere!',
    'Bonjour de Cohere!', '¡Hola desde Cohere!', 'Olá do Cohere!',
    'Ciao da Cohere!', '您好,来自 Cohere!', 'कोहेरे से नमस्ते!'
]

response = co.embed(
    model='embed-multilingual-v3.0',
    texts=texts,
    input_type='classification',
    embedding_types=['float'],
)

embeddings = response.embeddings.float  # All text embeddings
print(embeddings[0][:5])  # Print the first five values of the first text's embedding
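
One way to check how close a set of embeddings, such as translations of the same phrase, sit to one another is to compute all pairwise similarities at once. The sketch below assumes only NumPy. The name similarity_matrix and the toy vectors are illustrative. With real data you would pass the list of embeddings returned by the Embed API instead.

```python
import numpy as np


def similarity_matrix(embeddings):
    """Cosine similarity between every pair of rows in `embeddings`."""
    vectors = np.asarray(embeddings, dtype=float)
    # Normalize each row to unit length so a dot product equals cosine similarity.
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    return unit @ unit.T


# Toy 2-dimensional vectors standing in for real embeddings (hypothetical data).
toy_embeddings = [
    [1.0, 0.0],
    [2.0, 0.0],  # same direction as the first row, different length
    [0.0, 3.0],  # orthogonal to the first two rows
]

matrix = similarity_matrix(toy_embeddings)
print(np.round(matrix, 2))
```

Entry (i, j) of the result is the similarity between text i and text j, so the diagonal is always 1.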

Compression Levels

The Cohere embeddings platform supports compression. The Embed API features a required parameter, embedding_types, which allows the user to specify various ways of compressing the output.

The following embedding types are now supported:

  • float
  • int8
  • uint8
  • binary
  • ubinary

To specify an embedding type, pass one of the types from the list above as a list containing a single string:

PYTHON
ret = co.embed(
    texts=phrases,
    model=model,
    input_type=input_type,
    embedding_types=['int8'],
)

ret.embeddings.int8  # This contains your int8 embeddings
ret.embeddings.float  # This will be empty
ret.embeddings.uint8  # This will be empty
ret.embeddings.ubinary  # This will be empty
ret.embeddings.binary  # This will be empty

Finally, you can also pass several embedding types in a single list, in which case the endpoint will return a dictionary with each requested type available:

PYTHON
ret = co.embed(
    texts=phrases,
    model=model,
    input_type=input_type,
    embedding_types=['int8', 'float'],
)

ret.embeddings.int8  # This contains your int8 embeddings
ret.embeddings.float  # This contains your float embeddings
ret.embeddings.uint8  # This will be empty
ret.embeddings.ubinary  # This will be empty
ret.embeddings.binary  # This will be empty
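
To give a feel for why the compressed types matter, the sketch below compares the storage needed for a single vector at different precisions. It only illustrates the general idea using NumPy. The scaling and sign-thresholding rules are assumptions made for this example, not a description of how the Embed API computes its int8 or binary outputs, and the 1024-dimension size is likewise an assumed example value.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
dimensions = 1024  # assumed example size

# A random stand-in for a float embedding (hypothetical data).
float_vector = rng.standard_normal(dimensions).astype(np.float32)

# Illustrative int8 quantization: scale into [-127, 127] and round.
scale = 127.0 / np.max(np.abs(float_vector))
int8_vector = np.round(float_vector * scale).astype(np.int8)

# Illustrative binary quantization: keep only the sign of each dimension,
# then pack 8 bits into each byte.
bits = (float_vector > 0).astype(np.uint8)
packed_bits = np.packbits(bits)

print(float_vector.nbytes)  # 4096: 4 bytes per dimension
print(int8_vector.nbytes)   # 1024: 1 byte per dimension
print(packed_bits.nbytes)   # 128: 1 bit per dimension
```

The trade-off is that lower-precision vectors take less space and are faster to compare, at the cost of some loss in similarity accuracy.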