Introduction to Cohere Embeddings

What are Embeddings

Embeddings are a flexible way to represent data (words, images, and so on) as points in an n-dimensional space. When the embedding is done well, similar data points are located close to one another and dissimilar data points are located far apart.

When applied to language, embeddings function as a way to capture the meaning of text as a vector (that is, a list) of numbers. This is useful because once text is in this form, it can be compared numerically with other text, which supports similarity search, clustering, classification, and other use cases.
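
To make "compared numerically" concrete, here is a minimal sketch of cosine similarity, one common way to score how close two vectors are. The three-dimensional vectors and their labels are made up for illustration and do not come from any real embedding model; real embeddings typically have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings" (invented values, not model output).
cat = [0.90, 0.80, 0.10]
kitten = [0.85, 0.75, 0.20]
invoice = [0.10, 0.20, 0.95]

sim_related = cosine_similarity(cat, kitten)
sim_unrelated = cosine_similarity(cat, invoice)

print(f"cat vs kitten:  {sim_related:.3f}")
print(f"cat vs invoice: {sim_unrelated:.3f}")
```

With these toy values, the first pair scores close to 1 and the second pair scores much lower, which is the behavior one would hope for from vectors that encode meaning.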

In the example below, the embeddings for two similar phrases have a high similarity score, and the embeddings for two unrelated phrases have a low similarity score: