Embed models generate embeddings from text or classify it based on various parameters. Embeddings can be used to estimate semantic similarity between two texts, choose the sentence most likely to follow another sentence, or categorize user feedback. When used with the Classify endpoint, embeddings can support a wide range of classification and analysis tasks.
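As a rough illustration, the snippet below is a minimal sketch of calling the Embed endpoint with the Cohere Python SDK. The API key is a placeholder, and the exact response shape can vary by SDK version, so treat this as a sketch rather than a definitive integration.

```python
import cohere

# Placeholder API key; substitute your own.
co = cohere.Client("YOUR_API_KEY")

# Embed a small batch of texts. input_type is required for v3 embed
# models; "search_document" is one of the accepted values.
response = co.embed(
    texts=["The food was delicious.", "The meal tasted great."],
    model="embed-english-v3.0",
    input_type="search_document",
)

# response.embeddings holds one float vector per input text.
print(len(response.embeddings[0]))  # 1024 for embed-english-v3.0
```

The resulting vectors can then be compared using the similarity metric listed for the model in the tables below.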

English Models

| Latest Model | Description | Dimensions | Max Tokens (Context Length) | Similarity Metric | Endpoints |
| --- | --- | --- | --- | --- | --- |
| embed-english-v3.0 | A model that allows for text to be classified or turned into embeddings. English only. | 1024 | 512 | Cosine Similarity | Embed, Embed Jobs |
| embed-english-light-v3.0 | A smaller, faster version of embed-english-v3.0. Almost as capable, but a lot faster. English only. | 384 | 512 | Cosine Similarity | Embed, Embed Jobs |
| embed-english-v2.0 | Our older embeddings model that allows for text to be classified or turned into embeddings. English only. | 4096 | 512 | Cosine Similarity | Classify, Embed |
| embed-english-light-v2.0 | A smaller, faster version of embed-english-v2.0. Almost as capable, but a lot faster. English only. | 1024 | 512 | Cosine Similarity | Classify, Embed |
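All of the models in this table are paired with cosine similarity. As a plain NumPy sketch (not Cohere-specific code), comparing two embeddings with that metric looks like this:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Two made-up 1024-dimensional vectors, matching the dimensionality of
# embed-english-v3.0; in practice these would come from the Embed endpoint.
vec_a = np.random.rand(1024)
vec_b = np.random.rand(1024)
print(cosine_similarity(vec_a, vec_b))
```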

Multi-Lingual Models

| Latest Model | Description | Dimensions | Max Tokens (Context Length) | Similarity Metric | Endpoints |
| --- | --- | --- | --- | --- | --- |
| embed-multilingual-v3.0 | Provides multilingual classification and embedding support. See the list of supported languages below. | 1024 | 512 | Cosine Similarity | Embed, Embed Jobs |
| embed-multilingual-light-v3.0 | A smaller, faster version of embed-multilingual-v3.0. Almost as capable, but a lot faster. Supports multiple languages. | 384 | 512 | Cosine Similarity | Embed, Embed Jobs |
| embed-multilingual-v2.0 | Provides multilingual classification and embedding support. See the list of supported languages below. | 768 | 256 | Dot Product Similarity | Classify, Embed |
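Because the multilingual models embed different languages into a shared vector space, a typical use is comparing a sentence with its translations or running cross-lingual search. The sketch below reuses the hypothetical SDK call from above with embed-multilingual-v3.0; the example sentences are purely illustrative. Note that embed-multilingual-v2.0 is the one model above scored with dot product similarity rather than cosine similarity.

```python
import cohere
import numpy as np

co = cohere.Client("YOUR_API_KEY")  # placeholder key

# The same question in English, Spanish, and French.
texts = [
    "Where is the nearest train station?",
    "¿Dónde está la estación de tren más cercana?",
    "Où se trouve la gare la plus proche ?",
]

response = co.embed(
    texts=texts,
    model="embed-multilingual-v3.0",
    input_type="search_document",
)

vectors = np.array(response.embeddings)  # shape (3, 1024)

# Cosine similarity of the English sentence against each translation.
normed = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
print(normed[0] @ normed[1], normed[0] @ normed[2])
```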

List of Supported Languages

Our multilingual embed model supports over 100 languages, including Chinese, Spanish, and French.

| ISO Code | Language Name |
| --- | --- |
| af | Afrikaans |
| am | Amharic |
| ar | Arabic |
| as | Assamese |
| az | Azerbaijani |
| be | Belarusian |
| bg | Bulgarian |
| bn | Bengali |
| bo | Tibetan |
| bs | Bosnian |
| ca | Catalan |
| ceb | Cebuano |
| co | Corsican |
| cs | Czech |
| cy | Welsh |
| da | Danish |
| de | German |
| el | Greek |
| en | English |
| eo | Esperanto |
| es | Spanish |
| et | Estonian |
| eu | Basque |
| fa | Persian |
| fi | Finnish |
| fr | French |
| fy | Frisian |
| ga | Irish |
| gd | Scots Gaelic |
| gl | Galician |
| gu | Gujarati |
| ha | Hausa |
| haw | Hawaiian |
| he | Hebrew |
| hi | Hindi |
| hmn | Hmong |
| hr | Croatian |
| ht | Haitian Creole |
| hu | Hungarian |
| hy | Armenian |
| id | Indonesian |
| ig | Igbo |
| is | Icelandic |
| it | Italian |
| ja | Japanese |
| jv | Javanese |
| ka | Georgian |
| kk | Kazakh |
| km | Khmer |
| kn | Kannada |
| ko | Korean |
| ku | Kurdish |
| ky | Kyrgyz |
| la | Latin |
| lb | Luxembourgish |
| lo | Laothian |
| lt | Lithuanian |
| lv | Latvian |
| mg | Malagasy |
| mi | Maori |
| mk | Macedonian |
| ml | Malayalam |
| mn | Mongolian |
| mr | Marathi |
| ms | Malay |
| mt | Maltese |
| my | Burmese |
| ne | Nepali |
| nl | Dutch |
| no | Norwegian |
| ny | Nyanja |
| or | Oriya |
| pa | Punjabi |
| pl | Polish |
| pt | Portuguese |
| ro | Romanian |
| ru | Russian |
| rw | Kinyarwanda |
| si | Sinhalese |
| sk | Slovak |
| sl | Slovenian |
| sm | Samoan |
| sn | Shona |
| so | Somali |
| sq | Albanian |
| sr | Serbian |
| st | Sesotho |
| su | Sundanese |
| sv | Swedish |
| sw | Swahili |
| ta | Tamil |
| te | Telugu |
| tg | Tajik |
| th | Thai |
| tk | Turkmen |
| tl | Tagalog |
| tr | Turkish |
| tt | Tatar |
| ug | Uighur |
| uk | Ukrainian |
| ur | Urdu |
| uz | Uzbek |
| vi | Vietnamese |
| wo | Wolof |
| xh | Xhosa |
| yi | Yiddish |
| yo | Yoruba |
| zh | Chinese |
| zu | Zulu |

Frequently Asked Questions

What is the Context Length for Cohere Embeddings Models?

You can find the context length for various Cohere embeddings models in the tables above. It's in the "Max Tokens (Context Length)" column.
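If a document is longer than the model's context length, a common approach is to split it into chunks and embed each chunk separately. The helper below is a rough sketch that uses a word-count heuristic as a stand-in for real token counting; actual token counts depend on the model's tokenizer, so the chunk size here is an assumption to tune.

```python
def chunk_text(text: str, max_words: int = 350) -> list[str]:
    """Split text into word-bounded chunks.

    max_words=350 is a conservative stand-in for a 512-token limit;
    real token counts depend on the model's tokenizer.
    """
    words = text.split()
    return [
        " ".join(words[i : i + max_words])
        for i in range(0, len(words), max_words)
    ]

long_document = " ".join(["lorem"] * 2000)  # stand-in for a long document
chunks = chunk_text(long_document)
# Each chunk can then be sent to the Embed endpoint on its own.
print(len(chunks))
```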