Multilingual Embed Models

At Cohere, we are committed to breaking down barriers and expanding access to cutting-edge NLP technologies that power projects across the globe. By making our multilingual language models available to all developers, we move closer to our goal of giving developers, researchers, and innovators state-of-the-art NLP technologies that push the boundaries of Language AI.

Our multilingual model maps text to a semantic vector space, placing texts with similar meanings close together. This unlocks a range of valuable use cases in multilingual settings. For example, during a search you can map the query into this vector space and retrieve the documents whose vectors lie nearby. Because matching is based on meaning rather than on shared words, this often yields search results that are several times better than keyword search.
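To make the idea of "nearby" concrete, here is a minimal sketch that ranks documents by cosine similarity to a query vector. The three-dimensional vectors are made up for illustration and stand in for real embeddings, and the helper names `cosine_similarity` and `rank_documents` are hypothetical, not part of our SDK.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_documents(query_vec, doc_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Toy vectors standing in for real embeddings (values invented for illustration).
doc_vecs = [
    [0.9, 0.1, 0.0],  # document 0: close to the query
    [0.0, 0.2, 0.9],  # document 1: unrelated to the query
    [0.8, 0.3, 0.1],  # document 2: also close to the query
]
query_vec = [1.0, 0.2, 0.0]

ranking = rank_documents(query_vec, doc_vecs)
print([index for index, _ in ranking])  # document indices, best match first
```

Cosine similarity is one common choice of distance here; a production system would typically delegate the nearest-neighbor lookup to a vector index rather than scoring every document in a loop.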

Use Cases

Get Started

To get started using the multilingual embed models, you can either query our endpoints or install our SDK to use the model within Python:

import cohere

co = cohere.Client(api_key="<YOUR API KEY>")

# The same greeting in nine languages.
texts = [
    'Hello from Cohere!', 'مرحبًا من كوهير!', 'Hallo von Cohere!',
    'Bonjour de Cohere!', '¡Hola desde Cohere!', 'Olá do Cohere!',
    'Ciao da Cohere!', '您好,来自 Cohere!', 'कोहेरे से नमस्ते!'
]
response = co.embed(
    texts=texts,
    input_type='classification',
    embedding_types=['float'],
    model='embed-multilingual-v3.0',
)
embeddings = response.embeddings.float  # One embedding per input text
print(embeddings[0][:5])  # First five values of the first text's embedding
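For the search use case described earlier, the v3 embed models also accept `input_type='search_document'` for the texts being indexed and `input_type='search_query'` for the query. The sketch below shows one way to combine the two calls. The helper names `embed_texts` and `search` are hypothetical, the client is passed in as `co` so that the block itself needs no API key, and the response is assumed to have the same shape as in the example above.

```python
import math

def embed_texts(co, texts, input_type, model="embed-multilingual-v3.0"):
    """Embed texts with the given input_type and return a list of float vectors."""
    response = co.embed(
        texts=texts,
        input_type=input_type,
        embedding_types=["float"],
        model=model,
    )
    return response.embeddings.float

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def search(co, query, documents, top_k=3):
    """Return the top_k (score, document) pairs for the query, best match first."""
    doc_vecs = embed_texts(co, documents, "search_document")
    query_vec = embed_texts(co, [query], "search_query")[0]
    scored = [(cosine(query_vec, vec), doc) for vec, doc in zip(doc_vecs, documents)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]

# Usage, assuming a real client and API key:
#   import cohere
#   co = cohere.Client(api_key="<YOUR API KEY>")
#   for score, doc in search(co, "greeting", ["Hallo von Cohere!", "Bonjour de Cohere!"]):
#       print(round(score, 3), doc)
```

Because queries and documents share one vector space, the query can be written in a different language from the documents it retrieves.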

Model Performance

| Model | Clustering | Search - English | Search - Multilingual | Cross-lingual Classification |
| --- | --- | --- | --- | --- |
| Cohere: embed-multilingual-v3.0 | 55.0766.8 |
| Cohere: embed-multilingual-light-v3.0 | 5.0.7565.8 |
| Cohere: embed-multilingual-v2.0 | 51.0 | 55.8 | 51.4 | 64.6 |
| Sentence-transformers: paraphrase-multilingual-mpnet-base-v2 | 46.7 | 44.4 | 15.3 | 56.1 |
| Google: LaBSE | 41.0 | 20.9 | 13.2 | 59.2 |
| Google: Universal Sentence Encoder | 40.1 | 14.3 | 3.4 | 59.8 |

List of Supported Languages

Our multilingual embed model supports over 100 languages, including Chinese, Spanish, and French.

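| ISO Code | Language Name |
| --- | --- |
| af | Afrikaans |
| am | Amharic |
| ar | Arabic |
| as | Assamese |
| az | Azerbaijani |
| be | Belarusian |
| bg | Bulgarian |
| bn | Bengali |
| bo | Tibetan |
| bs | Bosnian |
| ca | Catalan |
| ceb | Cebuano |
| co | Corsican |
| cs | Czech |
| cy | Welsh |
| da | Danish |
| de | German |
| el | Greek |
| en | English |
| eo | Esperanto |
| es | Spanish |
| et | Estonian |
| eu | Basque |
| fa | Persian |
| fi | Finnish |
| fr | French |
| fy | Frisian |
| ga | Irish |
| gd | Scots Gaelic |
| gl | Galician |
| gu | Gujarati |
| ha | Hausa |
| haw | Hawaiian |
| he | Hebrew |
| hi | Hindi |
| hmn | Hmong |
| hr | Croatian |
| ht | Haitian Creole |
| hu | Hungarian |
| hy | Armenian |
| id | Indonesian |
| ig | Igbo |
| is | Icelandic |
| it | Italian |
| ja | Japanese |
| jv | Javanese |
| ka | Georgian |
| kk | Kazakh |
| km | Khmer |
| kn | Kannada |
| ko | Korean |
| ku | Kurdish |
| ky | Kyrgyz |
| la | Latin |
| lb | Luxembourgish |
| lo | Laothian |
| lt | Lithuanian |
| lv | Latvian |
| mg | Malagasy |
| mi | Maori |
| mk | Macedonian |
| ml | Malayalam |
| mn | Mongolian |
| mr | Marathi |
| ms | Malay |
| mt | Maltese |
| my | Burmese |
| ne | Nepali |
| nl | Dutch |
| no | Norwegian |
| ny | Nyanja |
| or | Oriya |
| pa | Punjabi |
| pl | Polish |
| pt | Portuguese |
| ro | Romanian |
| ru | Russian |
| rw | Kinyarwanda |
| si | Sinhalese |
| sk | Slovak |
| sl | Slovenian |
| sm | Samoan |
| sn | Shona |
| so | Somali |
| sq | Albanian |
| sr | Serbian |
| st | Sesotho |
| su | Sundanese |
| sv | Swedish |
| sw | Swahili |
| ta | Tamil |
| te | Telugu |
| tg | Tajik |
| th | Thai |
| tk | Turkmen |
| tl | Tagalog |
| tr | Turkish |
| tt | Tatar |
| ug | Uighur |
| uk | Ukrainian |
| ur | Urdu |
| uz | Uzbek |
| vi | Vietnamese |
| wo | Wolof |
| xh | Xhosa |
| yi | Yiddish |
| yo | Yoruba |
| zh | Chinese |
| zu | Zulu |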