An Overview of Cohere’s Rerank Model
How Rerank Works
The Rerank API endpoint, powered by the Rerank models, is a simple yet powerful tool for semantic search. Given a `query` and a list of `documents`, Rerank orders the documents from most to least semantically relevant to the query.
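The input/output contract described above can be sketched without calling the API. The scorer below is a deliberately crude word-overlap stand-in (an assumption for illustration, not how the Rerank models compute relevance), and the names `toy_score` and `toy_rerank` are hypothetical. Only the shape of the result, an original index plus a score, sorted from most to least relevant, mirrors what the endpoint returns.

```python
import re


def toy_score(query: str, document: str) -> float:
    """Hypothetical stand-in scorer: fraction of query words found in the document.

    The real Rerank models judge semantic relevance; this is only a placeholder.
    """
    query_words = set(re.findall(r"\w+", query.lower()))
    doc_words = set(re.findall(r"\w+", document.lower()))
    if not query_words:
        return 0.0
    return len(query_words & doc_words) / len(query_words)


def toy_rerank(query: str, documents: list[str]) -> list[dict]:
    """Order documents from most to least relevant, keeping their original indices."""
    scored = [
        {"index": i, "relevance_score": toy_score(query, doc)}
        for i, doc in enumerate(documents)
    ]
    return sorted(scored, key=lambda r: r["relevance_score"], reverse=True)


sample_docs = [
    "Saipan is the capital of the Northern Mariana Islands.",
    "Washington, D.C. is the capital of the United States.",
    "Bananas are rich in potassium.",
]
toy_results = toy_rerank("capital of the United States", sample_docs)
for result in toy_results:
    print(result["index"], round(result["relevance_score"], 2))
```

Each result carries the position of the document in the input list, so the caller can look up the original text after sorting.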
Get Started
Example with Texts
In the example below, we use the Rerank API endpoint to order the list of `documents` from most to least relevant to the query `What is the capital of the United States?`.
Request
In this example, the documents being passed in are a list of strings:

    import cohere

    # Create a client; replace the placeholder with your own API key.
    co = cohere.Client(api_key="<YOUR API KEY>")

    query = "What is the capital of the United States?"
    docs = [
        "Carson City is the capital city of the American state of Nevada. At the 2010 United States Census, Carson City had a population of 55,274.",
        "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean that are a political division controlled by the United States. Its capital is Saipan.",
        "Charlotte Amalie is the capital and largest city of the United States Virgin Islands. It has about 20,000 people. The city is on the island of Saint Thomas.",
        "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district. The President of the USA and many major national government offices are in the territory. This makes it the political center of the United States of America.",
        "Capital punishment has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states. The federal government (including the United States military) also uses capital punishment.",
    ]

    # Rank all five documents against the query and include their text in the response.
    results = co.rerank(
        model="rerank-v3.5",
        query=query,
        documents=docs,
        top_n=5,
        return_documents=True,
    )

Response

    {
      "id": "97813271-fe74-465d-b9d5-577e77079253",
      "results": [
        {
          "document": {
            "text": "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district. The President of the USA and many major national government offices are in the territory. This makes it the political center of the United States of America."
          },
          "index": 3,
          "relevance_score": 0.9990564
        },
        {
          "document": {
            "text": "Capital punishment has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states. The federal government (including the United States military) also uses capital punishment."
          },
          "index": 4,
          "relevance_score": 0.7516481
        },
        {
          "document": {
            "text": "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean that are a political division controlled by the United States. Its capital is Saipan."
          },
          "index": 1,
          "relevance_score": 0.08882029
        },
        {
          "document": {
            "text": "Carson City is the capital city of the American state of Nevada. At the 2010 United States Census, Carson City had a population of 55,274."
          },
          "index": 0,
          "relevance_score": 0.058238626
        },
        {
          "document": {
            "text": "Charlotte Amalie is the capital and largest city of the United States Virgin Islands. It has about 20,000 people. The city is on the island of Saint Thomas."
          },
          "index": 2,
          "relevance_score": 0.019946935
        }
      ],
      "meta": {
        "api_version": {
          "version": "2022-12-06"
        },
        "billed_units": {
          "search_units": 1
        }
      }
    }

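Each result's `index` refers to the document's position in the list that was sent in the request. A minimal sketch of mapping results back to the inputs follows. It uses a trimmed copy of the response above and shortened labels in place of the full document strings so that it runs without an API key; the helper name `rank_documents` is hypothetical.

```python
# Trimmed copy of the response shown above: only the fields needed here.
response = {
    "results": [
        {"index": 3, "relevance_score": 0.9990564},
        {"index": 4, "relevance_score": 0.7516481},
        {"index": 1, "relevance_score": 0.08882029},
        {"index": 0, "relevance_score": 0.058238626},
        {"index": 2, "relevance_score": 0.019946935},
    ]
}

# Shortened labels standing in for the five request documents, in the same order.
doc_labels = [
    "Carson City (Nevada)",
    "Saipan (Northern Mariana Islands)",
    "Charlotte Amalie (US Virgin Islands)",
    "Washington, D.C.",
    "Capital punishment",
]


def rank_documents(response: dict, documents: list[str]) -> list[tuple[str, float]]:
    """Pair each result with the original document that its `index` points to."""
    return [
        (documents[item["index"]], item["relevance_score"])
        for item in response["results"]
    ]


ranked = rank_documents(response, doc_labels)
for label, score in ranked:
    print(f"{score:.4f}  {label}")
```

Looking documents up by `index` also works when `return_documents` is not set and the response carries no document text.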
Example with Semi-structured Data
Alternatively, you can pass in JSON objects and specify the fields you’d like to rank over. If you do not pass in any `rank_fields`, it will default to the `text` key.
Request

    # Reuses the `co` client created in the previous example.
    query = "What is the capital of the United States?"
    docs = [
        {
            "Title": "Facts about Carson City",
            "Content": "Carson City is the capital city of the American state of Nevada. At the 2010 United States Census, Carson City had a population of 55,274.",
        },
        {
            "Title": "The Commonwealth of Northern Mariana Islands",
            "Content": "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean that are a political division controlled by the United States. Its capital is Saipan.",
        },
        {
            "Title": "The Capital of United States Virgin Islands",
            "Content": "Charlotte Amalie is the capital and largest city of the United States Virgin Islands. It has about 20,000 people. The city is on the island of Saint Thomas.",
        },
        {
            "Title": "Washington D.C.",
            "Content": "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district. The President of the USA and many major national government offices are in the territory. This makes it the political center of the United States of America.",
        },
        {
            "Title": "Capital Punishment in the US",
            "Content": "Capital punishment has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states. The federal government (including the United States military) also uses capital punishment.",
        },
    ]

    # Rank over the "Title" field first, then "Content".
    results = co.rerank(
        model="rerank-v3.5",
        query=query,
        documents=docs,
        rank_fields=["Title", "Content"],
        top_n=5,
        return_documents=True,
    )

In the `documents` parameter, we pass a list of objects that each have the keys `Title` and `Content`. As part of the Rerank call, we specify which keys to rank over, as well as the order in which the key-value pairs should be considered.
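Because the call names the keys to rank over, it can help to confirm beforehand that every document actually contains them. The sketch below is a hypothetical pre-flight check, not part of the Cohere SDK, and it makes no claim about how the API itself treats documents with missing keys. The helper name `find_missing_fields` and the shortened sample documents are assumptions for illustration.

```python
def find_missing_fields(documents: list[dict], rank_fields: list[str]) -> dict[int, list[str]]:
    """Return {document position: [missing keys]} for documents lacking any rank field."""
    problems = {}
    for position, doc in enumerate(documents):
        missing = [field for field in rank_fields if field not in doc]
        if missing:
            problems[position] = missing
    return problems


# Shortened sample documents; the second one has no "Content" key.
sample_docs = [
    {"Title": "Washington D.C.", "Content": "Washington, D.C. is the capital of the United States."},
    {"Title": "Capital Punishment in the US"},
]

problems = find_missing_fields(sample_docs, rank_fields=["Title", "Content"])
print(problems)
```

An empty dictionary means every document has all of the requested fields.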
Response

    RerankResponse(
        id="e8f55f3f-d86e-47d7-9b24-7feb18286505",
        results=[
            RerankResponseResultsItem(
                document=RerankResponseResultsItemDocument(
                    text=None,
                    Content=(
                        "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) "
                        "is the capital of the United States. It is a federal district. The President of the USA and many major "
                        "national government offices are in the territory. This makes it the political center of the United States of America."
                    ),
                    Title="Washington D.C.",
                ),
                index=3,
                relevance_score=0.8914433,
            ),
            RerankResponseResultsItem(
                document=RerankResponseResultsItemDocument(
                    text=None,
                    Content=(
                        "Charlotte Amalie is the capital and largest city of the United States Virgin Islands. "
                        "It has about 20,000 people. The city is on the island of Saint Thomas."
                    ),
                    Title="The Capital of United States Virgin Islands",
                ),
                index=2,
                relevance_score=0.40344992,
            ),
            RerankResponseResultsItem(
                document=RerankResponseResultsItemDocument(
                    text=None,
                    Content=(
                        "Carson City is the capital city of the American state of Nevada. At the 2010 United States Census, "
                        "Carson City had a population of 55,274."
                    ),
                    Title="Facts about Carson City",
                ),
                index=0,
                relevance_score=0.23343581,
            ),
            RerankResponseResultsItem(
                document=RerankResponseResultsItemDocument(
                    text=None,
                    Content=(
                        "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean that "
                        "are a political division controlled by the United States. Its capital is Saipan."
                    ),
                    Title="The Commonwealth of Northern Mariana Islands",
                ),
                index=1,
                relevance_score=0.15964958,
            ),
            RerankResponseResultsItem(
                document=RerankResponseResultsItemDocument(
                    text=None,
                    Content=(
                        "Capital punishment has existed in the United States since before the United States was a country. "
                        "As of 2017, capital punishment is legal in 30 of the 50 states. The federal government (including the United States military) "
                        "also uses capital punishment."
                    ),
                    Title="Capital Punishment in the US",
                ),
                index=4,
                relevance_score=0.10465127,
            ),
        ],
        meta=ApiMeta(
            api_version=ApiMetaApiVersion(
                version="1", is_deprecated=None, is_experimental=None
            ),
            billed_units=ApiMetaBilledUnits(
                images=None,
                input_tokens=None,
                output_tokens=None,
                search_units=1.0,
                classifications=None,
            ),
            tokens=None,
            warnings=None,
        ),
    )

Multilingual Reranking
Cohere’s Rerank models have been trained for performance across 100+ languages.
When choosing the model, please note the following language support:
- Rerank 3.0: Separate English-only and multilingual models (`rerank-english-v3.0` and `rerank-multilingual-v3.0`)
- Rerank 3.5: A single multilingual model (`rerank-v3.5`)
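The model choice described in the list above can be captured in a small helper. This is a minimal sketch: the function name `pick_rerank_model` and its parameters are hypothetical, and only the three model identifiers come from this page.

```python
def pick_rerank_model(family: str, english_only: bool = False) -> str:
    """Map a Rerank family ("3.0" or "3.5") to a model identifier from the list above.

    `english_only` matters only for the 3.0 family, which ships separate
    English-only and multilingual models.
    """
    if family == "3.5":
        # Rerank 3.5 is a single multilingual model.
        return "rerank-v3.5"
    if family == "3.0":
        return "rerank-english-v3.0" if english_only else "rerank-multilingual-v3.0"
    raise ValueError(f"Unknown Rerank family: {family!r}")


print(pick_rerank_model("3.0", english_only=True))
print(pick_rerank_model("3.0"))
print(pick_rerank_model("3.5"))
```

The returned string can be passed as the `model` argument of `co.rerank`, as in the earlier examples.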
The following table provides the list of languages supported by the Rerank models. Please note that performance may vary across languages.
ISO Code | Language Name |
---|---|
af | Afrikaans |
am | Amharic |
ar | Arabic |
as | Assamese |
az | Azerbaijani |
be | Belarusian |
bg | Bulgarian |
bn | Bengali |
bo | Tibetan |
bs | Bosnian |
ca | Catalan |
ceb | Cebuano |
co | Corsican |
cs | Czech |
cy | Welsh |
da | Danish |
de | German |
el | Greek |
en | English |
eo | Esperanto |
es | Spanish |
et | Estonian |
eu | Basque |
fa | Persian |
fi | Finnish |
fr | French |
fy | Frisian |
ga | Irish |
gd | Scots Gaelic |
gl | Galician |
gu | Gujarati |
ha | Hausa |
haw | Hawaiian |
he | Hebrew |
hi | Hindi |
hmn | Hmong |
hr | Croatian |
ht | Haitian Creole |
hu | Hungarian |
hy | Armenian |
id | Indonesian |
ig | Igbo |
is | Icelandic |
it | Italian |
ja | Japanese |
jv | Javanese |
ka | Georgian |
kk | Kazakh |
km | Khmer |
kn | Kannada |
ko | Korean |
ku | Kurdish |
ky | Kyrgyz |
la | Latin |
lb | Luxembourgish |
lo | Laothian |
lt | Lithuanian |
lv | Latvian |
mg | Malagasy |
mi | Maori |
mk | Macedonian |
ml | Malayalam |
mn | Mongolian |
mr | Marathi |
ms | Malay |
mt | Maltese |
my | Burmese |
ne | Nepali |
nl | Dutch |
no | Norwegian |
ny | Nyanja |
or | Oriya |
pa | Punjabi |
pl | Polish |
pt | Portuguese |
ro | Romanian |
ru | Russian |
rw | Kinyarwanda |
si | Sinhalese |
sk | Slovak |
sl | Slovenian |
sm | Samoan |
sn | Shona |
so | Somali |
sq | Albanian |
sr | Serbian |
st | Sesotho |
su | Sundanese |
sv | Swedish |
sw | Swahili |
ta | Tamil |
te | Telugu |
tg | Tajik |
th | Thai |
tk | Turkmen |
tl | Tagalog |
tr | Turkish |
tt | Tatar |
ug | Uighur |
uk | Ukrainian |
ur | Urdu |
uz | Uzbek |
vi | Vietnamese |
wo | Wolof |
xh | Xhosa |
yi | Yiddish |
yo | Yoruba |
zh | Chinese |
zu | Zulu |