Cohere's Rerank Model (Details and Application)

Rerank models sort text inputs by semantic relevance to a specified query. They are often used to sort search results returned from an existing search solution. Learn more about using Rerank in the best practices guide.
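
As a rough illustration of this sorting step, the sketch below wraps a rerank call in a small helper. It assumes the Cohere Python SDK's `rerank` method, which takes `model`, `query`, `documents`, and `top_n` and returns results carrying `index` and `relevance_score`. The helper `rerank_documents` and the `StubClient` class are hypothetical names introduced here. The stub only mimics the response shape with a naive word-overlap score so the example runs offline; it is not how a Rerank model scores documents.

```python
from types import SimpleNamespace
from typing import Any, List, Tuple


def rerank_documents(client: Any, query: str, documents: List[str],
                     model: str = "rerank-v3.5", top_n: int = 2) -> List[Tuple[str, float]]:
    """Return the top_n documents paired with their relevance scores, best first."""
    response = client.rerank(model=model, query=query, documents=documents, top_n=top_n)
    # Each result refers back to its document by position in the input list.
    return [(documents[r.index], r.relevance_score) for r in response.results]


class StubClient:
    """Hypothetical offline stand-in: same response shape, word-overlap scoring."""

    def rerank(self, model: str, query: str, documents: List[str], top_n: int):
        query_words = set(query.lower().split())
        scored = [
            SimpleNamespace(
                index=i,
                relevance_score=len(query_words & set(doc.lower().split()))
                / max(len(query_words), 1),
            )
            for i, doc in enumerate(documents)
        ]
        # Highest score first, then keep only the top_n results.
        scored.sort(key=lambda r: r.relevance_score, reverse=True)
        return SimpleNamespace(results=scored[:top_n])


docs = [
    "Ottawa is the capital of Canada",
    "Bananas are yellow",
    "The capital of France is Paris",
]
ranked = rerank_documents(StubClient(), "capital of France", docs)
```

With the SDK installed and an API key configured, a real client such as `cohere.ClientV2()` could presumably be passed in place of `StubClient()`. Treat that substitution as an assumption and check it against the SDK reference.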

Latest Model	Description	Modality	Endpoints
`rerank-v4.0-pro`	A multilingual model for re-ranking English and non-English documents and semi-structured data (JSON). It is better suited than its `fast` variant to complex use cases and to applications that need state-of-the-art quality.	Text	Rerank
`rerank-v4.0-fast`	A lighter version of `rerank-v4.0-pro`. This multilingual model re-ranks English and non-English documents and semi-structured data (JSON), and is better suited than its `pro` variant to low-latency, high-throughput use cases.	Text	Rerank
`rerank-v3.5`	A model for documents and semi-structured data (JSON). Performs well in English and non-English languages; supports the same languages as `embed-multilingual-v3.0`. This model has a context length of 4096 tokens.	Text	Rerank
`rerank-english-v3.0`	A model for re-ranking English-language documents and semi-structured data (JSON). This model has a context length of 4096 tokens.	Text	Rerank
`rerank-multilingual-v3.0`	A model for documents and semi-structured data (JSON) that are not in English. Supports the same languages as `embed-multilingual-v3.0`. This model has a context length of 4096 tokens.	Text	Rerank

For each document in a request, Rerank combines the tokens from the query with the tokens from the document, and the combined total counts toward the context limit for a single document. If that combined total exceeds the model’s context length for a single document, the document is automatically chunked and processed in multiple inferences. See our best practices guide for more information about formatting documents for the Rerank endpoint.
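
The budget check described above can be sketched as follows. This is only an approximation: `approx_token_count` is a hypothetical stand-in that counts whitespace-separated words, whereas real token counts depend on the model's tokenizer and will differ. The 4096 default is the context length listed for `rerank-v3.5` and the v3.0 models; the table does not state a figure for the v4.0 models. How the service actually splits an oversized document is not specified here, so the sketch only flags which documents would be affected.

```python
from typing import List

# Context length listed in the table for rerank-v3.5 and the v3.0 models.
CONTEXT_LENGTH = 4096


def approx_token_count(text: str) -> int:
    """Crude proxy for a tokenizer (assumption): one token per whitespace-separated word."""
    return len(text.split())


def documents_to_be_chunked(query: str, documents: List[str],
                            context_length: int = CONTEXT_LENGTH) -> List[int]:
    """Return indices of documents whose query + document total exceeds the limit."""
    query_tokens = approx_token_count(query)
    return [
        i
        for i, doc in enumerate(documents)
        # The query's tokens count against every document's budget.
        if query_tokens + approx_token_count(doc) > context_length
    ]


query = "what is the capital"          # 4 approximate tokens
documents = [
    "short doc",                        # 4 + 2 tokens, well within the limit
    "word " * 4092,                     # 4 + 4092 = 4096, exactly at the limit
    "word " * 4093,                     # 4 + 4093 = 4097, over the limit
]
flagged = documents_to_be_chunked(query, documents)
```

Because the query is counted once per document, a long query shrinks the room available to every document in the request.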