Rerank Model
Rerank models sort text inputs by semantic relevance to a specified query. They are often used to sort search results returned from an existing search solution. Learn more about using Rerank in the best practices guide.
For each document included in a request, Rerank combines the tokens from the query with the tokens from the document and the combined total counts toward the context limit for a single document. If the combined number of tokens from the query and a given document exceeds the model’s context length for a single document, the document will automatically get chunked and processed in multiple inferences. See our best practice guide for more info about formatting documents for the Rerank endpoint.