Semantic search - Cohere on Azure AI Foundry
In this tutorial, we'll explore semantic search using Cohere's Embed model on Azure AI Foundry.
Semantic search enables search systems to capture the meaning and context of search queries, going beyond simple keyword matching to find relevant results based on semantic similarity.
With the Embed model, you can do this across languages. This is particularly powerful for multilingual applications where the same meaning can be expressed in different languages.
In this tutorial, we'll cover:
- Setting up the Cohere client
- Embedding text data
- Building a search index
- Performing semantic search queries
We'll use Cohere's Embed model deployed on Azure to demonstrate these capabilities and help you understand how to effectively implement semantic search in your applications.
Setup
First, you will need to deploy the Embed model on Azure via Azure AI Foundry. The deployment will create a serverless API with pay-as-you-go, token-based billing. You can find more information on how to deploy models in the Azure documentation.
In the example below, we are deploying the Embed 4 model.
Once the model is deployed, you can access it via Cohere's Python SDK. Let's now install the Cohere SDK and set up our client.
To create a client, you need to provide the API key and the model's base URL for the Azure endpoint. You can get this information from the Azure AI Foundry platform where you deployed the model.
Download dataset
For this example, we'll be using MultiFIN - an open-source dataset of financial article headlines in 15 different languages: English, Turkish, Danish, Spanish, Polish, Greek, Finnish, Hebrew, Japanese, Hungarian, Norwegian, Russian, Italian, Icelandic, and Swedish.
We've prepared a CSV version of the MultiFIN dataset that includes an additional column containing English translations. While we won't use these translations for the model itself, they'll help us understand the results when we encounter headlines in Danish or Spanish. We'll load this CSV file into a pandas dataframe.
Pre-Process Dataset
For this example, we'll work with a subset focusing on English, Spanish, and Danish content.
We'll perform several pre-processing steps: removing any duplicate entries, filtering to keep only our three target languages, and selecting the 80 longest articles as our working dataset.
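As a sketch, these pre-processing steps might look like the following. The rows and column names below are made up for illustration; the real dataset has thousands of headlines, and we keep the 80 longest rather than 3.

```python
import pandas as pd

# Toy stand-in rows (made up for illustration); column names are assumptions.
df = pd.DataFrame({
    "text": [
        "Revenue grows despite headwinds",
        "Revenue grows despite headwinds",           # duplicate entry
        "Guía de PP&E para empresas en crecimiento",
        "CFO-dagen 2019",
        "Tilinpäätös 2020",                          # Finnish -- will be filtered out
    ],
    "lang": ["English", "English", "Spanish", "Danish", "Finnish"],
})

# 1. Remove duplicate entries
df = df.drop_duplicates(subset="text")

# 2. Keep only the three target languages
df = df[df["lang"].isin(["English", "Spanish", "Danish"])]

# 3. Keep the N longest headlines (80 in the tutorial; 3 here for the toy data)
df["length"] = df["text"].str.len()
df = df.sort_values("length", ascending=False).head(3).reset_index(drop=True)
print(df[["text", "lang"]])
```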
Embed and index documents
Let's embed our documents and store the embeddings. These embeddings are high-dimensional vectors (1,024 dimensions) that capture the semantic meaning of each document. We'll use Cohere's Embed 4 model that we have defined in the client setup.
The Embed 4 model requires us to specify an `input_type` parameter that indicates what we're embedding. For semantic search, we use `search_document` for the documents we want to search through, and `search_query` for the search queries we'll make later.
We'll also keep track of information about each document's language and translation to provide richer search results.
Finally, we'll build a search index with the hnswlib vector library to store these embeddings efficiently, enabling faster document searches.
Send Query and Retrieve Documents
Next, we build a function that takes a query as input, embeds it, and finds the three documents that are the most similar to the query.
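That function might be sketched like this; the model id and top-k default are assumptions, and `co`, `index`, and the `docs` list come from the earlier steps.

```python
import numpy as np

def retrieve(query, co, index, docs, model="embed-v-4-0", top_k=3):
    """Embed a query and return the top_k most similar documents."""
    resp = co.embed(
        texts=[query],
        model=model,
        input_type="search_query",  # queries use a different input_type
        embedding_types=["float"],
    )
    query_emb = np.asarray(resp.embeddings.float, dtype="float32")
    labels, distances = index.knn_query(query_emb, k=top_k)
    return [(docs[i], float(d)) for i, d in zip(labels[0], distances[0])]

# Example usage (requires the client and index from the previous steps):
# retrieve("How to stay competitive in data science", co, index, docs)
```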
Let's now try to query the index with a couple of examples, one each in English and Danish.
With the first example, notice how the retrieval system surfaces documents that are similar in meaning, for example returning documents related to AI for a query about data science. This is something that keyword-based search would not be able to capture.
As for the second example, this demonstrates the multilingual nature of the model: you can use the same model across different languages. The model can also perform cross-lingual search, as seen in the first retrieved document, where "PP&E guide" is an English term that stands for "property, plant, and equipment."
Summary
In this tutorial, we learned about:
- How to set up the Cohere client to use the Embed model deployed on Azure AI Foundry
- How to embed text data
- How to build a search index
- How to perform multilingual semantic search
In the next tutorial, we'll explore how to use the Rerank model for reranking search results.