Semantic Search
In this tutorial, we’ll explore semantic search using Cohere’s Embed model on Azure AI Foundry.
Semantic search enables search systems to capture the meaning and context of search queries, going beyond simple keyword matching to find relevant results based on semantic similarity.
With the Embed model, you can do this across languages. This is particularly powerful for multilingual applications where the same meaning can be expressed in different languages.
In this tutorial, we’ll cover:
- Setting up the Cohere client
- Embedding text data
- Building a search index
- Performing semantic search queries
We’ll use Cohere’s Embed model deployed on Azure to demonstrate these capabilities and help you understand how to effectively implement semantic search in your applications.
Setup
First, you will need to deploy the Embed model on Azure via Azure AI Foundry. The deployment will create a serverless API with pay-as-you-go, token-based billing. You can find more information on how to deploy models in the Azure documentation.
In the example below, we are deploying the Embed Multilingual v3 model.
Once the model is deployed, you can access it via Cohere’s Python SDK. Let’s now install the Cohere SDK and set up our client.
To create a client, you need to provide the API key and the base URL of the Azure endpoint. You can get this information from the Azure AI Foundry platform where you deployed the model.
Download dataset
For this example, we’ll be using MultiFIN - an open-source dataset of financial article headlines in 15 different languages (including English, Turkish, Danish, Spanish, Polish, Greek, Finnish, Hebrew, Japanese, Hungarian, Norwegian, Russian, Italian, Icelandic, and Swedish).
We’ve prepared a CSV version of the MultiFIN dataset that includes an additional column containing English translations. While we won’t use these translations for the model itself, they’ll help us understand the results when we encounter headlines in Danish or Spanish. We’ll load this CSV file into a pandas dataframe.
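Loading the CSV is a one-liner with pandas. The filename and column names below (`text`, `lang`, `translation`) are hypothetical stand-ins for the prepared file; a tiny inline sample is used here so the sketch runs on its own.

```python
import io
import pandas as pd

# With the real file you would do something like:
#   df = pd.read_csv("multifin_with_translations.csv")  # hypothetical filename
# A tiny inline sample stands in here:
sample_csv = io.StringIO(
    "text,lang,translation\n"
    "Revision af årsregnskabet,Danish,Audit of the annual accounts\n"
    "Nuevas normas de auditoría,Spanish,New auditing standards\n"
    "AI takes over financial reporting,English,AI takes over financial reporting\n"
)
df = pd.read_csv(sample_csv)
print(df.shape)  # (3, 3)
```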
Pre-Process Dataset
For this example, we’ll work with a subset focusing on English, Spanish, and Danish content.
We’ll perform several pre-processing steps: removing any duplicate entries, filtering to keep only our three target languages, and selecting the 80 longest articles as our working dataset.
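The pre-processing steps above can be sketched as below, reusing the hypothetical column names from the loading step. The tutorial keeps the 80 longest headlines; the cutoff is parameterized (and kept small) here so the sketch runs on sample data.

```python
import pandas as pd

# Synthetic sample standing in for the MultiFIN dataframe.
df = pd.DataFrame({
    "text": ["Dup headline", "Dup headline", "Kort",
             "Una noticia sobre auditoría bastante larga",
             "A fairly long English headline about finance",
             "日本語の見出し"],
    "lang": ["English", "English", "Danish", "Spanish", "English", "Japanese"],
})

TOP_N = 3  # the tutorial uses 80; small here for the sample

df = df.drop_duplicates(subset="text")                          # remove duplicates
df = df[df["lang"].isin(["English", "Spanish", "Danish"])]      # keep target languages
df = df.loc[df["text"].str.len().nlargest(TOP_N).index]         # keep the longest headlines
df = df.reset_index(drop=True)
print(len(df))  # 3
```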
Embed and index documents
Let’s embed our documents and store the embeddings. These embeddings are high-dimensional vectors (1,024 dimensions) that capture the semantic meaning of each document. We’ll use Cohere’s `embed-multilingual-v3.0` model that we specified during client setup.
The v3.0 embedding models require us to specify an `input_type` parameter that indicates what we’re embedding. For semantic search, we use `search_document` for the documents we want to search through, and `search_query` for the search queries we’ll make later.
We’ll also keep track of information about each document’s language and translation to provide richer search results.
Finally, we’ll build a search index with the `hnsw` vector library to store these embeddings efficiently, enabling faster document searches.
Send Query and Retrieve Documents
Next, we build a function that takes a query as input, embeds it, and finds the three documents that are the most similar to the query.
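Such a function can be sketched as below. For a self-contained example it ranks documents with plain numpy cosine similarity rather than the hnswlib index, and stubs the query embedding with a random vector; the real `co.embed` call with `input_type="search_query"` appears only as a comment and is an assumption based on the Cohere SDK.

```python
import numpy as np

docs = ["Doc about AI", "Doc about audits", "Doc about tax", "Doc about crypto"]
doc_embeddings = np.random.rand(len(docs), 1024).astype("float32")

def retrieve(query_embedding, doc_embeddings, docs, k=3):
    """Return the k documents most similar to the query by cosine similarity."""
    q = query_embedding / np.linalg.norm(query_embedding)
    d = doc_embeddings / np.linalg.norm(doc_embeddings, axis=1, keepdims=True)
    scores = d @ q                       # cosine similarity per document
    top = np.argsort(-scores)[:k]        # indices of the k highest scores
    return [(docs[i], float(scores[i])) for i in top]

# With the real client the query embedding would come from something like:
#   q_emb = np.asarray(co.embed(texts=["data science"],
#                               input_type="search_query").embeddings[0])
# A random vector stands in here.
q_emb = np.random.rand(1024).astype("float32")
results = retrieve(q_emb, doc_embeddings, docs, k=3)
```

With the hnswlib index from the previous step, the ranking line would instead be a call such as `index.knn_query(q_emb, k=3)`.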
Let’s now try to query the index with a couple of examples, one each in English and Danish.
With the first example, notice how the retrieval system surfaces documents that are similar in meaning, for example returning documents related to AI for a query about data science. This is something that keyword-based search would not be able to capture.
The second example demonstrates the multilingual nature of the model: the same model works across different languages. It can also perform cross-lingual search, as in the first retrieved document, where “PP&E guide” is an English term that stands for “property, plant, and equipment.”
Summary
In this tutorial, we learned about:
- How to set up the Cohere client to use the Embed model deployed on Azure AI Foundry
- How to embed text data
- How to build a search index
- How to perform multilingual semantic search
In the next tutorial, we’ll explore how to use the Rerank model for reranking search results.