Text Classification
Among the most popular use cases for language embeddings is ‘text classification,’ in which different pieces of text — blog posts, lyrics, poems, headlines, etc. — are grouped based on their similarity, their sentiment, or some other property.
Here, we’ll discuss how to perform simple text classification tasks with Cohere’s classify endpoint, and provide links to more information on how to fine-tune this endpoint for more specialized work.
Few-Shot Classification with Cohere’s classify Endpoint
Generally, training a text classifier requires a tremendous amount of data. But with large language models, it’s now possible to create so-called ‘few-shot’ classification models that perform well after seeing far fewer samples.
In the next few sections, we’ll create a sentiment analysis classifier to sort text into “positive,” “negative,” and “neutral” categories.
Setting up the SDK
First, let’s import the required tools and set up a Cohere client.
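A minimal sketch of this step, assuming the cohere Python package is installed (pip install cohere). The make_client helper and the COHERE_API_KEY environment variable name are choices made for this illustration, not SDK features; cohere.Client is the SDK's client class.

```python
import os


def make_client(api_key=None):
    """Return a Cohere client, reading the key from the environment if none is passed.

    make_client and the COHERE_API_KEY variable name are hypothetical
    conveniences for this sketch, not part of the SDK.
    """
    key = api_key or os.environ.get("COHERE_API_KEY")
    if not key:
        raise ValueError("Pass api_key or set the COHERE_API_KEY environment variable.")
    # Imported here so the missing-key check above runs even without the package.
    import cohere  # third-party: pip install cohere
    return cohere.Client(key)
```

With a key available, co = make_client() gives the client used in the rest of this walkthrough.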
Preparing the Data and Inputs
With the classify endpoint, you can create a text classifier with as few as two examples per class. Each example must contain the text itself and the corresponding label (i.e., its class). So, if you have two classes, you need a minimum of four examples; if you have three classes, you need a minimum of six; and so on.
Here are examples, created as ClassifyExample objects:
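One way to build such a list is sketched below. The texts and labels are made up for illustration, and count_per_label and to_classify_examples are hypothetical helpers. The first checks the two-examples-per-class minimum, and the second wraps each pair in the SDK's ClassifyExample.

```python
from collections import Counter

# Illustrative (text, label) pairs, two per class: the minimum the endpoint accepts.
RAW_EXAMPLES = [
    ("I love this product", "positive"),
    ("This made my day", "positive"),
    ("I am very disappointed", "negative"),
    ("This was a waste of money", "negative"),
    ("The package arrived on Tuesday", "neutral"),
    ("The store opens at nine", "neutral"),
]

MIN_PER_CLASS = 2


def count_per_label(pairs):
    """Count examples per label and raise if any class has fewer than two."""
    counts = Counter(label for _, label in pairs)
    short = [label for label, n in counts.items() if n < MIN_PER_CLASS]
    if short:
        raise ValueError(f"Need at least {MIN_PER_CLASS} examples for: {short}")
    return dict(counts)


def to_classify_examples(pairs):
    """Wrap each (text, label) pair in a ClassifyExample object."""
    from cohere import ClassifyExample  # third-party: pip install cohere
    return [ClassifyExample(text=text, label=label) for text, label in pairs]
```

Calling count_per_label(RAW_EXAMPLES) before to_classify_examples(RAW_EXAMPLES) catches an under-filled class locally, before any request is sent.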
Besides the examples, you’ll also need the ‘inputs,’ which are the strings of text you want the classifier to sort. Here are the ones we’ll be using:
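A plain Python list of strings is enough for this. The sentences below are placeholders written for this sketch.

```python
# Illustrative inputs: each string is one piece of text for the classifier to sort.
inputs = [
    "Thanks so much for your help",
    "The delivery was late and the box was damaged",
    "My order number is 12345",
]
```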
Generate Predictions
Setting up the model is quite straightforward with the classify endpoint. We’ll use Cohere’s embed-english-v3.0 model; here’s what that looks like:
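A sketch of the call, assuming co is a cohere.Client and that the inputs and examples have been prepared as described in the previous section. The classify_texts wrapper is a hypothetical helper; model, inputs, and examples are the keyword arguments passed to the SDK's classify method.

```python
MODEL = "embed-english-v3.0"


def classify_texts(co, inputs, examples, model=MODEL):
    """Send inputs and labeled examples to the classify endpoint.

    co is expected to be a cohere.Client (or any object with a compatible
    classify method); classify_texts itself is a hypothetical wrapper.
    """
    return co.classify(model=model, inputs=inputs, examples=examples)
```

With a real client, response = classify_texts(co, inputs, examples) returns the endpoint's response object.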
Here’s a sample output returned (note that this output has been truncated to make it easier to read; you’ll get much more in return if you run the code yourself):
Most of this is pretty easy to understand, but there are a few things worth drawing attention to.
Besides returning the predicted class in the prediction field, the endpoint also returns the confidence value of the prediction, which varies between 0 (unconfident) and 1 (completely confident).
Also, these confidence values are split among the classes; since we’re using three, the confidence values for the “positive,” “negative,” and “neutral” classes must add up to a total of 1.
Under the hood, the classifier selects the class with the highest confidence value as the “predicted class.” A high confidence value for the predicted class therefore indicates that the model is very confident of its prediction, and vice versa.
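The selection rule can be illustrated in plain Python. The numbers below are made up for illustration, not real endpoint output, and pick_prediction is a hypothetical helper that mimics the behavior described above rather than the endpoint's actual implementation.

```python
import math


def pick_prediction(confidences):
    """Return the (label, confidence) pair with the highest confidence.

    Expects per-class confidences that sum to 1.
    """
    total = sum(confidences.values())
    if not math.isclose(total, 1.0, abs_tol=1e-6):
        raise ValueError(f"Confidences should sum to 1, got {total}")
    label = max(confidences, key=confidences.get)
    return label, confidences[label]


# Made-up confidence values for one input, split across the three classes.
sample = {"positive": 0.90, "negative": 0.04, "neutral": 0.06}
label, confidence = pick_prediction(sample)
print(label, confidence)
```

Because the three values share a total of 1, a high confidence for the predicted class necessarily leaves little for the other two.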
What If I Need to Fine-Tune the classify Endpoint?
Cohere has dedicated documentation on fine-tuning the classify endpoint for bespoke tasks. You can also read this blog post, which works out a detailed example.