Toxicity Detection - Cohere Docs


User-generated content dominates the internet. It helps online platforms grow, but it is a burden for the content moderators who manage them, because humans cannot manually review everything users create. An automated solution is therefore needed, for example to flag toxic content.

Here we look at an example of classifying online user comments for toxicity, labeling each one as Toxic or Benign.

Set up

Install the SDK.

$ pip install cohere

Set up the Cohere client.

import cohere

api_key = "YOUR_API_KEY"  # placeholder: replace with your own Cohere API key
co = cohere.Client(api_key)
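
The snippet above expects an `api_key` variable to exist already. One way to supply it without hard-coding the key is to read it from an environment variable. The following is a minimal sketch, not part of the Cohere SDK: the helper `load_api_key` and the variable name `COHERE_API_KEY` are assumptions chosen for illustration.

```python
import os


def load_api_key(env=os.environ, var_name="COHERE_API_KEY"):
    """Return the API key stored under `var_name`, or raise if it is missing.

    `var_name` is a hypothetical name; use whatever your deployment defines.
    """
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key


# Usage (assumes the variable is set in your shell):
# api_key = load_api_key()
# co = cohere.Client(api_key)
```

Passing the environment as an argument keeps the helper easy to check without touching real credentials.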

Add examples

These are the training examples that show the model the classes we want it to predict. Each example contains the text and its corresponding label (class). At least five examples are required per class.

from cohere.responses.classify import Example

examples = [
  Example("you are hot trash", "Toxic"),  
  Example("go to hell", "Toxic"),
  Example("get rekt moron", "Toxic"),  
  Example("get a brain and use it", "Toxic"), 
  Example("say what you mean, you jerk.", "Toxic"), 
  Example("Are you really this stupid", "Toxic"), 
  Example("I will honestly kill you", "Toxic"),  
  Example("yo how are you", "Benign"),  
  Example("I'm curious, how did that happen", "Benign"),  
  Example("Try that again", "Benign"),  
  Example("Hello everyone, excited to be here", "Benign"), 
  Example("I think I saw it first", "Benign"),  
  Example("That is an interesting point", "Benign"), 
  Example("I love this", "Benign"), 
  Example("We should try that sometime", "Benign"), 
  Example("You should go for it", "Benign")
]
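
Since each class needs at least five examples, it can help to count labels before calling the endpoint. The sketch below works on plain `(text, label)` tuples rather than `Example` objects, so it makes no assumption about the SDK's attribute names. The helper names `count_labels` and `underrepresented` are hypothetical, and the sample list is a deliberately shortened subset of the examples above.

```python
from collections import Counter

MIN_EXAMPLES_PER_CLASS = 5  # the minimum stated in this tutorial


def count_labels(pairs):
    """Count how many (text, label) pairs carry each label."""
    return Counter(label for _text, label in pairs)


def underrepresented(pairs, minimum=MIN_EXAMPLES_PER_CLASS):
    """Return the labels that have fewer than `minimum` examples, sorted."""
    counts = count_labels(pairs)
    return sorted(label for label, n in counts.items() if n < minimum)


# A shortened subset of the tutorial's examples, for illustration only.
toxic = ["you are hot trash", "go to hell", "get rekt moron",
         "get a brain and use it", "Are you really this stupid"]
benign = ["yo how are you", "Try that again", "I love this"]
pairs = [(t, "Toxic") for t in toxic] + [(t, "Benign") for t in benign]

print(underrepresented(pairs))  # ['Benign'], because only three Benign examples are present
```

The full list in this tutorial has seven Toxic and nine Benign examples, so the same check would return an empty list for it.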

Add inputs

This is the list of texts you’d like to classify.

inputs = [
  "this game sucks, you suck",
  "stop being a dumbass",
  "Let's do this once and for all",
  "This is coming along nicely"
]

Get classifications

With the Classify endpoint, setting up the model is straightforward. The main step is choosing the model type. For our example, we’ll use the default, which is large. Putting everything together, the call looks like this:

response = co.classify(
    model='large',
    inputs=inputs,
    examples=examples)

# Print the classification returned for each input
print(response.classifications)
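
Once classifications are available, a moderation workflow typically keeps only the items predicted as Toxic with enough confidence. The sketch below shows one way to do that. It assumes each classification object exposes `input`, `prediction`, and `confidence` attributes, which may differ across SDK versions. The helper `flag_toxic`, the 0.5 threshold, and the stand-in objects with made-up confidence values are illustrative, not real model output.

```python
from types import SimpleNamespace


def flag_toxic(classifications, toxic_label="Toxic", threshold=0.5):
    """Return (input, confidence) for items predicted as toxic.

    Assumes each item has `input`, `prediction`, and `confidence`
    attributes; adjust the names to match your SDK version.
    """
    flagged = []
    for item in classifications:
        if item.prediction == toxic_label and item.confidence >= threshold:
            flagged.append((item.input, item.confidence))
    return flagged


# Stand-in objects with made-up values, used instead of a live API response.
mock = [
    SimpleNamespace(input="stop being a dumbass",
                    prediction="Toxic", confidence=0.9),
    SimpleNamespace(input="This is coming along nicely",
                    prediction="Benign", confidence=0.8),
]

for text, confidence in flag_toxic(mock):
    print(f"FLAGGED ({confidence:.2f}): {text}")
```

With a real response, the call would be `flag_toxic(response.classifications)`, provided the attribute names match. Raising the threshold trades fewer false positives for more missed toxic comments.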
