Introducing Moderate (Beta)!
Use Moderate (Beta) to classify harmful text across the following categories:

- profanity
- hate speech
- violence
- self-harm
- sexual
- sexual (non-consensual)
- harassment
- spam
- information hazard (e.g., PII)

Moderate returns an array containing each category and its associated confidence score. Over the coming weeks, expect performance to improve as we optimize the underlying model.
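Since Moderate returns an array of categories with confidence scores, a typical consumer filters that array against a threshold. The sketch below assumes the response is a list of `{"category", "confidence"}` objects; the exact field names, threshold, and response shape are illustrative assumptions, not part of this announcement:

```python
# Hypothetical response shape: one entry per category with a confidence score.
# Field names are assumed for illustration; consult the API reference.
sample_response = [
    {"category": "profanity", "confidence": 0.91},
    {"category": "hate speech", "confidence": 0.04},
    {"category": "spam", "confidence": 0.72},
]

def flagged_categories(results, threshold=0.5):
    """Return the category names whose confidence meets the threshold."""
    return [r["category"] for r in results if r["confidence"] >= threshold]

print(flagged_categories(sample_response))  # → ['profanity', 'spam']
```

Choosing the threshold is application-specific: a lower value catches more harmful content at the cost of more false positives.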