New Look For Docs!
We've updated our docs to better suit our new developer journey! You'll have a sleeker, more streamlined documentation experience.
New experimental Logit Bias parameter
Our Generative models now have the option to use the new logit_bias parameter to prevent the model from generating unwanted tokens, or to incentivize it to include desired tokens. Logit bias is supported in all of our default Generative models.
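For example, here's a minimal sketch using the Python SDK, assuming logit_bias is passed as a map from token IDs to bias values (the token IDs below are placeholders; look up real IDs with the tokenizer for your model):

```python
import cohere

co = cohere.Client("YOUR_API_KEY")

# Placeholder token IDs: negative values discourage a token,
# positive values encourage it.
response = co.generate(
    prompt="Write a tagline for an espresso machine.",
    max_tokens=30,
    logit_bias={12345: -10, 678: 5},  # hypothetical token IDs
)
print(response.generations[0].text)
```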
Co.classify powered by our Representational model embeddings
The Co.classify endpoint now serves few-shot classification tasks using embeddings from our Representational models for the small, medium, and large defaults.
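A minimal few-shot sketch using the Python SDK (the Example helper and import path reflect the SDK at the time and may differ in newer versions):

```python
import cohere
from cohere.classify import Example

co = cohere.Client("YOUR_API_KEY")

# A few labeled examples are enough; the endpoint embeds both the
# examples and the inputs with the Representational model.
response = co.classify(
    model="medium",
    inputs=["The delivery was fast and the packaging was great"],
    examples=[
        Example("I love this product", "positive"),
        Example("It broke after two days", "negative"),
        Example("Best purchase I've made all year", "positive"),
        Example("Absolutely terrible customer service", "negative"),
    ],
)
print(response.classifications[0].prediction)
```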
Pricing Update and New Dashboard UI
- Free, rate-limited Trial Keys for experimentation, testing, and playground usage
- Production keys with no rate limit for serving Cohere in production applications
- Flat rate pricing for Generate and Embed endpoints
- Reduced pricing for Classify endpoint
- New UI for the dashboard, including sign-up and onboarding (everything except the playground)
- New use-case-specific Quickstart Guides for learning the Cohere API
- Replacing "Finetune" nomenclature with "Custom Model"
- Inviting team members is now more intuitive. Teams enable users to share custom models with each other
- Generative custom models now show accuracy and loss metrics alongside logs
- Embed and Classify custom models now show logs alongside accuracy, loss, precision, recall, and F1
- Custom model details now show the count of each label in the dataset
Introducing Moderate (Beta)!
Use Moderate (Beta) to classify harmful text across the following categories: profane, hate speech, violence, self-harm, sexual, sexual (non-consensual), harassment, spam, and information hazard (e.g., PII). Moderate returns an array containing each category and its associated confidence score. Over the coming weeks, expect performance to improve significantly as we optimize the underlying model.
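Since Moderate is still in beta, the exact request shape may change; here's a hypothetical sketch over raw HTTP (the endpoint path and field names are assumptions, not confirmed API details):

```python
import requests

# Hypothetical request; the path and JSON fields are assumptions.
resp = requests.post(
    "https://api.cohere.ai/moderate",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"inputs": ["text to screen for harmful content"]},
)

# Per the release note, each category comes back with a confidence score.
for result in resp.json()["results"]:
    for category in result["categories"]:
        print(category["name"], category["confidence"])
```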
Model parameter now optional
Our APIs no longer require a model to be specified. Each endpoint comes with great defaults. For more control, a model can still be specified by adding the model parameter to the request.
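For instance, with the Python SDK both of these calls work; a quick sketch:

```python
import cohere

co = cohere.Client("YOUR_API_KEY")

# No model specified: the endpoint falls back to its default model.
default = co.generate(prompt="Once upon a time", max_tokens=20)

# Explicit model for more control.
explicit = co.generate(model="xlarge", prompt="Once upon a time", max_tokens=20)
```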
Updated Small, Medium, and Large Generation Models
Updated small, medium, and large models are more stable and resilient against abnormal inputs thanks to an FP16 quantization fix. We also fixed a bug in the generation presence and frequency penalties, which will result in more effective penalties.
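A sketch of the penalty parameters on the generate endpoint (the values here are illustrative):

```python
import cohere

co = cohere.Client("YOUR_API_KEY")

# frequency_penalty scales with how often a token has already appeared;
# presence_penalty applies a flat penalty once a token has appeared at all.
response = co.generate(
    model="large",
    prompt="Brainstorm names for a hiking blog:",
    max_tokens=100,
    frequency_penalty=0.4,
    presence_penalty=0.2,
)
print(response.generations[0].text)
```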
New Extremely Large Model!
Our new and improved xlarge has better generation quality and 4x faster prediction speed. This model now supports a maximum token length of 2048 tokens, as well as frequency and presence penalties.
New & Improved Generation and Representation Models
We've retrained our small, medium, and large generation and representation models. Updated representation models now support contexts of up to 4096 tokens (previously 1024 tokens). We recommend keeping text lengths below 512 tokens for optimal performance; any text longer than 512 tokens is split into chunks, and the resulting embeddings of each chunk are averaged and returned.
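If you want control over where long inputs are split, the chunk-and-average behavior can be approximated client side; a sketch using word counts as a rough stand-in for token counts:

```python
import numpy as np
import cohere

co = cohere.Client("YOUR_API_KEY")

def embed_long_text(text: str, max_words: int = 350) -> np.ndarray:
    # ~350 words keeps each chunk comfortably under the recommended
    # 512-token limit for most English text.
    words = text.split()
    chunks = [" ".join(words[i:i + max_words])
              for i in range(0, len(words), max_words)]
    response = co.embed(texts=chunks)
    # Average the per-chunk embeddings, mirroring what the API does
    # for inputs longer than 512 tokens.
    return np.asarray(response.embeddings).mean(axis=0)
```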
Finetuning Available + Policy Updates
Finetuning is Generally Available