Frequently Asked Questions About Cohere
Here, we’ll walk through some common questions we get about how Cohere’s models work, what pricing options there are, and more!
Cohere Models
What is the difference between the Command R and Command R+ models?
Command R+ is best suited for workflows that lean on complex RAG functionality and multi-step tool use (agents). Command R, on the other hand, is great for simpler retrieval-augmented generation (RAG) and single-step tool use tasks, as well as applications where price is a major consideration. We offer a full model overview in our documentation.
What is the difference between Aya and Command R models?
Aya specializes in human-like multilingual text generation and conversation, making it ideal for content creation and chatbots. Command R excels at understanding and executing instructions, enabling interactive applications and data-driven tasks. This makes it more suitable for many enterprise use cases.
You can check out this link to learn more about Aya models, datasets and related research papers.
How do Cohere's models compare to other LLMs on the market?
Cohere’s Command models deliver strong performance across enterprise tasks such as summarization, multilingual use cases, and retrieval-augmented generation. We also offer the widest range of deployment options; you can check them here.
How can I use Cohere's models for tasks like translation, text embedding, summarization, and custom tool development?
You can access Cohere’s models through our platform (cohere.com) or through various cloud platforms including, but not limited to, SageMaker, Bedrock, Azure AI, and OCI Generative AI. We also offer private deployments. For use-case-specific features, please reference the latest API documentation and the Cookbooks, which provide starter code for various tasks to aid development.
What are some best practices, tips, and techniques for prompt engineering?
You can find our prompt engineering recommendations in the following resources:
How can I effectively use and fine-tune models for specific tasks, like data extraction, question answering, and generating content within certain constraints?
To fine-tune models for tasks like data extraction, question answering, or content generation, it’s important to start by defining your goals and ensuring your data captures the task accurately.
For generative models, fine-tuning involves training on input-output pairs, where the model learns to generate specific outputs based on given inputs. This is ideal for tasks like customizing responses or enforcing a particular writing style.
For tasks like data extraction, fine-tuning helps the model identify relevant patterns and structure data as needed. High-quality, task-specific data is essential for achieving accurate results.
For more details, you can refer to Cohere’s fine-tuning guide for best practices.
What are the best practices for preparing and structuring fine-tuning data, and what are the supported file formats?
You can find best practices for preparing and structuring fine-tuning data in these three modules: data preparation for chat fine-tuning, classify fine-tuning, and rerank fine-tuning. The primary supported file formats are JSONL and CSV.
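As a rough illustration of the JSONL format, each line holds one training conversation. The role names and fields below are illustrative assumptions; confirm the exact schema in the data-preparation modules before uploading.

```python
import json

# One training record per line: a conversation the model should learn from.
# Field and role names here are illustrative; check the data-preparation
# guides for the exact schema your fine-tuning task expects.
records = [
    {"messages": [
        {"role": "User", "content": "Summarize: our Q3 revenue grew 12%."},
        {"role": "Chatbot", "content": "Q3 revenue was up 12%."},
    ]},
]

with open("train.jsonl", "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")

# Sanity check: every line must parse back as valid JSON.
with open("train.jsonl") as f:
    parsed = [json.loads(line) for line in f]
```

A quick parse-back check like this catches malformed lines before you submit a fine-tuning job.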
What models are available for fine-tuning using the Cohere platform?
On the generative side, we support fine-tuning for Command R and Command R 08-2024. On the representation side, we support fine-tuning for the Classify and Rerank models. You can learn more in this section of our docs.
What specific models are being developed by Cohere and where can I find detailed information about them?
For the latest current offerings, you should reference our models page.
Which model should I choose for my specific use case?
This largely depends on your use case. In general, Cohere has both generative and representation models. The models page has more information on each of these, but use cases can often use a combination of models.
What are the capabilities of Cohere's models?
Cohere models offer a wide range of capabilities, from advanced generative tasks to semantic search and other representation use cases. All of our models are multilingual and can support use cases from RAG to Tool Use and much more.
Our Command model family is our flagship series of generative models. These models excel at taking a user instruction (or command) and generating text following the instruction. They also have conversational capabilities which means that they are well-suited for chatbots and virtual assistants.
For representation tasks, we offer two key models:
- Embed: Embed models generate embeddings from text, allowing for tasks like classification, clustering, and semantic search.
- Rerank: Rerank models improve the output of search and ranking systems by re-organizing results according to specific parameters, improving the relevance and accuracy of search results.
Our models perform best when used end-to-end in their intended workflows. For a detailed breakdown of each model, including their latest versions, check our models page.
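To make the Embed use case concrete, here is a minimal sketch of semantic search over precomputed embeddings. The toy three-dimensional vectors stand in for real Embed output, which has far more dimensions:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-dimensional "embeddings"; a real Embed model returns
# high-dimensional vectors for each text.
doc_embeddings = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
}
query_embedding = [0.8, 0.2, 0.1]

# Rank documents by similarity to the query embedding.
ranked = sorted(
    doc_embeddings,
    key=lambda d: cosine(query_embedding, doc_embeddings[d]),
    reverse=True,
)
```

In a production system you would embed the query and documents with the Embed endpoint and, optionally, pass the top candidates through Rerank to refine the ordering.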
What are the best practices and resources for building a search system for large PDF documents, and how can I optimize the retrieval process using language models and embeddings?
While this depends on the structure of the document itself, a good rule of thumb is to split the PDF into its pages and then split each page into chunks that fit within the model’s context length.
From there, associate each chunk with a page ID and a document ID, which gives you several levels of granularity for retrieval.
You can find further guides on chunking strategies and handling PDFs with mixed data.
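The page-and-chunk strategy above can be sketched in a few lines. The chunk size, overlap, and character-based budgeting are illustrative choices; a production system would budget in tokens rather than characters:

```python
def chunk_pages(pages, doc_id, chunk_size=500, overlap=50):
    """Split each page's text into overlapping character chunks, tagging
    each chunk with its document and page so retrieval results can point
    back to the exact source location."""
    chunks = []
    for page_num, text in enumerate(pages, start=1):
        start = 0
        while start < len(text):
            chunks.append({
                "doc_id": doc_id,
                "page": page_num,
                "text": text[start:start + chunk_size],
            })
            # Step forward by chunk_size minus overlap so adjacent
            # chunks share some context.
            start += chunk_size - overlap
    return chunks

# Two mock "pages" of extracted PDF text.
chunks = chunk_pages(["A" * 1200, "B" * 300], doc_id="report-2024")
```

Each chunk can then be embedded individually, while the doc and page IDs let you retrieve at chunk, page, or document granularity.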
How can I develop a multilingual chatbot that can understand and respond to user queries in different languages, incorporate external data, and perform tasks like text search, citation generation, and answer reranking?
Cohere’s models offer multilingual capabilities out of the box. You can reference our example notebooks such as this RAG one to get a better idea of how to piece these models together to build a question answering application.
What are the implications and limitations of using an unsupported language in Command-R, and are there plans to expand language support?
We are always looking to expand multilingual support to additional languages. Command R/R+ have been exposed to other languages during training, and we encourage you to try them on your use case. If you would like to provide feedback or suggestions on additional languages, please don’t hesitate to contact support@cohere.com.
Which languages are supported by Cohere models?
Cohere’s Command models are optimized to perform well in the following languages: English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Simplified Chinese, and Arabic.
Additionally, pre-training data has been included for the following 13 languages: Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, Persian.
You can find a full list of languages that are supported by Cohere’s multilingual embedding model here.
What kind of use case scenarios can Cohere models be useful in?
You can check the range of use cases based on our customer stories here.
Model Deployment
What Cohere models can I access via my cloud provider?
You can find the updated cloud support listed in our documentation. Check out links to our models on AWS Bedrock, AWS SageMaker, Azure AI, and OCI Generative AI.
What are the options and availability for on-premises deployment of Cohere's models?
We have the ability to deploy all of our models privately. To learn more, please reach out to the sales team using this form.
Can I get an enterprise license for on-premise deployment of Cohere models for commercial use, and are there any options for self-deployment?
Please reach out to the sales team to learn more.
What are the deployment options and considerations for Cohere models?
To learn more, please reach out to the sales team using this form.
Is the licensing for self-deployed models non-commercial or research-only?
The default license for our open weights is for non-commercial use. For information about licensing please reach out to the sales team using this form.
What are the requirements, resources, and methods needed to deploy Cohere models, especially when dealing with specific use cases, confidentiality, and resource constraints?
Please check our deployment options here and contact our sales team with this form to learn more.
Platform & API
How can I monitor and manage my API usage limits, and what steps can I take if I need higher limits or encounter issues with my current limits?
We offer two kinds of API keys: trial keys (with a variety of attendant limitations), and production keys (which have no such limitations). You can learn about them in this section of our documentation.
How can I access Cohere API for personal projects and prototyping?
We make a distinction between “trial” and “production” usage of an API key.
Trial API key usage is free, but limited. You can test different applications or build proofs of concept using all of Cohere’s models and APIs with a trial key by simply signing up for a Cohere account here.
What are the rate limits for different Cohere API endpoints and plan types, and are there any differences in response times?
Please refer to the API Keys and Rate Limits section of our documentation.
Is there a way to provide feedback, ask questions, or report issues directly to the Cohere team or community?
You can contact our support team at support@cohere.com and get help and share your feedback with our team and developer community via the Cohere Discord server.
Getting Started
How do I use the Cohere API?
The Cohere API can be accessed through the SDKs and the CLI tool. We support SDKs in four languages: Python, TypeScript, Java, and Go.
Visit the API docs for further details.
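As a minimal sketch of a Python SDK call, the helper below assembles a single-turn chat request. The payload shape and client class follow the v2 API, but treat the field names and model name as assumptions and verify them against the current API docs:

```python
import os

def build_chat_request(message, model="command-r-plus"):
    """Assemble a single-turn chat payload. The model name is an example;
    pick the model that fits your use case."""
    return {"model": model, "messages": [{"role": "user", "content": message}]}

request = build_chat_request("What is retrieval-augmented generation?")

# With the Python SDK installed (pip install cohere) and an API key set,
# the actual call would look roughly like:
#   import cohere
#   co = cohere.ClientV2(api_key=os.environ["CO_API_KEY"])
#   response = co.chat(**request)
```

Separating payload construction from the network call also makes the request logic easy to unit-test.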
Where can I access Cohere's Chatbot Playground or Dashboard?
Where can I find a comprehensive overview and resources about Cohere's products, use cases, and various offerings?
You can find the resources as follows:
- Model pages: Command, Embed, and Rerank.
- For business
- Cohere documentation
Where can I find resources to start learning and building on Cohere?
For learning, we recommend our LLM University hub resources, which have been prepared by Cohere experts. These include a number of very high-quality, step-by-step guides to help you start building quickly.
For building, we recommend checking out our GitHub notebooks, as well as the Get Started and Cookbooks sections in our documentation.
What are some best practices and techniques for prompt engineering, specifically when incorporating documents into a chat model's response?
For general recommendations on prompt engineering check the following resources:
- Prompt Engineering Basics Guide
- Tips on Crafting Effective Prompts
- Techniques of Advanced Prompt Engineering
Make sure to try our Prompt Tuner. It was developed to streamline the process of defining a robust prompt for your use case. Check Prompt Tuner documentation to learn how it works.
For the most reliable results when working with external document sources, we recommend using a technique called Retrieval-Augmented Generation (RAG). You can learn more about it here.
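To illustrate the retrieval half of a RAG pipeline, here is a toy retriever that ranks documents by word overlap with the query. This stands in for the embedding-based retrieval and Rerank steps a production system would use:

```python
def retrieve(query, documents, top_k=2):
    """Toy retriever: rank documents by word overlap with the query.
    A production RAG system would rank with Embed vectors and Rerank
    instead of raw word overlap."""
    query_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(query_words & set(d["text"].lower().split())),
        reverse=True,
    )
    return scored[:top_k]

docs = [
    {"id": "1", "text": "Refunds are processed within five business days."},
    {"id": "2", "text": "Our office is closed on public holidays."},
]
top = retrieve("how long do refunds take", docs, top_k=1)

# The retrieved chunks would then be passed to the Chat endpoint as
# grounding documents, so the model can answer with citations.
```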
Where can I find code examples and tutorials for using the Cohere API with various programming languages and frameworks?
You can find a list of comprehensive tutorials and code examples in our LLM University hub and the Cookbook guides.
What are some project ideas or suggestions for developers using Cohere models?
Check out our Cookbooks, which include step-by-step guides and project examples, and the Cohere Discord server for inspiration from our developer community.
How can I access LLM University?
LLMU can be accessed directly from the Cohere website. We periodically add more content and highly recommend you follow us on our socials to stay up to date.
Where can I find the documentation for Cohere's models and features?
You can find the documentation with the full Cohere model and feature overview here.
Troubleshooting Errors
Why am I unable to access and log in to the Cohere dashboard?
If you’re encountering difficulties logging into your Cohere dashboard, there could be a few reasons.
First, check our status page at status.cohere.com to see if any known issues or maintenance activities might impact your access.
If the status page doesn’t indicate any ongoing issues, the next step would be to reach out to our support teams. They’re always ready to assist and can be contacted at support@cohere.com. Our support team will be able to investigate further and provide you with the necessary guidance to resolve the login issue.
How can I resolve issues with logging in and authentication?
We understand that login and authentication issues can be frustrating. Here are some steps you can take to troubleshoot and resolve these problems:
- Check Your Credentials: Ensure you use the correct username and password. It’s easy to make a typo, so double-check your credentials before logging in again.
- Clear Cache and Cookies: Sometimes, issues with logging in can be caused by cached data or cookies on your device. Try clearing your browser’s cache and cookies, then attempt to log in again.
- Contact Support: If none of the above steps resolve the issue, it’s time to contact our support team. We are equipped to handle a wide range of login and authentication issues and can provide further assistance. You can contact us at support@cohere.com.
What troubleshooting steps would you suggest for an issue suddenly occurring in a previously functional system or script?
If you’re facing any technical challenges or need guidance, our support team is here to help. Contact us at support@cohere.com, and our technical support engineers will provide the necessary assistance and expertise to resolve your issues.
Billing, Pricing, Licensing, Account Management
How can I get in touch with Cohere's support team?
Please reach out to our support team at support@cohere.com. When contacting support, please keep the following in mind:
- Which model are you referring to?
- Copy and paste the exact error message. For reference, these are our error status codes:
  - 400 - invalid combination of parameters
  - 401 - invalid API token (not relevant on AWS)
  - 404 - model not found (not relevant on AWS)
  - 422 - request is malformed (e.g. unsupported enum value, unknown parameter)
  - 429 - rate limit reached (not relevant on AWS)
  - 499 - request was canceled by the user
- What request sequence length are you passing in?
- What generation max tokens are you requesting?
- Are requests of all input/output shapes failing, or only some?
- Share any relevant logs.
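As one way to handle these status codes in client code, here is a hedged sketch: retry only on 429 (rate limit reached) with exponential backoff, and fail fast on the 4xx errors that indicate the request itself needs fixing:

```python
import time

RETRYABLE = {429}            # rate limit reached: back off and retry
FATAL = {400, 401, 404, 422}  # fix the request or credentials instead

def call_with_retry(make_request, max_attempts=3, base_delay=1.0):
    """Call make_request() (which returns (status, body)) and retry only
    on retryable statuses, sleeping exponentially longer each attempt."""
    for attempt in range(max_attempts):
        status, body = make_request()
        if status == 200:
            return body
        if status in RETRYABLE and attempt < max_attempts - 1:
            time.sleep(base_delay * 2 ** attempt)
            continue
        raise RuntimeError(f"request failed with status {status}")

# Demo with a fake request that rate-limits once, then succeeds.
responses = iter([(429, None), (200, {"ok": True})])
result = call_with_retry(lambda: next(responses), base_delay=0.0)
```

The `make_request` callable is a stand-in for whatever HTTP or SDK call your application makes.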
Where can I find information about Cohere's pricing?
Please refer to our dedicated pricing page for the most up-to-date pricing.
How can I manage and understand the rate limits and usage of my API key?
Cohere offers two types of API keys: trial keys and production keys.
Trial Key Limitations
Trial keys are rate-limited depending on the endpoint you want to use. For example, the Embed endpoint is limited to 5 calls per minute, while the Chat endpoint is limited to 20 calls per minute. All other endpoints on trial keys are limited to 1,000 calls per month. If you want to use Cohere endpoints in a production application or require higher throughput, you can upgrade to a production key.
Production Key Specifications
Production keys for all endpoints are rate-limited at 1,000 calls per minute, with unlimited monthly use, and are intended for serving Cohere in public-facing applications as well as for testing. Usage of production keys is metered at the price points listed on the Cohere pricing page.
To get a production key, you’ll need to be the admin of your organization or ask your organization’s admin to create one. Visit the API Keys page on your Dashboard; the process should take less than three minutes and will generate a production key that you can use to serve Cohere APIs in production.
How can I monitor and manage my token usage and API calls for personal projects within the limitations of a free plan?
Cohere offers a convenient way to keep track of your usage and billing information. All our endpoints provide this data as metadata for each conversation, which is directly accessible via the API. This ensures you can easily monitor your usage. Our Dashboard provides an additional layer of control for standard accounts. You can set a monthly spending limit to manage your expenses effectively. To learn more about this feature and how to enable it, please visit the Billing & Usage section on the Dashboard, specifically the Spending Limit tab.
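To aggregate that per-response metadata across a batch of calls, something like the sketch below works. The `meta.billed_units` field names follow the API response shape, but confirm them against the current API reference:

```python
def total_billed_tokens(responses):
    """Sum billed input/output tokens across API responses (as dicts).
    Assumes usage metadata lives under meta.billed_units; verify the
    exact field names in the API documentation."""
    totals = {"input_tokens": 0, "output_tokens": 0}
    for r in responses:
        units = r.get("meta", {}).get("billed_units", {})
        totals["input_tokens"] += units.get("input_tokens", 0)
        totals["output_tokens"] += units.get("output_tokens", 0)
    return totals

# Two mock responses with usage metadata attached.
usage = total_billed_tokens([
    {"meta": {"billed_units": {"input_tokens": 120, "output_tokens": 40}}},
    {"meta": {"billed_units": {"input_tokens": 80, "output_tokens": 60}}},
])
```

Logging these totals per session makes it easy to see how close a personal project is to the free-plan limits.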
What is the process for making changes to my account, and who should I contact for specific requests?
If you need to make changes to your account or have specific requests, Cohere has a straightforward process. All the essential details about your account can be found under the Dashboard. This is a great starting point for any account-related queries.
However, if you have a request that requires further assistance or if the changes you wish to make are not covered by the Dashboard, our support team is here to help. Please feel free to reach out directly at support@cohere.com or ask your question in our Discord community.
How can I get in touch with Cohere support to discuss plan options and pricing?
Please reach out to our Sales team at sales@cohere.com
How is the cost of using Cohere's API calculated and what factors influence the number of billed tokens?
Cohere’s API pricing is based on a simple and transparent token-based model. The cost of using the API is calculated based on the number of tokens consumed during the API calls.
Check our pricing page for more information.
What are the rate limits for the free trial API, and how is the monthly limit calculated?
Trial keys are rate-limited depending on the endpoint you want to use, and the overall monthly limit is 1,000 calls per month.
Check our free trial documentation for more information.
Is it possible for a small startup or any commercial entity to use Cohere's technology for production or commercial purposes, and if so, what licenses or permissions are required?
Absolutely! Cohere’s platform empowers businesses, including startups, to leverage our technology for production and commercial purposes.
In terms of usage guidelines, we’ve compiled a comprehensive set of resources to ensure a smooth and compliant experience. You can access these guidelines here.
We’re excited to support your business and its unique needs. If you have any further questions or require additional assistance, please don’t hesitate to reach out to our team at sales@cohere.com or support@cohere.com for more details.
How can I manage my Cohere account, specifically regarding deletion, team invitations, and account merging?
You can access all the necessary tools and information through your account’s dashboard here.
If you’re unable to find the specific feature or information regarding merging accounts, our support team is always eager to help.
Simply start a new chat with them using the chat bubble on our website or reach out via email to support@cohere.com.
How does the token limit work for multiple documents in a single query?
The token limit for multiple documents in a single query can vary depending on the model or service you’re using. For instance, our Chat Model has a long-context window of 128k tokens. This means that as long as the combined length of your input and output tokens stays within this limit, the number of documents you include in your query shouldn’t be an issue.
It’s important to note that different models may have different token and document limits. To ensure you’re working within the appropriate parameters, we’ve provided detailed information about these limits for each model in this model overview section.
We understand that managing token limits can be a crucial aspect of your work, and we’re here to support you in navigating these considerations effectively. If you have any further questions or require additional assistance, please don’t hesitate to reach out to our team at support@cohere.com
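As a rough pre-flight check against the context window, you can estimate whether combined input and requested output fit before sending a request. The four-characters-per-token heuristic is a common rule of thumb for English text, not an exact count; use a real tokenizer when precision matters:

```python
def fits_context(documents, prompt, max_output_tokens, context_window=128_000):
    """Rough check that combined input + requested output stays within the
    context window. Uses a ~4 characters-per-token heuristic for English;
    swap in a real tokenizer for exact counts."""
    input_chars = len(prompt) + sum(len(d) for d in documents)
    estimated_input_tokens = input_chars // 4
    return estimated_input_tokens + max_output_tokens <= context_window

# Ten ~1,000-token documents plus a short prompt fit comfortably in 128k.
ok = fits_context(
    ["A" * 4000] * 10,
    prompt="Summarize these reports.",
    max_output_tokens=1000,
)
```

When the check fails, you can trim or re-chunk documents, or retrieve fewer of them, before issuing the call.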
What are the pricing plans and models available for Cohere's API endpoints, and are there any additional costs associated with specific features or workflows?
Please find the pricing information about our model and services here.
Should you have any further questions please feel free to reach out to our sales team at sales@cohere.com or support@cohere.com for more details.
Legal, Security, Data Privacy
Is my data private and secure when using the Cohere platform, or is it accessible to others?
When you use Cohere models via our platform, your data is kept separate through logical segmentation. When you use Cohere models via a private deployment or a cloud deployment from one of our partners, your data is not shared with Cohere.
Could you provide more specific information about your GDPR compliance practices and policies, including any relevant documentation, so that I can forward the details to our legal team for review?
We support our enterprise customers’ privacy and data security compliance needs by offering multiple deployment options so customers can control access to data and personal information under their control. Seamlessly complete your privacy and security compliance reviews by visiting Cohere’s Trust Center where you can request a copy of our SOC 2 Type II Report, and review our privacy documentation and other compliance resources.
How can we ensure we follow best practices to secure our system using Cohere, and how can we communicate that to our clients when they raise concerns about potential vulnerabilities associated with using AI?
When it comes to using AI models securely, two important areas stand out.
1. Model Security and Safety
This responsibility lies primarily with the model provider, and at Cohere, we are deeply committed to ensuring responsible AI development. Our team includes some of the top experts in AI security and safety. We lead through various initiatives, including governance and compliance frameworks, safety and security protocols, strict data controls for model training, and industry thought leadership.
2. Secure Application Development with Cohere Models:
While Cohere ensures the model’s security, customers are responsible for building and deploying applications using these models securely. A strong focus on a Secure Product Lifecycle is essential, and our models integrate seamlessly into this process. Core security principles remain as relevant in the AI space as elsewhere. For example, robust authentication protocols should exist for all users, services, and micro-services. Secrets, tokens, and credentials must be tightly controlled and regularly monitored.
Our recommendations:
- Implement responsible AI and governance policies in your AI development process, focusing on customer safety and security.
- Continuously monitor the performance of your applications and promptly address any issues that arise.
We also regularly share insights and best practices on AI security on our blog. Here are a few examples: 1, 2, 3.
What if I have more questions?
If there’s anything not covered in this document, you’re welcome to reach out to us via this form.