Frequently Asked Questions About Cohere

Here, we’ll walk through some common questions we get about how Cohere’s models work, what pricing options there are, and more!

Cohere Models

Command R+ is most suitable for workflows that lean on complex retrieval augmented generation (RAG) functionality and multi-step tool use (agents). Command R, on the other hand, is great for simpler RAG and single-step tool use tasks, as well as applications where price is a major consideration. We offer a full model overview in our documentation.

Aya specializes in human-like multilingual text generation and conversations, ideal for content creation and chatbots. Command R excels at understanding and executing instructions, enabling interactive applications and data-driven tasks. This makes it more suitable for many enterprise use cases.

You can check out this link to learn more about Aya models, datasets and related research papers.

Cohere’s Command models have strong performance across enterprise tasks such as summarization, multilingual use cases, and retrieval augmented generation. We also have the widest range of deployment options, which you can check here.

You can access Cohere’s models through our platform (cohere.com) or through various cloud platforms including, but not limited to, SageMaker, Bedrock, Azure AI, and OCI Generative AI. We also have private deployments. For use-case-specific features, please reference the latest API documentation to learn more about the API features, and the Cookbooks for starter code that covers various tasks.

You can find our prompt engineering recommendations in the following resources:

To fine-tune models for tasks like data extraction, question answering, or content generation, it’s important to start by defining your goals and ensuring your data captures the task accurately.

For generative models, fine-tuning involves training on input-output pairs, where the model learns to generate specific outputs based on given inputs. This is ideal for tasks like customizing responses or enforcing a particular writing style.

For tasks like data extraction, fine-tuning helps the model identify relevant patterns and structure data as needed. High-quality, task-specific data is essential for achieving accurate results.

For more details, you can refer to Cohere’s fine-tuning guide for best practices.

Fine-tuning is a powerful capability, but it takes some effort to get right. You should first understand what you are trying to achieve and then determine whether the data you are planning to train on effectively captures that task. The generative models specifically learn from input/output pairs and therefore need to see examples of the expected input for your task and the ideal output. For more information, see our fine-tuning guide.

You can find the best practices for preparing and structuring fine-tuning data across these three modules: data preparation for chat fine-tuning, classify fine-tuning, and rerank fine-tuning. The primary file formats supported are JSONL and CSV.
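As a rough illustration of the input/output pairs described above, the sketch below serializes a few pairs as JSON Lines (one JSON object per line). The field names `input` and `output` and the helper `to_jsonl` are placeholders chosen for this example, not Cohere's actual schema; the data preparation modules define the exact structure each fine-tuning type expects.

```python
import json


def to_jsonl(pairs):
    """Serialize (input, output) pairs as JSON Lines: one JSON object per line."""
    lines = []
    for prompt, completion in pairs:
        # Reject empty examples early; they teach the model nothing useful.
        if not prompt.strip() or not completion.strip():
            raise ValueError("each example needs a non-empty input and output")
        # "input" and "output" are placeholder field names (an assumption),
        # not the schema required by any particular fine-tuning endpoint.
        record = {"input": prompt, "output": completion}
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines) + "\n"


examples = [
    ("Summarize: The meeting moved to Friday.", "Meeting rescheduled to Friday."),
    ("Summarize: Sales rose 4% in Q2.", "Q2 sales up 4%."),
]
jsonl_text = to_jsonl(examples)
```

Writing `jsonl_text` to a `.jsonl` file gives one training example per line, which makes it easy to inspect, shuffle, or split the data before uploading.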

On the generative side we support fine-tuning for Command R and Command R 082024. On the representation side, we support fine-tuning for Classify and Rerank models. You can learn more about it in this section of our docs.

For the latest offerings, you should reference our models page.

This largely depends on your use case. In general, Cohere has both generative and representation models. The models page has more information on each of these, but use cases can often use a combination of models.

Cohere models offer a wide range of capabilities, from advanced generative tasks to semantic search and other representation use cases. All of our models are multilingual and can support use cases from RAG to Tool Use and much more.

Our Command model family is our flagship series of generative models. These models excel at taking a user instruction (or command) and generating text following the instruction. They also have conversational capabilities which means that they are well-suited for chatbots and virtual assistants.

For representation tasks, we offer two key models:

  • Embed: Embed models generate embeddings from text, allowing for tasks like classification, clustering, and semantic search.
  • Rerank: Rerank models improve the output of search and ranking systems by re-organizing results according to specific parameters, improving the relevance and accuracy of search results.

Our models perform best when used end-to-end in their intended workflows. For a detailed breakdown of each model, including their latest versions, check our models page.

While this depends on the document structure itself, the best rule of thumb would be to split the PDF into its pages and then split each page into chunks that fit our context length.

From there, you should associate each chunk with a page and a document ID, which gives you several levels of granularity for retrieval.
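The page-then-chunk approach can be sketched as follows. This is a minimal example under stated assumptions: the page text has already been extracted from the PDF by some other tool, and word count stands in for token count, which is only an approximation. `Chunk`, `chunk_pages`, and `max_words` are names invented for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str  # identifier of the source document
    page: int    # 1-based page number within the document
    index: int   # position of the chunk within its page
    text: str


def chunk_pages(doc_id, pages, max_words=200):
    """Split each page into chunks of at most max_words words.

    Words are a rough proxy for tokens (an assumption of this sketch);
    pick max_words conservatively so chunks fit the model's context length.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    chunks = []
    for page_number, page_text in enumerate(pages, start=1):
        words = page_text.split()
        for index, start in enumerate(range(0, len(words), max_words)):
            text = " ".join(words[start:start + max_words])
            chunks.append(Chunk(doc_id, page_number, index, text))
    return chunks


pages = ["alpha beta gamma delta epsilon", "zeta eta"]
chunks = chunk_pages("report-2024", pages, max_words=2)
```

Because every chunk carries both `doc_id` and `page`, retrieved results can be grouped or filtered at the chunk, page, or document level.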

You can find further guides on chunking strategies and handling PDFs with mixed data.

Cohere’s models offer multilingual capabilities out of the box. You can reference our example notebooks such as this RAG one to get a better idea of how to piece these models together to build a question answering application.

We are always looking to expand multilingual support to other languages. Command R/R+ have been exposed to other languages during training, and we encourage you to try them on your use case. If you would like to provide feedback or suggestions on additional languages, please don’t hesitate to contact support@cohere.com.

Cohere’s Command models are optimized to perform well in the following languages: English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Simplified Chinese, and Arabic.

Additionally, pre-training data has been included for the following 13 languages: Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, Persian.

You can find a full list of languages that are supported by Cohere’s multilingual embedding model here.

You can check the range of use cases based on our customer stories here.

Model Deployment

You can find the updated cloud support listed in our documentation. Check out links to our models on AWS Bedrock, AWS SageMaker, Azure AI, and OCI Generative AI.

We have the ability to deploy all of our models privately. To learn more, please reach out to the sales team using this form.

Please reach out to the sales team to learn more.

To learn more, please reach out to the sales team using this form.

The default license for our open weights is for non-commercial use. For information about licensing please reach out to the sales team using this form.

Please check our deployment options here and contact our sales team with this form to learn more.

Platform & API

We offer two kinds of API keys: trial keys (with a variety of attendant limitations), and production keys (which have no such limitations). You can learn about them in this section of our documentation.

We make a distinction between “trial” and “production” usage of an API key.

Trial API key usage is free, but limited. You can test different applications or build proofs of concept using all of Cohere’s models and APIs with a trial key by simply signing up for a Cohere account here.

Please refer to the API Keys and Rate Limits section of our documentation.

You can contact our support team at support@cohere.com and get help and share your feedback with our team and developer community via the Cohere Discord server.

Getting Started

The Cohere API can be accessed through the SDK and the CLI tool. We support SDKs in four languages: Python, TypeScript, Java, and Go.

Visit the API docs for further details.

Here are the relevant links:

You can find the resources as follows:

For learning, we recommend our LLM University hub resources, which have been prepared by Cohere experts. These include a number of very high-quality, step-by-step guides to help you start building quickly.

For building, we recommend checking out our GitHub Notebooks, as well as the Get Started and Cookbooks sections in our documentation.

For general recommendations on prompt engineering, check the following resources:

Make sure to try our Prompt Tuner. It was developed to streamline the process of defining a robust prompt for your use case. Check the Prompt Tuner documentation to learn how it works.

For the most reliable results when working with external document sources, we recommend using a technique called Retrieval-Augmented Generation (RAG). You can learn about it here:

You can find a list of comprehensive tutorials and code examples in our LLM University hub and the Cookbook guides.

Check out our Cookbooks, which include step-by-step guides and project examples, and the Cohere Discord server for inspiration from our developer community.

LLM University (LLMU) can be accessed directly from the Cohere website. We periodically add more content, and we highly recommend following us on our socials to stay up to date.

You can find the documentation with the full Cohere model and feature overview here.

Troubleshooting Errors

If you’re encountering difficulties logging into your Cohere dashboard, there could be a few reasons.

First, check our status page at status.cohere.com to see if any known issues or maintenance activities might impact your access.

If the status page doesn’t indicate any ongoing issues, the next step is to reach out to our support team at support@cohere.com. They will be able to investigate further and provide you with the necessary guidance to resolve the login issue.

We understand that login and authentication issues can be frustrating. Here are some steps you can take to troubleshoot and resolve these problems:

  • Check Your Credentials: Ensure you use the correct username and password. It’s easy to make a typo, so double-check your credentials before logging in again.
  • Clear Cache and Cookies: Sometimes, issues with logging in can be caused by cached data or cookies on your device. Try clearing your browser’s cache and cookies, then attempt to log in again.
  • Contact Support: If none of the above steps resolve the issue, it’s time to contact our support team. We are equipped to handle a wide range of login and authentication issues and can provide further assistance. You can contact us at support@cohere.com.

If you’re facing any technical challenges or need guidance, our support team is here to help. Contact us at support@cohere.com, and our technical support engineers will provide the necessary assistance and expertise to resolve your issues.

Billing, Pricing, Licensing, Account Management

Please reach out to our support team at support@cohere.com. When reaching out to the support team, please keep the following questions in mind:

  • What model are you referring to?
  • Copy and paste the error message
    • Please note that this is our error message information:
      • 400 - invalid combination of parameters
      • 422 - request is malformed (e.g., unsupported enum value, unknown param)
      • 499 - request is canceled by the user
      • 401 - invalid api token (not relevant on AWS)
      • 404 - model not found (not relevant on AWS)
      • 429 - rate limit reached (not relevant on AWS)
  • What is the request seq length you are passing in?
  • What are the generation max tokens you are requesting?
  • Are all the requests of various input/output shapes failing?
  • Share any logs
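To make a support request easier to triage, the checklist above can be turned into a small helper. The status codes and their meanings are taken from the list above; the functions `describe_error` and `build_support_report` and the report layout are hypothetical, written only for this sketch.

```python
# Status codes and meanings as listed in this FAQ.
# Per the list above, 401, 404, and 429 are not relevant on AWS.
ERROR_MEANINGS = {
    400: "invalid combination of parameters",
    401: "invalid api token",
    404: "model not found",
    422: "request is malformed",
    429: "rate limit reached",
    499: "request is canceled by the user",
}


def describe_error(status_code):
    """Return a one-line description of a status code from the FAQ list."""
    meaning = ERROR_MEANINGS.get(status_code)
    if meaning is None:
        return f"{status_code}: not listed in this FAQ"
    return f"{status_code}: {meaning}"


def build_support_report(model, status_code, error_message, input_tokens, max_tokens):
    """Collect the details the support team asks for into plain text."""
    return "\n".join([
        f"Model: {model}",
        f"Status: {describe_error(status_code)}",
        f"Error message: {error_message}",
        f"Request sequence length (tokens): {input_tokens}",
        f"Requested max generation tokens: {max_tokens}",
    ])


report = build_support_report(
    "<model name>", 429, "<error text copied from the response>", 1200, 300
)
```

The placeholders in angle brackets are meant to be replaced with the real model name and the error text exactly as returned; relevant logs can be attached alongside the report.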

Please refer to our dedicated pricing page for most up-to-date pricing.

Cohere offers two types of API keys: trial keys and production keys.

Trial Key Limitations

Trial keys are rate-limited depending on the endpoint you want to use. For example, the Embed endpoint is limited to 5 calls per minute, while the Chat endpoint is limited to 20 calls per minute. All other endpoints on trial keys are limited to 1,000 calls per month. If you want to use Cohere endpoints in a production application or require higher throughput, you can upgrade to a production key.
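One way to stay under a per-minute limit is to throttle calls on the client side. The sketch below is a generic sliding-window throttle, not part of any Cohere SDK; the class name `CallThrottle` and its methods are invented for this example, and the limit of 20 calls per minute is the trial Chat figure quoted above.

```python
import time
from collections import deque


class CallThrottle:
    """Track recent calls and report how long to wait before the next one."""

    def __init__(self, max_calls, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period      # window length in seconds
        self.clock = clock        # injectable so the throttle can be tested
        self._calls = deque()     # timestamps of calls inside the window

    def wait_time(self):
        """Seconds to wait before another call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Forget calls that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            return 0.0
        return self.period - (now - self._calls[0])

    def record(self):
        """Register that a call was just made."""
        self._calls.append(self.clock())


# Trial Chat limit quoted above: 20 calls per minute.
chat_throttle = CallThrottle(max_calls=20)
```

A caller would sleep for `chat_throttle.wait_time()` seconds, make the request, and then call `chat_throttle.record()`. Server-side limits remain authoritative, so a 429 response should still be handled.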

Production Key Specifications

Production keys for all endpoints are rate-limited at 1,000 calls per minute, with unlimited monthly use. They are intended for serving Cohere in public-facing applications and for testing purposes. Usage of production keys is metered at price points which can be found on the Cohere pricing page.

To get a production key, you’ll need to be the admin of your organization or ask your organization’s admin to create one. Please visit your API Keys > Dashboard, where the process should take less than three minutes and will generate a production key that you can use to serve Cohere APIs in production.

Cohere offers a convenient way to keep track of your usage and billing information. All our endpoints provide this data as metadata for each conversation, which is directly accessible via the API. This ensures you can easily monitor your usage. Our Dashboard provides an additional layer of control for standard accounts. You can set a monthly spending limit to manage your expenses effectively. To learn more about this feature and how to enable it, please visit the Billing & Usage section on the Dashboard, specifically the Spending Limit tab.

If you need to make changes to your account or have specific requests, Cohere has a straightforward process. All the essential details about your account can be found under the Dashboard. This is a great starting point for any account-related queries.

However, if you have a request that requires further assistance or if the changes you wish to make are not covered by the Dashboard, our support team is here to help. Please feel free to reach out directly at support@cohere.com or ask your question in our Discord community.

Please reach out to our sales team at sales@cohere.com.

Cohere’s API pricing is based on a simple and transparent token-based model. The cost of using the API is calculated based on the number of tokens consumed during the API calls.
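As a back-of-the-envelope illustration of token-based pricing, the sketch below multiplies token counts by per-token rates. It assumes input and output tokens are priced separately and quoted per million tokens; both the function `estimate_cost` and the rates used in the example are hypothetical, so take real figures from the pricing page.

```python
def estimate_cost(input_tokens, output_tokens,
                  input_price_per_million, output_price_per_million):
    """Estimate the cost of a workload from token counts.

    Prices are expressed per one million tokens (an assumption of this
    sketch); the result is in the same currency as the prices.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * input_price_per_million
             + output_tokens * output_price_per_million)
    return total / 1_000_000


# Hypothetical rates, for illustration only: 1.00 per million input tokens
# and 4.00 per million output tokens.
monthly_cost = estimate_cost(2_000_000, 500_000, 1.0, 4.0)
```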

Check our pricing page for more information.

Trial keys are rate-limited depending on the endpoint you want to use, with a monthly limit of 1,000 calls.

Check our free trial documentation for more information.

Absolutely! Cohere’s platform empowers businesses, including startups, to leverage our technology for production and commercial purposes.

In terms of usage guidelines, we’ve compiled a comprehensive set of resources to ensure a smooth and compliant experience. You can access these guidelines here.

We’re excited to support your business and its unique needs. If you have any further questions or require additional assistance, please don’t hesitate to reach out to our team at sales@cohere.com or support@cohere.com for more details.

You can access all the necessary tools and information through your account’s dashboard here.

If you’re unable to find the specific feature or information regarding merging accounts, our support team is always eager to help.

Simply start a new chat with them using the chat bubble on our website or reach out via email to support@cohere.com.

The token limit for multiple documents in a single query can vary depending on the model or service you’re using. For instance, our Chat Model has a long-context window of 128k tokens. This means that as long as the combined length of your input and output tokens stays within this limit, the number of documents you include in your query shouldn’t be an issue.

It’s important to note that different models may have different token and document limits. To ensure you’re working within the appropriate parameters, we’ve provided detailed information about these limits for each model in this model overview section.
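A quick budget check along these lines can be sketched as follows. It assumes you already have token counts for each document and for the prompt (obtained from whatever tokenizer matches your model), and it treats "128k" as 128,000 tokens, which is an assumption; the function name `fits_context` is invented for this example.

```python
CONTEXT_WINDOW = 128_000  # "128k" read as 128,000 tokens (an assumption)


def fits_context(document_token_counts, prompt_tokens, max_output_tokens,
                 window=CONTEXT_WINDOW):
    """Return True if documents, prompt, and requested output fit the window."""
    total = sum(document_token_counts) + prompt_tokens + max_output_tokens
    return total <= window


# Two documents plus a short prompt and a 4,000-token output budget.
ok = fits_context([50_000, 60_000], prompt_tokens=1_000, max_output_tokens=4_000)

# A larger set of documents that exceeds the window.
too_big = fits_context([100_000, 30_000], prompt_tokens=1_000, max_output_tokens=4_000)
```

Passing a different `window` value adapts the check to models with other limits.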

We understand that managing token limits can be a crucial aspect of your work, and we’re here to support you in navigating these considerations effectively. If you have any further questions or require additional assistance, please don’t hesitate to reach out to our team at support@cohere.com.

Please find the pricing information about our models and services here.

Should you have any further questions, please feel free to reach out to our sales team at sales@cohere.com or support@cohere.com for more details.

When you’re using Cohere models via our Platform, we segment your data using logical segmentation. When using Cohere models via a private or cloud deployment from one of our partners, your data is not shared with Cohere.

When it comes to using AI models securely, two important areas stand out.

1. Model Security and Safety

This responsibility lies primarily with the model provider, and at Cohere, we are deeply committed to ensuring responsible AI development. Our team includes some of the top experts in AI security and safety. We lead through various initiatives, including governance and compliance frameworks, safety and security protocols, strict data controls for model training, and industry thought leadership.

2. Secure Application Development with Cohere Models

While Cohere ensures the model’s security, customers are responsible for building and deploying applications using these models securely. A strong focus on a Secure Product Lifecycle is essential, and our models integrate seamlessly into this process. Core security principles remain as relevant in the AI space as elsewhere. For example, robust authentication protocols should exist for all users, services, and micro-services. Secrets, tokens, and credentials must be tightly controlled and regularly monitored.

Our recommendations:

  • Implement responsible AI and governance policies in your AI development process, focusing on customer safety and security.
  • Continuously monitor the performance of your applications and promptly address any issues that arise.
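On the point about keeping secrets and credentials tightly controlled, one common practice is to read API keys from the environment instead of hard-coding them in source files. The sketch below is a generic pattern rather than a Cohere-specific API; the environment variable name `COHERE_API_KEY` and the helpers `load_api_key` and `mask` are illustrative assumptions.

```python
import os


def load_api_key(env_var="COHERE_API_KEY"):
    """Read an API key from the environment and fail loudly if it is missing.

    The default variable name is a placeholder chosen for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before starting the application"
        )
    return key


def mask(secret, visible=4):
    """Return a log-safe form of a secret that shows only its first characters."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)
```

Keeping the key out of the codebase means it never lands in version control, and logging only `mask(key)` avoids leaking it through application logs.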

We also regularly share insights and best practices on AI security on our blog. Here are a few examples: 1, 2, 3.

If there’s anything not covered in this document, you’re welcome to reach out to us using this form.
