Introduction to Text Generation at Cohere

Large language models are impressive for many reasons, but among the most prominent is their ability to quickly generate text. With just a little bit of prompting, they can crank out conceptual explanations, blog posts, web copy, poetry, and almost anything else. Their style can be tweaked to be suitable for children and adults, technical people and laymen, and they can be asked to work in dozens of different natural languages.

In this article, we’ll cover some of the basics of what makes this functionality possible. If you’d like to skip straight to a more hands-on treatment, check out “Using the Chat API.”

How are Large Language Models Trained?

Eliding a great deal of technical complexity, a large language model is just a neural network trained to predict the next token, given the tokens that have come before. Take a sentence like “Hang on, I need to go inside and grab my ___.” As a human being with a great deal of experience using natural language, you can make some reasonable guesses about which token will complete this sentence even with no additional context:

  • “Hang on, I need to go inside and grab my bag.”
  • “Hang on, I need to go inside and grab my keys.”
  • Etc.

Of course, there are other possibilities that are plausible, but less likely:

  • “Hang on, I need to go inside and grab my friend.”
  • “Hang on, I need to go inside and grab my book.”

And there’s a long tail of possibilities that are grammatically correct but effectively never occur in a real exchange:

  • “Hang on, I need to go inside and grab my giraffe.”

You have an intuitive sense of how a sentence like this will end because you’ve been using language all your life. A model like Command R+ must learn how to perform the same feat by seeing billions of token sequences and figuring out a statistical distribution over them that allows it to predict what comes next.
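
To make the idea of a "statistical distribution over token sequences" a little more concrete, here is a minimal sketch that estimates next-token probabilities by counting word pairs in a tiny corpus. This is a toy illustration only: the corpus and the `next_token_distribution` helper are hypothetical, words stand in for tokens, and a model like Command R+ learns its distribution with a neural network rather than a lookup table of counts.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus; a real model sees billions of token sequences.
CORPUS = [
    "i need to grab my keys",
    "i need to grab my bag",
    "i need to grab my keys",
    "i need to grab my book",
]


def next_token_distribution(corpus, context):
    """Estimate P(next token | previous token) by counting adjacent pairs."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()  # whitespace split stands in for a tokenizer
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    following = counts[context]
    total = sum(following.values())
    # An unseen context has no counts, so this yields an empty dict.
    return {tok: n / total for tok, n in following.items()}


dist = next_token_distribution(CORPUS, "my")
print(dist)  # {'keys': 0.5, 'bag': 0.25, 'book': 0.25}
```

Even in this tiny example, "keys" comes out as more likely than "book," and "giraffe" never appears at all, which mirrors the intuition described above.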

Once it’s done so, it can take a prompt like “Help me generate some titles for a blog post about quantum computing,” and use the distribution it has learned to generate the series of tokens it thinks would follow such a request. Since it’s an AI system generating tokens, it’s known as “generative AI,” and with models as powerful as Cohere’s, the results are often surprisingly good.
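
Generation can be pictured as repeatedly sampling the next token from such a distribution and feeding the result back in. The sketch below assumes a hand-written probability table (`NEXT_TOKEN_PROBS`) and made-up `<start>` and `<end>` markers; none of these names come from Cohere's API, and a real model conditions on the whole preceding sequence rather than only the last token.

```python
import random

# Hypothetical hand-written distributions standing in for what a trained
# model would learn; each entry maps a token to P(next token | token).
NEXT_TOKEN_PROBS = {
    "<start>": {"grab": 1.0},
    "grab": {"my": 1.0},
    "my": {"keys": 0.6, "bag": 0.3, "giraffe": 0.1},
    "keys": {"<end>": 1.0},
    "bag": {"<end>": 1.0},
    "giraffe": {"<end>": 1.0},
}


def generate(probs, rng, max_tokens=10):
    """Sample one token at a time until <end> or max_tokens is reached."""
    token, output = "<start>", []
    for _ in range(max_tokens):
        candidates = probs[token]
        # Pick the next token in proportion to its probability.
        token = rng.choices(
            list(candidates), weights=list(candidates.values())
        )[0]
        if token == "<end>":
            break
        output.append(token)
    return output


rng = random.Random(0)  # seeded so repeated runs give the same sample
print(" ".join(generate(NEXT_TOKEN_PROBS, rng)))
```

Because the final word is sampled, different seeds can produce "grab my keys," "grab my bag," or, occasionally, "grab my giraffe."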

Learn More

The rest of the “Text Generation” section of our documentation walks you through how to work with Cohere’s models. Check out “Using the Chat API” to get set up and understand what a response looks like, or read the streaming guide to figure out how to integrate generative AI into streaming applications.

You might also benefit from reading the retrieval-augmented generation, tool-use, and agent-building guides.