Introduction to Text Generation at Cohere

Large language models are impressive for many reasons, but among the most prominent is their ability to quickly generate text. With just a little bit of prompting, they can crank out conceptual explanations, blog posts, web copy, poetry, and almost anything else. Their style can be tweaked to be suitable for children and adults, technical people and laymen, and they can be asked to work in dozens of different natural languages.

In this article, we’ll cover some of the basics of what makes this functionality possible. If you’d like to skip straight to a more hands-on treatment, check out “Using the Chat API.”

How Are Large Language Models Trained?

Eliding a great deal of technical complexity, a large language model is just a neural network trained to predict the next token, given the tokens that have come before. Take a sentence like “Hang on, I need to go inside and grab my ___.” As a human being with a great deal of experience using natural language, you can make some reasonable guesses about which token will complete this sentence even with no additional context:

  • “Hang on, I need to go inside and grab my bag.”
  • “Hang on, I need to go inside and grab my keys.”
  • Etc.

Of course, there are other possibilities that are plausible, but less likely:

  • “Hang on, I need to go inside and grab my friend.”
  • “Hang on, I need to go inside and grab my book.”

And there’s a long tail of possibilities that are grammatically correct but effectively never occur in a real exchange:

  • “Hang on, I need to go inside and grab my giraffe.”

You have an intuitive sense of how a sentence like this will end because you’ve been using language all your life. A model like Command R+ must learn how to perform the same feat by seeing billions of token sequences and figuring out a statistical distribution over them that allows it to predict what comes next.
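To make the idea concrete, here’s a toy sketch in Python: it samples a completion for the sentence above from a hand-written probability distribution. The probabilities are invented purely for illustration; a real model like Command R+ learns a far richer distribution over its entire vocabulary from training data.

```python
import random

# Purely illustrative next-token distribution for the sentence
# "Hang on, I need to go inside and grab my ___."
# These probabilities are made up for demonstration; a real model
# learns its distribution from billions of token sequences.
next_token_probs = {
    "keys": 0.35,
    "bag": 0.25,
    "phone": 0.20,
    "book": 0.10,
    "friend": 0.08,
    "giraffe": 0.02,
}

# Sample a completion in proportion to its probability, which is the
# same basic operation a language model performs at every step.
tokens, weights = zip(*next_token_probs.items())
completion = random.choices(tokens, weights=weights, k=1)[0]
print(f"Hang on, I need to go inside and grab my {completion}.")
```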

Once it’s done so, it can take a prompt like “Help me generate some titles for a blog post about quantum computing,” and use the distribution it has learned to generate the series of tokens it thinks would follow such a request. Because the system generates new tokens in response to a prompt, it’s known as “generative AI,” and with models as powerful as Cohere’s, the results are often surprisingly good.
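As a rough sketch of what this looks like in practice, the snippet below sends that prompt through the Cohere Python SDK’s chat endpoint. The client version, model name, and API key shown here are assumptions for illustration; see “Using the Chat API” for the authoritative walkthrough.

```python
import cohere

# Assumes the Cohere Python SDK v2 client; substitute your own API key.
co = cohere.ClientV2(api_key="YOUR_API_KEY")

response = co.chat(
    model="command-r-plus",  # assumed model name, chosen for illustration
    messages=[
        {
            "role": "user",
            "content": "Help me generate some titles for a blog post about quantum computing.",
        }
    ],
)

# The reply is the sequence of tokens the model predicts should follow the request.
print(response.message.content[0].text)
```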

Learn More

The rest of the “Text Generation” section of our documentation walks you through how to work with Cohere’s models. Check out “Using the Chat API” to get set up and understand what a response looks like, or read the streaming guide to learn how to integrate generative AI into streaming applications.

You might also benefit from reading the retrieval-augmented generation, tool-use, and agent-building guides.