Summarizing Text with the Chat Endpoint
Text summarization distills essential information and generates concise snippets from dense documents. With Cohere, you can do text summarization via the Chat endpoint.
The Command R family of models (R and R+) supports a 128k-token context length, so you can pass long documents to be summarized in a single request.
Basic summarization
You can perform text summarization with a simple prompt asking the model to summarize a piece of text.
(NOTE: Here, we are passing the document as a variable, but you can also just copy the document directly into the prompt and ask Chat to summarize it.)
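As a rough sketch of what such a call could look like, the snippet below assumes the Cohere Python SDK is installed, that your API key is stored in an environment variable named CO_API_KEY, and that the model name command-r-plus is available to you. The helpers build_summary_prompt and summarize are hypothetical names used for illustration, not part of the SDK.

```python
import os


def build_summary_prompt(document: str) -> str:
    """Wrap the document in a plain summarization instruction."""
    return f"Generate a concise summary of this text\n{document}"


def summarize(document: str, model: str = "command-r-plus") -> str:
    """Send the prompt to the Chat endpoint and return the summary text."""
    import cohere  # imported lazily so the prompt helper works without the SDK

    # CO_API_KEY is an assumed environment variable name.
    co = cohere.Client(os.environ["CO_API_KEY"])
    response = co.chat(message=build_summary_prompt(document), model=model)
    return response.text
```

Calling summarize(document) with your own text would return the model's summary as a string.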
Here’s a sample output:
Length control
You can further control the output by defining the length of the summary in your prompt. For example, you can specify the number of sentences to be generated.
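One way to express a sentence limit is to build it into the prompt string. This is a minimal sketch; build_sentence_limited_prompt is a hypothetical helper, and the exact instruction wording is an assumption you can adjust.

```python
def build_sentence_limited_prompt(document: str, num_sentences: int) -> str:
    """Ask for a summary with a specific number of sentences."""
    if num_sentences < 1:
        raise ValueError("num_sentences must be at least 1")
    unit = "sentence" if num_sentences == 1 else "sentences"
    return f"Summarize this text in {num_sentences} {unit}\n{document}"
```

The resulting string is what you would pass as the message argument of the Chat endpoint call.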
And here’s what a sample of the output might look like:
You can also specify the length in terms of word count.
(NOTE: The model generally follows length instructions well, but because of how LLMs generate text, the exact number of words, sentences, or paragraphs is not guaranteed.)
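Because the length is not guaranteed, it can help to check the returned summary yourself. The sketch below pairs a word-limited prompt with a simple whitespace-based word count. The helper names and the 10% tolerance are illustrative assumptions, not recommendations from Cohere.

```python
def build_word_limited_prompt(document: str, max_words: int) -> str:
    """Ask for a summary that stays under a word budget."""
    return f"Summarize this text in less than {max_words} words\n{document}"


def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def within_limit(summary: str, max_words: int, tolerance: float = 0.1) -> bool:
    """Return True if the summary is within the budget plus a small tolerance."""
    return word_count(summary) <= max_words * (1 + tolerance)
```

If within_limit returns False, you could retry the request or tighten the instruction in the prompt.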
Format control
Instead of generating summaries as paragraphs, you can also prompt the model to generate the summary as bullet points.
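A bullet-point request can be sketched the same way. Here build_bullet_prompt and parse_bullets are hypothetical helpers; the parser assumes the model marks each bullet with "-", "*", or "•" at the start of a line, which may not always hold.

```python
def build_bullet_prompt(document: str, num_bullets: int = 4) -> str:
    """Ask for the summary as a fixed number of bullet points."""
    return (
        f"Generate a concise summary of this text as {num_bullets} bullet points\n"
        f"{document}"
    )


def parse_bullets(summary: str) -> list[str]:
    """Split a bulleted response into items, skipping lines that are not bullets."""
    items = []
    for line in summary.splitlines():
        stripped = line.strip()
        if stripped[:1] in {"-", "*", "•"}:
            items.append(stripped[1:].strip())
    return items
```

Parsing the bullets into a list makes it easier to render them in your own UI or to count how many the model returned.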
Grounded summarization
Another approach to summarization is retrieval-augmented generation (RAG). Here, instead of placing the document in the prompt, you pass it to the Chat endpoint call as a list of document chunks.
This approach allows you to take advantage of the citations generated by the endpoint, which means you can get a grounded summary of the document. Each grounded summary includes fine-grained citations linking to the source documents, making the response easily verifiable and building trust with the user.
Here is a chunked version of the document. (We don’t cover the chunking process here, but if you’d like to learn more, see this cookbook on chunking strategies.)
It also helps to create a custom preamble that primes the model for the task by telling it that it will receive a series of text fragments from a document, presented in chronological order.
Other than the custom preamble, the only change to the Chat endpoint call is passing the documents parameter containing the list of document chunks.
Aside from displaying the actual summary (response.text), we can display the citations as well (response.citations). The citations are a list of specific passages in the response, each linked to the documents it cites from among those the model received.
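The call described above might look like the following sketch. It assumes the Cohere Python SDK, an API key in a CO_API_KEY environment variable, and the command-r-plus model name. The preamble wording, the "title" and "snippet" keys, and the helper names to_documents and grounded_summary are illustrative choices rather than the guide's exact code.

```python
import os

# Illustrative preamble wording; adapt it to your own task.
PREAMBLE = (
    "You will receive a series of text fragments from a document, "
    "presented in chronological order. Summarize them faithfully."
)


def to_documents(chunks: list[str]) -> list[dict]:
    """Turn plain text chunks into the list of dicts passed as `documents`."""
    return [
        {"title": f"chunk {i}", "snippet": chunk}
        for i, chunk in enumerate(chunks)
    ]


def grounded_summary(chunks: list[str], model: str = "command-r-plus"):
    """Call the Chat endpoint with the chunks as documents and a custom preamble."""
    import cohere  # imported lazily so the helpers above work without the SDK

    # CO_API_KEY is an assumed environment variable name.
    co = cohere.Client(os.environ["CO_API_KEY"])
    return co.chat(
        message="Summarize this text in two sentences.",
        documents=to_documents(chunks),
        preamble=PREAMBLE,
        model=model,
    )
```

The returned response object carries both the summary and its citations, so it is returned whole rather than reduced to text.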
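To make the citations visible, one option is to insert the cited document IDs directly into the summary text. The sketch below assumes each citation exposes start, end, text, and document_ids attributes (character offsets into response.text). The sample text and citation are stand-ins for illustration only, not real model output.

```python
from types import SimpleNamespace


def insert_citations(text: str, citations) -> str:
    """Append [document ids] after each cited span.

    Citations are applied from right to left so earlier offsets stay valid.
    """
    annotated = text
    for c in sorted(citations or [], key=lambda c: c.start, reverse=True):
        marker = "[" + ", ".join(c.document_ids) + "]"
        annotated = annotated[: c.end] + " " + marker + annotated[c.end :]
    return annotated


# Stand-ins for response.text and response.citations (illustration only).
sample_text = "Alpha beta gamma."
sample_citations = [
    SimpleNamespace(start=0, end=10, text="Alpha beta", document_ids=["doc_0"])
]
print(insert_citations(sample_text, sample_citations))
```

With a real response you would call insert_citations(response.text, response.citations) instead of using the stand-in values.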
Migrating from Generate to Chat Endpoint
This guide outlines how to migrate from Generate to Chat; the biggest difference is simply the need to replace the prompt argument with message, but there’s also no model default, so you’ll have to specify a model.
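As a small illustration of those two differences, the sketch below converts keyword arguments written for Generate into ones suitable for Chat. The helper generate_kwargs_to_chat_kwargs is hypothetical, and it handles only the prompt rename and the explicit model; other Generate-specific arguments may need separate attention.

```python
def generate_kwargs_to_chat_kwargs(generate_kwargs: dict, model: str) -> dict:
    """Rename `prompt` to `message` and set an explicit model."""
    if "prompt" not in generate_kwargs:
        raise KeyError("expected a 'prompt' argument")
    chat_kwargs = dict(generate_kwargs)  # copy so the input is not mutated
    chat_kwargs["message"] = chat_kwargs.pop("prompt")
    chat_kwargs["model"] = model  # Chat has no model default
    return chat_kwargs


# Before: co.generate(prompt="Write a tagline")   -> response.generations[0].text
# After:  co.chat(message="Write a tagline", model="command-r-plus") -> response.text
```

The comments show the shape of the call before and after migration; the model name is an example, not a requirement.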
Migration from Summarize to Chat Endpoint
To use the Command R/R+ models for summarization, we recommend using the Chat endpoint. This guide outlines how to migrate from the Summarize endpoint to the Chat endpoint.