Summarizing Text

Text summarization distills essential information and generates concise snippets from dense documents. With Cohere, you can do text summarization via the Chat endpoint.

The Command R family of models (R and R+) supports 128k context length, so you can pass long documents to be summarized.

Basic summarization

You can perform text summarization with a simple prompt asking the model to summarize a piece of text.

PYTHON
1import cohere
2co = cohere.ClientV2(api_key="<YOUR API KEY>")
3
4document = """Equipment rental in North America is predicted to “normalize” going into 2024,
5according to Josh Nickell, vice president of equipment rental for the American Rental
6Association (ARA).
7“Rental is going back to ‘normal,’ but normal means that strategy matters again -
8geography matters, fleet mix matters, customer type matters,” Nickell said. “In
9late 2020 to 2022, you just showed up with equipment and you made money.
10“Everybody was breaking records, from the national rental chains to the smallest
11rental companies; everybody was having record years, and everybody was raising
12prices. The conversation was, ‘How much are you up?’ And now, the conversation
13is changing to ‘What’s my market like?’”
14Nickell stressed this shouldn’t be taken as a pessimistic viewpoint. It’s simply
15coming back down to Earth from unprecedented circumstances during the time of Covid.
16Rental companies are still seeing growth, but at a more moderate level."""
17
18message = f"Generate a concise summary of this text\n{document}"
19
20response = co.chat(
21 model="command-r-plus-08-2024",
22 messages=[{"role": "user", "content": message}]
23)
24
25
26print(response.message.content[0].text)

(NOTE: Here, we are passing the document as a variable, but you can also just copy the document directly into the message and ask Chat to summarize it.)

Here’s a sample output:

The equipment rental market in North America is expected to normalize by 2024,
according to Josh Nickell of the American Rental Association. This means a shift
from the unprecedented growth of 2020-2022, where demand and prices were high,
to a more strategic approach focusing on geography, fleet mix, and customer type.
Rental companies are still experiencing growth, but at a more moderate and sustainable level.

Length control

You can further control the output by defining the length of the summary in your prompt. For example, you can specify the number of sentences to be generated.

PYTHON
1message = f"Summarize this text in one sentence\n{document}"
2
3response = co.chat(
4 model="command-r-plus-08-2024",
5 messages=[{"role": "user", "content": message}]
6)
7
8print(response.message.content[0].text)

And here’s what a sample of the output might look like:

The equipment rental market in North America is expected to stabilize in 2024,
with a focus on strategic considerations such as geography, fleet mix, and
customer type, according to Josh Nickell of the American Rental Association (ARA).

You can also specify the length in terms of word count.

PYTHON
1message = f"Summarize this text in less than 10 words\n{document}"
2
3response = co.chat(
4 model="command-r-plus-08-2024",
5 messages=[{"role": "user", "content": message}]
6)
7
8print(response.message.content[0].text)
Rental equipment supply and demand to balance.

(Note: While the model is generally good at adhering to length instructions, due to the nature of LLMs, we do not guarantee that the exact word, sentence, or paragraph numbers will be generated.)

Format control

Instead of generating summaries as paragraphs, you can also prompt the model to generate the summary as bullet points.

PYTHON
1message = f"Generate a concise summary of this text as bullet points\n{document}"
2
3response = co.chat(
4 model="command-r-plus-08-2024",
5 messages=[{"role": "user", "content": message}]
6)
7
8print(response.message.content[0].text)
- Equipment rental in North America is expected to "normalize" by 2024, according to Josh Nickell
of the American Rental Association (ARA).
- This "normalization" means a return to strategic focus on factors like geography, fleet mix,
and customer type.
- In the past two years, rental companies easily made money and saw record growth due to the
unique circumstances of the Covid pandemic.
- Now, the focus is shifting from universal success to varying market conditions and performance.
- Nickell's outlook is not pessimistic; rental companies are still growing, but at a more
sustainable and moderate pace.

Grounded summarization

Another approach to summarization is using retrieval-augmented generation (RAG). Here, you can instead pass the document as a chunk of documents to the Chat endpoint call.

This approach allows you to take advantage of the citations generated by the endpoint, which means you can get a grounded summary of the document. Each grounded summary includes fine-grained citations linking to the source documents, making the response easily verifiable and building trust with the user.

Here is a chunked version of the document. (we don’t cover the chunking process here, but if you’d like to learn more, see this cookbook on chunking strategies.)

PYTHON
1document_chunked = [
2 {
3 "data": {
4 "text": "Equipment rental in North America is predicted to “normalize” going into 2024, according to Josh Nickell, vice president of equipment rental for the American Rental Association (ARA)."
5 }
6 },
7 {
8 "data": {
9 "text": "“Rental is going back to ‘normal,’ but normal means that strategy matters again - geography matters, fleet mix matters, customer type matters,” Nickell said. “In late 2020 to 2022, you just showed up with equipment and you made money."
10 }
11 },
12 {
13 "data": {
14 "text": "“Everybody was breaking records, from the national rental chains to the smallest rental companies; everybody was having record years, and everybody was raising prices. The conversation was, ‘How much are you up?’ And now, the conversation is changing to ‘What’s my market like?’”"
15 }
16 },
17]

It also helps to create a custom system message to prime the model about the task—that it will receive a series of text fragments from a document presented in chronological order.

PYTHON
1system_message = """## Task and Context
2You will receive a series of text fragments from a document that are presented in chronological order. As the assistant, you must generate responses to user's requests based on the information given in the fragments. Ensure that your responses are accurate and truthful, and that you reference your sources where appropriate to answer the queries, regardless of their complexity."""

Other than the custom system message, the only change to the Chat endpoint call is passing the document parameter containing the list of document chunks.

Aside from displaying the actual summary, we can display the citations as as well. The citations are a list of specific passages in the response that cite from the documents that the model receives.

PYTHON
1message = f"Summarize this text in one sentence."
2
3response = co.chat(
4 model="command-r-plus-08-2024",
5 documents=document_chunked,
6 messages=[
7 {"role": "system", "content": system_message},
8 {"role": "user", "content": message},
9 ],
10)
11
12print(response.message.content[0].text)
13
14if response.message.citations:
15 print("\nCITATIONS:")
16 for citation in response.message.citations:
17 print(
18 f"Start: {citation.start} | End: {citation.end} | Text: '{citation.text}'",
19 end="",
20 )
21 if citation.sources:
22 for source in citation.sources:
23 print(f"| {source.id}")
Josh Nickell, vice president of the American Rental Association, predicts that equipment rental in North America will "normalize" in 2024, requiring companies to focus on strategy, geography, fleet mix, and customer type.
CITATIONS:
Start: 0 | End: 12 | Text: 'Josh Nickell'| doc:1:0
Start: 14 | End: 63 | Text: 'vice president of the American Rental Association'| doc:1:0
Start: 79 | End: 112 | Text: 'equipment rental in North America'| doc:1:0
Start: 118 | End: 129 | Text: '"normalize"'| doc:1:0
| doc:1:1
Start: 133 | End: 137 | Text: '2024'| doc:1:0
Start: 162 | End: 221 | Text: 'focus on strategy, geography, fleet mix, and customer type.'| doc:1:1
| doc:1:2

Migration from Summarize to Chat Endpoint

To use the Command R/R+ models for summarization, we recommend using the Chat endpoint. This guide outlines how to migrate from the Summarize endpoint to the Chat endpoint.

PYTHON
1# Before
2
3co.summarize(
4 format="bullets",
5 length="short",
6 extractiveness="low",
7 text="""Equipment rental in North America is predicted to “normalize” going into 2024, according
8 to Josh Nickell, vice president of equipment rental for the American Rental Association (ARA).
9 “Rental is going back to ‘normal,’ but normal means that strategy matters again - geography
10 matters, fleet mix matters, customer type matters,” Nickell said. “In late 2020 to 2022, you
11 just showed up with equipment and you made money.
12 “Everybody was breaking records, from the national rental chains to the smallest rental companies;
13 everybody was having record years, and everybody was raising prices. The conversation was, ‘How
14 much are you up?’ And now, the conversation is changing to ‘What’s my market like?’”
15 Nickell stressed this shouldn’t be taken as a pessimistic viewpoint. It’s simply coming back
16 down to Earth from unprecedented circumstances during the time of Covid. Rental companies are
17 still seeing growth, but at a more moderate level.
18 """,
19)
20
21# After
22co.summarize(
23 format="bullets",
24 length="short",
25 extractiveness="low",
26 text="""Equipment rental in North America is predicted to “normalize” going into 2024, according
27 to Josh Nickell, vice president of equipment rental for the American Rental Association (ARA).
28 “Rental is going back to ‘normal,’ but normal means that strategy matters again - geography
29 matters, fleet mix matters, customer type matters,” Nickell said. “In late 2020 to 2022, you
30 just showed up with equipment and you made money.
31 “Everybody was breaking records, from the national rental chains to the smallest rental companies;
32 everybody was having record years, and everybody was raising prices. The conversation was, ‘How
33 much are you up?’ And now, the conversation is changing to ‘What’s my market like?’”
34 Nickell stressed this shouldn’t be taken as a pessimistic viewpoint. It’s simply coming back
35 down to Earth from unprecedented circumstances during the time of Covid. Rental companies are
36 still seeing growth, but at a more moderate level.
37 """,
38)