Summarizing Text

Text summarization distills essential information and generates concise snippets from dense documents. With Cohere, you can do text summarization via the Chat endpoint.

The Command R family of models (R and R+) supports 128k context length, so you can pass long documents to be summarized.

Basic summarization

You can perform text summarization with a simple prompt asking the model to summarize a piece of text.

PYTHON
import cohere

# Replace the placeholder with your own API key.
co = cohere.Client(api_key="YOUR_API_KEY")

document = """Equipment rental in North America is predicted to “normalize” going into 2024,
according to Josh Nickell, vice president of equipment rental for the American Rental
Association (ARA).
“Rental is going back to ‘normal,’ but normal means that strategy matters again -
geography matters, fleet mix matters, customer type matters,” Nickell said. “In
late 2020 to 2022, you just showed up with equipment and you made money.
“Everybody was breaking records, from the national rental chains to the smallest
rental companies; everybody was having record years, and everybody was raising
prices. The conversation was, ‘How much are you up?’ And now, the conversation
is changing to ‘What’s my market like?’”
Nickell stressed this shouldn’t be taken as a pessimistic viewpoint. It’s simply
coming back down to Earth from unprecedented circumstances during the time of Covid.
Rental companies are still seeing growth, but at a more moderate level."""

response = co.chat(message=f"Generate a concise summary of this text\n{document}").text

print(response)

(NOTE: Here, we are passing the document as a variable, but you can also just copy the document directly into the prompt and ask Chat to summarize it.)

Here’s a sample output:

The equipment rental market in North America is expected to normalize by 2024,
according to Josh Nickell of the American Rental Association. This means a shift
from the unprecedented growth of 2020-2022, where demand and prices were high,
to a more strategic approach focusing on geography, fleet mix, and customer type.
Rental companies are still experiencing growth, but at a more moderate and sustainable level.

Length control

You can further control the output by defining the length of the summary in your prompt. For example, you can specify the number of sentences to be generated.

PYTHON
response = co.chat(message=f"Summarize this text in one sentence\n{document}").text

print(response)

And here’s what a sample of the output might look like:

The equipment rental market in North America is expected to stabilize in 2024,
with a focus on strategic considerations such as geography, fleet mix, and
customer type, according to Josh Nickell of the American Rental Association (ARA).

You can also specify the length in terms of word count.

PYTHON
response = co.chat(message=f"Summarize this text in less than 10 words\n{document}").text

print(response)

Sample output:

Rental equipment supply and demand to balance.

(Note: While the model is generally good at adhering to length instructions, due to the nature of LLMs, we do not guarantee that the exact word, sentence, or paragraph numbers will be generated.)
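
Because length instructions are not guaranteed to be followed exactly, it can help to check the returned summary before using it. The following is a minimal sketch of such a check. The helper names `count_words` and `count_sentences` are hypothetical, and the sentence splitter is a naive regular expression that will miscount text containing abbreviations such as "e.g.".

```python
import re


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def count_sentences(text: str) -> int:
    """Naively count sentences by splitting after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return len([part for part in parts if part])


# Sample summary taken from the word-count example above.
summary = "Rental equipment supply and demand to balance."

print(count_words(summary))
print(count_sentences(summary))
```

If a summary exceeds the requested length, one option is to call the Chat endpoint again with a stricter instruction.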

Format control

Instead of generating summaries as paragraphs, you can also prompt the model to generate the summary as bullet points.

PYTHON
response = co.chat(message=f"Generate a concise summary of this text as bullet points\n{document}").text

print(response)

Sample output:

- Equipment rental in North America is expected to "normalize" by 2024, according to Josh Nickell
of the American Rental Association (ARA).
- This "normalization" means a return to strategic focus on factors like geography, fleet mix,
and customer type.
- In the past two years, rental companies easily made money and saw record growth due to the
unique circumstances of the Covid pandemic.
- Now, the focus is shifting from universal success to varying market conditions and performance.
- Nickell's outlook is not pessimistic; rental companies are still growing, but at a more
sustainable and moderate pace.
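
If you want to use a bulleted summary programmatically, you may need to turn the returned text into a list of items. The following is a minimal sketch, assuming the model marks each bullet with "- ", "* ", or "• " and that unmarked lines continue the previous bullet. The helper name `parse_bullets` is hypothetical, and real model output may use other markers.

```python
def parse_bullets(summary: str) -> list[str]:
    """Split a bulleted summary into items, joining wrapped continuation lines."""
    items: list[str] = []
    for line in summary.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith(("- ", "* ", "• ")):
            # A new bullet: drop the two-character marker.
            items.append(stripped[2:].strip())
        elif items:
            # A line without a marker continues the previous item.
            items[-1] += " " + stripped
        else:
            # Text before the first bullet is kept as its own item.
            items.append(stripped)
    return items


# Shortened excerpt of the sample output above.
sample = """- Equipment rental in North America is expected to "normalize" by 2024, according to Josh Nickell
of the American Rental Association (ARA).
- Nickell's outlook is not pessimistic; rental companies are still growing, but at a more
sustainable and moderate pace."""

for item in parse_bullets(sample):
    print(item)
```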

Grounded summarization

Another approach to summarization is retrieval-augmented generation (RAG). Instead of embedding the document in the prompt, you pass it to the Chat endpoint call as a list of document chunks.

This approach allows you to take advantage of the citations generated by the endpoint, which means you can get a grounded summary of the document. Each grounded summary includes fine-grained citations linking to the source documents, making the response easily verifiable and building trust with the user.

Here is a chunked version of the document. (We don’t cover the chunking process here, but if you’d like to learn more, see this cookbook on chunking strategies.)

PYTHON
document_chunked = [
    {"text": "Equipment rental in North America is predicted to “normalize” going into 2024, according to Josh Nickell, vice president of equipment rental for the American Rental Association (ARA)."},
    {"text": "“Rental is going back to ‘normal,’ but normal means that strategy matters again - geography matters, fleet mix matters, customer type matters,” Nickell said. “In late 2020 to 2022, you just showed up with equipment and you made money."},
    {"text": "“Everybody was breaking records, from the national rental chains to the smallest rental companies; everybody was having record years, and everybody was raising prices. The conversation was, ‘How much are you up?’ And now, the conversation is changing to ‘What’s my market like?’”"},
]
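
Chunking strategies are out of scope for this page, but as a rough illustration, a minimal sketch of paragraph-based chunking could look like the following. It assumes paragraphs are separated by blank lines, which is not true of every document. The helper name `chunk_by_paragraph` and its `paragraphs_per_chunk` parameter are hypothetical; only the output shape (a list of dictionaries with a "text" key) follows the chunked document shown above.

```python
import re


def chunk_by_paragraph(text: str, paragraphs_per_chunk: int = 1) -> list[dict]:
    """Split text on blank lines and group paragraphs into {"text": ...} chunks."""
    if paragraphs_per_chunk < 1:
        raise ValueError("paragraphs_per_chunk must be at least 1")
    # Collapse line wraps inside each paragraph into single spaces.
    paragraphs = [
        " ".join(p.split()) for p in re.split(r"\n\s*\n", text) if p.strip()
    ]
    chunks = []
    for i in range(0, len(paragraphs), paragraphs_per_chunk):
        group = paragraphs[i : i + paragraphs_per_chunk]
        chunks.append({"text": " ".join(group)})
    return chunks


article = """First paragraph, which wraps
across two lines.

Second paragraph.

Third paragraph."""

for chunk in chunk_by_paragraph(article):
    print(chunk)
```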

It also helps to create a custom preamble that primes the model for the task, telling it that it will receive a series of text fragments from a document presented in chronological order.

PYTHON
preamble = """## Task & Context
You will receive a series of text fragments from a document that are presented in chronological order. \
As the assistant, you must generate responses to user's requests based on the information given in the fragments. \
Ensure that your responses are accurate and truthful, and that you reference your sources where appropriate to answer \
the queries, regardless of their complexity."""

Other than the custom preamble, the only change to the Chat endpoint call is passing the documents parameter, which contains the list of document chunks.

Aside from displaying the actual summary (response.text), we can display the citations as well (response.citations). The citations are a list of specific passages in the response that cite the documents the model received.

PYTHON
response = co.chat(message="Summarize this text in two sentences.", preamble=preamble, documents=document_chunked)
print(response.text)

# Print citations (if any)
if response.citations:
    print("\nCitations:")
    for citation in response.citations:
        print(citation)
    print("\nCited Documents:")
    # Use a distinct loop variable so the earlier `document` string is not overwritten.
    for cited_document in response.documents:
        print(cited_document)

Sample output:

Josh Nickell, vice president of the American Rental Association, predicts that equipment rental in North America will "normalize" by 2024. This means that factors like geography, fleet mix, and customer type will influence success in the market.

Citations:
start=0 end=4 text='Josh' document_ids=['doc_0']
start=5 end=12 text='Nickell' document_ids=['doc_0', 'doc_1']
start=14 end=63 text='vice president of the American Rental Association' document_ids=['doc_0']
start=79 end=112 text='equipment rental in North America' document_ids=['doc_0']
start=118 end=129 text='"normalize"' document_ids=['doc_0', 'doc_1']
start=133 end=138 text='2024.' document_ids=['doc_0']
start=168 end=245 text='geography, fleet mix, and customer type will influence success in the market.' document_ids=['doc_1']

Cited Documents:
{'id': 'doc_0', 'text': 'Equipment rental in North America is predicted to “normalize” going into 2024, according to Josh Nickell, vice president of equipment rental for the American Rental Association (ARA).'}
{'id': 'doc_1', 'text': '“Rental is going back to ‘normal,’ but normal means that strategy matters again - geography matters, fleet mix matters, customer type matters,” Nickell said. “In late 2020 to 2022, you just showed up with equipment and you made money.'}
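
The start and end offsets in each citation index into the summary text, so they can be used to show sources inline. The following is a minimal sketch of that idea. The `Citation` dataclass is a stand-in defined here so the example runs without an API call; it only mirrors the four fields visible in the output above. The helper name `annotate` is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    """Stand-in for a citation with the fields shown above (an assumption for this sketch)."""
    start: int
    end: int
    text: str
    document_ids: list


def annotate(text: str, citations: list) -> str:
    """Insert a [doc_id, ...] marker after each cited span."""
    # Work backwards from the end of the text so earlier offsets stay valid.
    for c in sorted(citations, key=lambda c: c.end, reverse=True):
        marker = "[" + ", ".join(c.document_ids) + "]"
        text = text[: c.end] + marker + text[c.end :]
    return text


# First sentence and first three citations from the sample output above.
summary = (
    "Josh Nickell, vice president of the American Rental Association, predicts "
    'that equipment rental in North America will "normalize" by 2024.'
)
citations = [
    Citation(0, 4, "Josh", ["doc_0"]),
    Citation(5, 12, "Nickell", ["doc_0", "doc_1"]),
    Citation(14, 63, "vice president of the American Rental Association", ["doc_0"]),
]

print(annotate(summary, citations))
```

With a real response, you would pass response.text and response.citations instead of the stand-in values, assuming the citation objects expose the same attribute names.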

Migrating from Generate to Chat Endpoint

This guide outlines how to migrate from Generate to Chat; the biggest difference is simply the need to replace the prompt argument with message, but there’s also no model default, so you’ll have to specify a model.

PYTHON
# Before

co.generate(
    prompt="""Write a short summary from the following text in bullet point format, in different
    words.

    Equipment rental in North America is predicted to “normalize” going into 2024, according to Josh Nickell, vice president of equipment rental for the American Rental Association (ARA).
    “Rental is going back to ‘normal,’ but normal means that strategy matters again - geography matters, fleet mix matters, customer type matters,” Nickell said. “In late 2020 to 2022, you just showed up with equipment and you made money.
    “Everybody was breaking records, from the national rental chains to the smallest rental companies; everybody was having record years, and everybody was raising prices. The conversation was, ‘How much are you up?’ And now, the conversation is changing to ‘What’s my market like?’”
    Nickell stressed this shouldn’t be taken as a pessimistic viewpoint. It’s simply coming back down to Earth from unprecedented circumstances during the time of Covid. Rental companies are still seeing growth, but at a more moderate level.
    """
)

# After
co.chat(
    message="""Write a short summary from the following text in bullet point format,
    in different words.

    Equipment rental in North America is predicted to “normalize” going into 2024, according to Josh Nickell, vice president of equipment rental for the American Rental Association (ARA).
    “Rental is going back to ‘normal,’ but normal means that strategy matters again - geography matters, fleet mix matters, customer type matters,” Nickell said. “In late 2020 to 2022, you just showed up with equipment and you made money.
    “Everybody was breaking records, from the national rental chains to the smallest rental companies; everybody was having record years, and everybody was raising prices. The conversation was, ‘How much are you up?’ And now, the conversation is changing to ‘What’s my market like?’”
    Nickell stressed this shouldn’t be taken as a pessimistic viewpoint. It’s simply coming back down to Earth from unprecedented circumstances during the time of Covid. Rental companies are still seeing growth, but at a more moderate level.
    """,
    model="command-r-plus"
)

Migrating from Summarize to Chat Endpoint

To use the Command R/R+ models for summarization, we recommend using the Chat endpoint. This guide outlines how to migrate from the Summarize endpoint to the Chat endpoint.

PYTHON
# Before

co.summarize(
    format="bullets",
    length="short",
    extractiveness="low",
    text="""Equipment rental in North America is predicted to “normalize” going into 2024, according
    to Josh Nickell, vice president of equipment rental for the American Rental Association (ARA).
    “Rental is going back to ‘normal,’ but normal means that strategy matters again - geography
    matters, fleet mix matters, customer type matters,” Nickell said. “In late 2020 to 2022, you
    just showed up with equipment and you made money.
    “Everybody was breaking records, from the national rental chains to the smallest rental companies;
    everybody was having record years, and everybody was raising prices. The conversation was, ‘How
    much are you up?’ And now, the conversation is changing to ‘What’s my market like?’”
    Nickell stressed this shouldn’t be taken as a pessimistic viewpoint. It’s simply coming back
    down to Earth from unprecedented circumstances during the time of Covid. Rental companies are
    still seeing growth, but at a more moderate level.
    """
)

# After
co.chat(
    message="""Write a short summary from the following text in bullet point format, in different words.

    Equipment rental in North America is predicted to “normalize” going into 2024, according to Josh
    Nickell, vice president of equipment rental for the American Rental Association (ARA).
    “Rental is going back to ‘normal,’ but normal means that strategy matters again - geography
    matters, fleet mix matters, customer type matters,” Nickell said. “In late 2020 to 2022, you just
    showed up with equipment and you made money.
    “Everybody was breaking records, from the national rental chains to the smallest rental companies;
    everybody was having record years, and everybody was raising prices. The conversation was,
    ‘How much are you up?’ And now, the conversation is changing to ‘What’s my market like?’”
    Nickell stressed this shouldn’t be taken as a pessimistic viewpoint. It’s simply coming back
    down to Earth from unprecedented circumstances during the time of Covid. Rental companies are
    still seeing growth, but at a more moderate level.
    """,
    model="command-r-plus"
)
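
In the example above, the Summarize settings (format, length, extractiveness) are expressed as plain-language instructions in the Chat message. If you migrate many calls, a small helper can build that message for you. The following is a minimal sketch: the helper name `build_summary_prompt`, its parameters, and the phrase mapping are assumptions made for illustration, not part of the Cohere SDK. With its defaults it reproduces the instruction used in the "After" call.

```python
# Hypothetical mapping from a format option to prompt wording.
FORMAT_PHRASES = {
    "paragraph": "as a paragraph",
    "bullets": "in bullet point format",
}
# Length words accepted by this sketch.
LENGTHS = ("short", "medium", "long")


def build_summary_prompt(
    text: str, fmt: str = "bullets", length: str = "short", rephrase: bool = True
) -> str:
    """Build a Chat message that asks for a summary with the given options."""
    if fmt not in FORMAT_PHRASES:
        raise ValueError(f"Unsupported format: {fmt!r}")
    if length not in LENGTHS:
        raise ValueError(f"Unsupported length: {length!r}")
    instruction = f"Write a {length} summary from the following text {FORMAT_PHRASES[fmt]}"
    if rephrase:
        # Rough stand-in for a low extractiveness setting.
        instruction += ", in different words"
    return f"{instruction}.\n\n{text}"


prompt = build_summary_prompt("Equipment rental in North America is predicted to normalize.")
print(prompt)

# The result would then be passed to the Chat endpoint, for example:
# co.chat(message=prompt, model="command-r-plus")
```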