Migrating Monolithic Prompts to Command-R with RAG
Command-R is a powerful LLM optimized for long-context tasks such as retrieval-augmented generation (RAG). Migrating a monolithic task such as question answering or query-focused summarization to RAG can improve the quality of responses: grounding reduces hallucination and makes answers more concise.
Previously, migrating an existing use case to RAG involved a lot of manual work: indexing documents, implementing at least a basic search strategy, extensive post-processing to introduce proper grounding through citations, and of course fine-tuning an LLM to work well in the RAG paradigm.
This cookbook demonstrates automatic migration of monolithic prompts through two diverse use cases where an original prompt is broken down into two parts: (1) context; and (2) instructions. Extracting the context can be done automatically or through simple chunking, while the instructions are rewritten automatically by Command-R through single-shot prompt optimization.
The two use cases demonstrated here are:
- Autobiography Assistant
- Legal Question Answering
Autobiography Assistant
This application scenario is a common LLM-as-assistant use case: given some context, help the user to complete a task. In this case, the task is to write a concise autobiographical summary.
Using Command-R, we can automatically upgrade the original prompt to a RAG-style prompt and get more faithful adherence to the instructions, a clearer and more concise prompt, and in-line citations for free. Consider a meta-prompt along these lines:
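A minimal sketch of such a meta-prompt, assuming the Cohere Python SDK; the `original_prompt` placeholder and the exact wording of the meta-prompt below are illustrative:

```python
import cohere

co = cohere.Client("<YOUR_API_KEY>")

# Placeholder for the monolithic prompt: biographical details followed by
# writing instructions, all in one string.
original_prompt = "..."

meta_prompt = f"""Below is a task for an LLM, delimited with ## Task.
Convert the task into a RAG-style task by:
1. Splitting the context into a list of documents, each with a "title" and a "snippet".
2. Rewriting the instructions into a single concise prompt that references those documents.
Return a JSON object with the keys "documents" and "instructions".

## Task
{original_prompt}"""

response = co.chat(model="command-r", message=meta_prompt)
print(response.text)
```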
Command-R returns a JSON response containing the rewritten instructions and the extracted documents.
To extract the returned information, we write two simple helper functions: one to post-process the JSON out of the response and one to parse it.
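A minimal sketch of those helpers, assuming Command-R wraps the JSON in a markdown code fence (falling back to the raw text if it does not):

```python
import json
import re

def extract_json(text: str) -> str:
    """Pull the JSON blob out of the model response, stripping any markdown fence."""
    match = re.search(r"`{3}(?:json)?\s*(\{.*\})\s*`{3}", text, re.DOTALL)
    return match.group(1) if match else text

def parse_rag_prompt(text: str):
    """Parse the extracted JSON into (instructions, documents)."""
    data = json.loads(extract_json(text))
    return data["instructions"], data["documents"]

instructions, docs = parse_rag_prompt(response.text)
```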
As we can see above, the new prompt is much more concise and gets right to the point. The context has been split into four "documents" that Command-R can ground its response in. Now let's run the same task with the new prompt while leveraging the `documents=` parameter. Note that the `docs` variable is a list of `dict` objects, each with a `title` describing the contents of a text and a `snippet` containing the text itself:
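With the Cohere Python SDK this is a single chat call; a sketch:

```python
# Each document is a dict with a descriptive "title" and the grounding text in "snippet".
response = co.chat(
    model="command-r",
    message=instructions,
    documents=docs,
)
print(response.text)
```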
The response is concise. More importantly, we can guard against hallucination because the text is automatically grounded in the input documents. Using the simple function below, we can add this grounding information to the text as citations:
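A minimal sketch of that function, relying on the `start`, `end`, and `document_ids` fields the SDK returns on each citation:

```python
def insert_citations(text: str, citations) -> str:
    """Annotate the response text with [doc id] markers derived from the citation spans."""
    # Walk the citations in reverse so earlier character offsets stay valid as we insert.
    for citation in sorted(citations, key=lambda c: c.start, reverse=True):
        marker = "[" + ", ".join(citation.document_ids) + "]"
        text = text[: citation.end] + marker + text[citation.end :]
    return text

print(insert_citations(response.text, response.citations))
```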
Now let’s move on to an arguably more difficult problem.
Legal Question Answering
On March 21, 2024, the DOJ announced that it was suing Apple for anti-competitive practices. The complaint is 88 pages long and consists of about 230 numbered paragraphs of text. To understand what the suit alleges, a common use case would be to ask for a summary. Because Command-R has a context window of 128K tokens, even an 88-page legal complaint fits comfortably within it.
We can set up a prompt template that allows us to ask questions about the original text.
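One possible template; the file path and wording below are illustrative:

```python
# Load the complaint text (the path is illustrative).
with open("apple_complaint.txt") as f:
    complaint = f.read()

prompt_template = """# Legal Complaint
{complaint}

# Question
{question}"""

rendered = prompt_template.format(
    complaint=complaint,
    question="Summarize the allegations the DOJ makes in this complaint.",
)
response = co.chat(model="command-r", message=rendered)
print(response.text)
```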
The summary seems clear enough. But we are interested in the specific allegations that the DOJ makes. For example, skimming the full complaint suggests the DOJ alleges that Apple could encrypt text messages sent to Android phones if it wanted to do so. We can amend the rendered prompt and ask:
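For example, rendering the template with the new question (the phrasing is illustrative):

```python
question = (
    "Does the DOJ allege that Apple could encrypt text messages "
    "sent to Android phones if it wanted to?"
)
rendered = prompt_template.format(complaint=complaint, question=question)
response = co.chat(model="command-r", message=rendered)
print(response.text)
```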
This is a very interesting allegation that, at first glance, suggests the model could be hallucinating. Because RAG reduces hallucinations by grounding responses in the input text, we should convert this prompt to the RAG paradigm to gain confidence in the response.
While previously we asked Command-R to chunk the text for us, the legal complaint is highly structured with numbered paragraphs, so we can use the following function to break the complaint into input docs ready for RAG:
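A sketch of such a function, assuming each paragraph starts on a new line with its number followed by a period:

```python
import re

def chunk_complaint(text: str) -> list[dict]:
    """Split the complaint on its numbered paragraphs and build title/snippet docs."""
    # Split at line starts that look like "144. " while keeping the paragraph number.
    parts = re.split(r"\n(?=\d{1,3}\.\s)", text)
    docs = []
    for part in parts:
        match = re.match(r"(\d{1,3})\.\s", part)
        if match:
            docs.append({"title": f"Paragraph {match.group(1)}", "snippet": part.strip()})
    return docs

chunks = chunk_complaint(complaint)
```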
We can now try the same question but ask it directly to Command-R with the chunks as grounding information.
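Using the same `documents=` parameter as before:

```python
response = co.chat(
    model="command-r",
    message=question,
    documents=chunks,
)
print(response.text)
```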
The responses seem similar, but we should add citations and check them to gain confidence in the response.
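We can reuse the `insert_citations` helper from the first use case:

```python
print(insert_citations(response.text, response.citations))
```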
The most important passage seems to be paragraph 144. Paragraph 93 is also cited. Let’s check what they contain.
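Because each chunk's title carries its paragraph number, looking them up is straightforward:

```python
for doc in chunks:
    if doc["title"] in ("Paragraph 144", "Paragraph 93"):
        print(doc["snippet"], "\n")
```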
Paragraph 144 indeed contains the important allegation: If Apple wanted to, Apple could allow iPhone users to send encrypted messages to Android users.
In this cookbook, we have shown how to take an existing monolithic prompt and migrate it to the RAG paradigm, yielding less hallucination, grounded information, and in-line citations. We also demonstrated Command-R's ability to rewrite an instruction prompt in a single shot, making it more concise and potentially leading to higher-quality completions.