Documents and Citations
With retrieval augmented generation (RAG), it’s possible to feed the model context to ground its replies. Large language models are often quite good at generating sensible output on their own, but they’re well-known to hallucinate factually incorrect, nonsensical, or incomplete information in their replies, which can be problematic for certain use cases.
RAG substantially reduces this problem by giving the model source material to work with. Rather than simply generating an output based on the input prompt, the model can pull information out of this material and incorporate it into its reply.
Here’s an example of using RAG with the Chat endpoint. We’re asking the co.chat()
about penguins, and uploading documents for it to use:
Here’s an example reply:
Observe that the payload includes a list of documents with a “snippet” field containing the information we want the model to use. The recommended length for the snippet of each document is relatively short, 300 words or less. We recommend using field names similar to the ones we’ve included in this example (i.e. “title” and “snippet” ), but RAG is quite flexible with respect to how you structure the documents. You can give the fields any names you want, and can pass in other fields as well, such as a “date” field. All field names and field values are passed to the model.
Also, we can clearly see that it has utilized the document. Our first document says that Emperor penguins are the tallest penguin species, and our second says that Emperor penguins can only be found in Antarctica. The model’s reply, response.message.content[0].text
,successfully synthesizes both of these facts: “The tallest penguins, Emperor penguins, live in Antarctica.”
Finally, note that the output contains a citations object, response.message.citations
, that tells us not only which documents the model relied upon (from the sources
fields), but also the particular part of the claim supported by a particular document (with the start
and end
fields, which are spans that tell us the location of the supported claim inside the reply). This citation object is included because the model was able to use the documents provided, but if it hadn’t been able to do so, no citation object would be present.
You can experiment with RAG in the chat playground.