Generating Answers

Introduction

In this module, you’ve learned how to search and retrieve information from large databases in very effective ways. In this chapter, you’ll learn how to combine this with a generative model, in order to get an answer in sentence format, instead of a list of search results.

Large Language Models, as you know, are very good at answering questions, but they are prone to some limitations, such as incorrect information, or even hallucinations. A good way to fix this is to enhance an LLM with a search mechanism.

In short, this combination is done in the following way:
Given a query, the search mechanism retrieves one or more documents containing the answer.
These documents are given to the large language model, and it is instructed to generate an answer based on that information.

I like to imagine this the following way. If I have a question about thermodynamics, I can pick a random friend of mine, and ask them that question. They may or may not get the answer wrong. But if I go and search a few chapters in books about thermodynamics, I give them to my friend, and then I ask them to answer the question based on that, they are much more likely to answer the question correctly.

In this chapter, we'll compare two ways a large language model can answer a question. The first one is by feeding the question to the generative model, and obtaining an answer. This will generate an answer that could be correct, but it may not. This is equivalent to asking your friend a thermodynamics question.

A generative model receives a query and outputs a response. This response may be inaccurate

A generative model receives a query and outputs a response. This response may be inaccurate

The second one starts by using a search system to retrieve documents where the answer to the query is likely to appear. Then we feed the question and the documents to a generative model, and prompt it to answer the question using these documents. This yields a more accurate response.

The query is first given to a search system, which retrieves documents which are likely to contain the answer. Then the query and the documents (context) are fed to the generative model, for a more accurate response.

The query is first given to a search system, which retrieves documents which are likely to contain the answer. Then the query and the documents (context) are fed to the generative model, for a more accurate response.

Colab Notebook

This chapter uses the same Colab notebook, as the previous chapter, and we encourage you to follow it along as you read the chapter.

For the setup, please refer to the Setting Up chapter.

Generating Answers (Without Search)

Let’s first use a generative model to answer a slightly harder question — without search. We are trying to find out how many people won more than one Nobel prize. So, we ask the model the following query.

Query: “How many people have won more than one Nobel prize?”

The answer to this question is five: Marie Curie, Linus Pauling, John Bardeen, Frederick Sanger, and Karl Barry Sharpless.

The way to ask this to the model is with the following line of code, which calls the co.generate endpoint.

prediction_without_search = co.generate(
    prompt=question,
    max_tokens=50,
    num_generations=5)

Since the num_generations parameter is set to 5, that means we get five responses. The max_tokens parameter determines the length of the answer (which is why some answers appear truncated).

Responses:

  • There have been 24 Nobel Prize recipients who have won the prize more than once. Among them, John Bardeen has won the most Nobel Prizes, with four Nobel Prizes in Physics. Marie Curie has won the most Nobel Prizes among women, with the
  • There have been 25 Nobel Prize recipients who have won more than one Nobel Prize. Of these, 14 are men and 11 are women. The most Nobel Prizes won by a single person is four, which was achieved by Marie Curie, who won the
  • There are currently 18 individuals who have received more than one Nobel Prize. Of those, seven are still living.
  • There have been 24 Nobel Laureates who have won the Nobel Prize more than once.
  • There are currently 13 people who have won more than one Nobel Prize. The most Nobel Prizes won by a single person is four, which is the case for Marie Curie. She won the Nobel Prize in Physics in 1903, in 1911, and in

These answers sound like they could be correct, but they’re all wrong. One reason for this is that transformers are good at talking and understanding sentiment and nuisances of the language, etc., but not so good at storing information. As a matter of fact, storing information inside the nodes of the neural network is not something that we can (or should!) fully trust.

Instead, let’s first search for the answer using what we’ve learned in the previous sections of this post.

Searching Answers

In order to find the answer to this question in the Wikipedia dataset (the one we’ve been working with throughout this post), we can use the same dense_retrieval function that we used before. For simplicity, we will only use dense retrieval without Rerank, but we invite you to add it to the lab and see how the results improve!

responses = dense_retrieval(query, num_results=20)

This retrieves the top 20 articles, with their corresponding paragraphs. Here are the top three (remember that the search is done by finding the most similar paragraphs to the query, so some articles may appear several times with different paragraphs).

Responses:

  • Nobel Peace Prize: “, the Peace prize has been awarded to 110 individuals and 27 organizations …
  • Nobel Prize: “The strict rule against awarding a prize to more than three people is also controversial …
  • Nobel Prize: “The prize ceremonies take place annually …

Next, we’ll feed these 20 paragraphs to a generative model, and instruct it to answer the question in sentence format.

Generating an Answer from the Search Results

In order to get the generative model to answer a question based on a certain context, we need to create a prompt. And in this prompt, we need to give it a command and a context. The context will be the concatenation of all the paragraphs retrieved in the search step, which we can obtain using this line of code:

context = [r['text'] for r in responses]

The array context contains a lot of text, and, given the good results we’ve been obtaining with search mechanisms, we are fairly confident that somewhere in this text lies the answer to our original question. Now, we invoke the Generate endpoint. The prompt we’ll use is the following.

prompt = f"""
Use the information provided below to answer the questions at the end. If the answer to the question is not contained in the provided information, say "The answer is not in the context".
---
Context information:
{context}
---
Question: How many people have won more than one Nobel prize?
"""

In other words, we’ve prompted the model to answer the question, but only from information coming from the context array. And if the information is not there, we are prompting the model to state that the answer is not in the context. The following line of code will run the prompt. As before, the parameter num_generations determines the number of answers we want to output, and max_tokens controls the length of the answer.

prediction_with_search = co.generate(
    prompt=prompt,
    num_generations=5,
    max_tokens=50)

The five responses we get are the following (just like before, some of them are truncated):

  • The answer is Five people have received two Nobel Prizes.
  • The answer is Five people have received two Nobel Prizes. Marie Curie received the Physics Prize in 1903 for her work on radioactivity and the Chemistry Prize in 1911 for the isolation of pure radium, making her the only person to be awarded a Nobel
  • The answer is Five people have received two Nobel Prizes.
  • The answer is The Curie family has received the most prizes, with four prizes awarded to five individual laureates. Marie Curie received the prizes in Physics (in 1903) and Chemistry (in 1911). Her husband, Pierre Curie, shared the
  • The answer is Five people have received two Nobel Prizes. Marie Curie received the Physics Prize in 1903 for her work on radioactivity and the Chemistry Prize in 1911 for the isolation of pure radium, making her the only person to be awarded a Nobel

As you can see, this improved the quality of the answers. All of them get the right number of people who received more than one Nobel prize, which is 5.

Conclusion

Generative models are prone to hallucinations. For example, when asked a question, they may answer with an incorrect answer. In this chapter you learned to power a generative model with search, in order to generate more accurate answers and reduce the chance of hallucinations.

Original Source

This material comes from the post Using LLMs for Search with Dense Retrieval and Reranking.


What’s Next

Learn more about search with the following resources!