Improving Generated and Retrieved Knowledge Combination Through Zero-shot Generation

Xinkai Du, Quanjie Han, Chao Lv, Yan Liu, Yalin Sun, Hao Shu, Hongbo Shan, Maosong Sun

TL;DR

BRMGR tackles the label-scarce challenge of merging retrieved and LLM-generated knowledge for open-domain QA by introducing an unsupervised bi-reranking framework. It scores retrieved passages with $p(\mathbf{q}|\mathbf{rp}_j)$ and generated passages with $p(\mathbf{lp}_i|\mathbf{q})$, then forms cross-source relevance as $p(\mathbf{lp}_i,\mathbf{rp}_j|\mathbf{q}) \propto p(\mathbf{lp}_i|\mathbf{q})\, p(\mathbf{q}|\mathbf{rp}_j)$ and applies greedy matching, which is equivalent to a bipartite matching loss under a product factorization. The method leverages zero-shot generation to estimate compatibility and avoids silver-label mining. Evaluations on TriviaQA, Natural Questions, and WebQuestions show consistent gains over single-source baselines and competitive results against strong baselines in both retrieval and QA, with ablations highlighting the importance of document-generation-based reranking and the choice of PLMs.

Abstract

Open-domain Question Answering (QA) has garnered substantial interest by combining the advantages of faithfully retrieved passages and relevant passages generated by Large Language Models (LLMs). However, no definitive labels are available for pairing these two sources of knowledge. To address this issue, we propose a simple unsupervised framework called Bi-Reranking for Merging Generated and Retrieved Knowledge (BRMGR), which applies re-ranking to both retrieved passages and LLM-generated passages. We rank the two types of passages with two separate re-ranking methods and then pair them through greedy matching. We demonstrate that BRMGR is equivalent to employing a bipartite matching loss when assigning each retrieved passage a corresponding LLM-generated passage. In experiments on three datasets, our model improves performance by +1.7 and +1.6 on the NQ and WebQ datasets, respectively, and obtains comparable results on the TriviaQA dataset when compared to competitive baselines.
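The greedy matching step described in the abstract can be illustrated with a minimal sketch. It assumes the factorized score from the TL;DR, taken in the log domain as $\log p(\mathbf{lp}_i|\mathbf{q}) + \log p(\mathbf{q}|\mathbf{rp}_j)$, and it assumes the same number of generated and retrieved passages. The function name `greedy_match` and the score values are hypothetical and are not taken from the paper.

```python
from itertools import product


def greedy_match(gen_scores, ret_scores):
    """Greedily pair generated and retrieved passages (illustrative sketch).

    gen_scores[i] stands in for log p(lp_i | q).
    ret_scores[j] stands in for log p(q | rp_j).
    The pair score is their sum, i.e. the log of the assumed product.
    """
    # Enumerate every (generated, retrieved) pair, best combined score first.
    pairs = sorted(
        product(range(len(gen_scores)), range(len(ret_scores))),
        key=lambda ij: gen_scores[ij[0]] + ret_scores[ij[1]],
        reverse=True,
    )
    used_gen, used_ret, matching = set(), set(), []
    for i, j in pairs:
        # Accept a pair only if neither passage has been matched yet.
        if i not in used_gen and j not in used_ret:
            matching.append((i, j))
            used_gen.add(i)
            used_ret.add(j)
    return matching


# Made-up log-likelihood scores for three passages of each kind.
gen_scores = [-2.0, -0.5, -1.2]
ret_scores = [-3.0, -0.8, -1.7]
matching = greedy_match(gen_scores, ret_scores)

# Under the additive (log-product) score, greedy selection pairs the
# k-th best generated passage with the k-th best retrieved passage.
gen_rank = sorted(range(len(gen_scores)), key=lambda i: -gen_scores[i])
ret_rank = sorted(range(len(ret_scores)), key=lambda j: -ret_scores[j])
rank_pairs = list(zip(gen_rank, ret_rank))
```

With a factorized score, the greedy procedure reduces to sorting each list independently and pairing by rank, so in this sketch `matching` and `rank_pairs` coincide. This is why two separate re-rankers suffice and no joint pair scorer is needed.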

Paper Structure

This paper contains 13 sections, 1 theorem, 7 equations, 3 figures, 3 tables.

Key Result

Theorem 1

If we assume that the combination relevance score of $p(\textbf{lp}_i,\textbf{rp}_j|\textbf{q})$ can be factorized into the relevance scores of the generated knowledge and the retrieved knowledge, and further, if the number of both types of knowledge is the same, then we can conclude that the optima

Figures (3)

  • Figure 1: Top-3 retrieval exact match score for single knowledge source after reranking knowledge sources.
  • Figure 2: Overview of the BRMGR Framework: It uses an unsupervised method to rerank both LLM-generated and retrieved knowledge for Open-Domain QA. We compute the relevance score of retrieved knowledge based on the log-likelihood score of the query conditioned on retrieved knowledge, and compute the relevance score of generated knowledge via the log-likelihood score of generated knowledge conditioned on query. Finally, the retrieved knowledge and generated knowledge are combined using a greedy matching approach.
  • Figure 3: Comparison of two passage re-ranking approaches on the NQ development set: (1) when generating question tokens conditioned on the passage $p(\textbf{q}|\textbf{lp})$, and (2) when generating passage tokens conditioned on the question $p(\textbf{lp}|\textbf{q})$. Results highlight the usefulness of document generation in generated knowledge for reranking.
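The log-likelihood scoring mentioned in the captions of Figures 2 and 3 can be sketched as follows. This is a hedged illustration only. It assumes per-token log-probabilities have already been obtained from a pre-trained language model (hard-coded here), and it assumes length normalization by averaging, which the text above does not specify. The names `sequence_score` and `rerank` are hypothetical.

```python
def sequence_score(token_log_probs):
    """Average token log-probability of a generated sequence.

    For retrieved passages the tokens would be those of the query q
    (scoring log p(q | rp)); for generated passages they would be those
    of the passage lp (scoring log p(lp | q)). Averaging is an assumption.
    """
    return sum(token_log_probs) / len(token_log_probs)


def rerank(candidates):
    """Return passage ids sorted from most to least relevant.

    candidates maps a passage id to the list of token log-probabilities
    that a language model assigned to the target sequence.
    """
    return sorted(
        candidates,
        key=lambda pid: sequence_score(candidates[pid]),
        reverse=True,
    )


# Made-up token log-probabilities for three retrieved passages.
candidates = {
    "rp_a": [-0.2, -0.4, -0.3],
    "rp_b": [-1.0, -0.8],
    "rp_c": [-0.1, -0.1, -0.2, -0.2],
}
ranked = rerank(candidates)
```

The same `rerank` helper applies to either knowledge source; only the direction of conditioning used to produce the token log-probabilities differs.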

Theorems & Definitions (2)

  • Theorem 1
  • proof