Modeling Uncertainty and Using Post-fusion as Fallback Improves Retrieval Augmented Generation with LLMs

Ye Liu; Semih Yavuz; Rui Meng; Meghana Moorthy; Shafiq Joty; Caiming Xiong; Yingbo Zhou

TL;DR

This paper addresses how to integrate retrieved passages into LLM-based open-domain QA so that the model produces fewer "unknown" outputs and fewer incorrect answers. Within a two-stage retriever-LLM pipeline, it evaluates four integration strategies: two single-round prompts that use chain-of-thought reasoning, and two multi-round feedback methods (Post-Fusion as Fallback and Concatenation as Distiller). Experiments on Natural Questions, TriviaQA, and SQuAD Open show that Post-Fusion improves over naive concatenation and that the multi-round approaches yield substantial gains, with GPT-4 approaching supervised baselines. The work also offers practical guidance on prompt design, passage selection, and decoding for robust retrieval-augmented generation across datasets and settings.

Abstract

The integration of retrieved passages and large language models (LLMs), such as ChatGPT, has significantly contributed to improving open-domain question answering. However, there is still a lack of exploration regarding the optimal approach for incorporating retrieved passages into the answer generation process. This paper aims to fill this gap by investigating different methods of combining retrieved passages with LLMs to enhance answer generation. We begin by examining the limitations of a commonly used concatenation approach. Surprisingly, this approach often results in generating "unknown" outputs, even when the correct document is among the top-k retrieved passages. To address this issue, we explore four alternative strategies for integrating the retrieved passages with the LLMs. These strategies include two single-round methods that utilize chain-of-thought reasoning and two multi-round strategies that incorporate feedback loops. Through comprehensive analyses and experiments, we provide insightful observations on how to effectively leverage retrieved passages to enhance the answer generation capability of LLMs.
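
To make the multi-round idea concrete, here is a minimal Python sketch of a fallback loop in the spirit of Post-Fusion as Fallback: ask once with all passages concatenated, and only if the reply is "unknown", ask once per passage and take a majority vote. The prompt template, the `llm` callable, the `is_unknown` check, and the lowercase normalization are all illustrative assumptions, not the authors' implementation.

```python
from collections import Counter
from typing import Callable, List

UNKNOWN = "unknown"  # assumed marker; the paper defines "unknown output" in its own section


def is_unknown(answer: str) -> bool:
    """Hypothetical check: treat any reply containing 'unknown' as an unknown output."""
    return UNKNOWN in answer.strip().lower()


def concatenation(question: str, passages: List[str], llm: Callable[[str], str]) -> str:
    """Single call: all passages are placed in one prompt (assumed template)."""
    prompt = "\n\n".join(passages) + f"\n\nQuestion: {question}\nAnswer:"
    return llm(prompt)


def post_fusion(question: str, passages: List[str], llm: Callable[[str], str]) -> str:
    """One call per passage, then a majority vote over the non-unknown answers."""
    answers = [llm(f"{p}\n\nQuestion: {question}\nAnswer:") for p in passages]
    known = [a.strip().lower() for a in answers if not is_unknown(a)]
    if not known:
        return UNKNOWN
    # Ties are broken in favour of the answer seen first.
    return Counter(known).most_common(1)[0][0]


def post_fusion_as_fallback(question: str, passages: List[str], llm: Callable[[str], str]) -> str:
    """Try concatenation first; fall back to post-fusion only on an unknown output."""
    answer = concatenation(question, passages, llm)
    if is_unknown(answer):
        return post_fusion(question, passages, llm)
    return answer
```

Here `llm` can be any function that maps a prompt string to a reply string, so the sketch can be exercised with a stub before wiring in a real model. Note that the fallback path returns a lowercased answer because of the normalization used for voting.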

Paper Structure (17 sections, 2 equations, 9 figures, 2 tables)

Introduction
Problem Setup
Methods
Definition of Unknown Output
Single-Round Approaches
Zero-shot Prompt
Few-shot Prompt
Multi-Round Approaches
Experiments
Results
Usage Analysis
Effect of different Top-k passages from the retriever
Effect of different Decoding Strategies
Effect of the order of the gold passage
Related Work
...and 2 more sections

Figures (9)

Figure 1: Top: Illustration of the Concatenation vs. Post-Fusion strategies. Bottom-a: percentage of unknown responses using the Concatenation strategy. Bottom-b: as the number of retrieved passages varies, (green) the percentage of unknown responses and (red) the error rate of majority voting (cases where the correct answer is in the answer pool but the majority selects a wrong answer).
Figure 2: Diagram of Post-Fusion as the Fallback (top) and Concatenation as the Distiller (bottom).
Figure 3: The token usage of different approaches using top-5 passages.
Figure 4: The answer EM performance with the increase of Top-k retrieved passages.
Figure 5: The answer EM performance with the increase of the number of decoded outputs.
...and 4 more figures
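
The two quantities plotted in Figure 1 (the percentage of unknown responses and the majority-voting error rate) can be illustrated with a short Python sketch. The function names, the exact-match lowercase normalization, and the choice to divide the error count by the total number of questions are assumptions made for illustration; the caption does not specify the denominator.

```python
from collections import Counter
from typing import List


def unknown_rate(responses: List[str], unknown: str = "unknown") -> float:
    """Fraction of responses that are exactly the (assumed) unknown marker."""
    if not responses:
        return 0.0
    return sum(r.strip().lower() == unknown for r in responses) / len(responses)


def majority_vote_error_rate(answer_pools: List[List[str]], gold_answers: List[str]) -> float:
    """Fraction of questions where the gold answer is in the pool but the majority picks another.

    Assumption: the rate is normalized by the total number of questions.
    """
    if not gold_answers:
        return 0.0
    errors = 0
    for pool, gold in zip(answer_pools, gold_answers):
        normalized = [a.strip().lower() for a in pool]
        target = gold.strip().lower()
        if normalized and target in normalized:
            majority = Counter(normalized).most_common(1)[0][0]
            if majority != target:
                errors += 1
    return errors / len(gold_answers)
```

A question whose pool does not contain the gold answer at all is not counted as a voting error here, since in that case no aggregation rule could have recovered the right answer.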
