Modeling Uncertainty and Using Post-fusion as Fallback Improves Retrieval Augmented Generation with LLMs
Ye Liu, Semih Yavuz, Rui Meng, Meghana Moorthy, Shafiq Joty, Caiming Xiong, Yingbo Zhou
TL;DR
This paper addresses how to integrate retrieved passages into LLM-based open-domain QA so that the model produces fewer "unknown" outputs and fewer incorrect answers. It evaluates four integration strategies within a two-stage retriever-LLM pipeline: two single-round prompts that leverage chain-of-thought reasoning, and two multi-round feedback methods (Post-Fusion as Fallback and Concatenation as Distiller). Experiments on Natural Questions, TriviaQA, and SQuAD Open show that Post-Fusion improves over naive concatenation and that the multi-round approaches yield substantial gains, with GPT-4 approaching supervised baselines. The work offers practical guidance on prompt design, passage selection, and decoding for robust retrieval-augmented generation across datasets and settings.
Abstract
The integration of retrieved passages with large language models (LLMs), such as ChatGPT, has significantly improved open-domain question answering. However, the optimal approach for incorporating retrieved passages into the answer generation process remains underexplored. This paper aims to fill this gap by investigating different methods of combining retrieved passages with LLMs to enhance answer generation. We begin by examining the limitations of a commonly used concatenation approach. Surprisingly, this approach often generates "unknown" outputs, even when the correct document is among the top-k retrieved passages. To address this issue, we explore four alternative strategies for integrating the retrieved passages with the LLMs: two single-round methods that utilize chain-of-thought reasoning and two multi-round strategies that incorporate feedback loops. Through comprehensive analyses and experiments, we provide observations on how to effectively leverage retrieved passages to enhance the answer generation capability of LLMs.
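To make the fallback idea concrete, here is a minimal sketch of one plausible reading of "post-fusion as fallback". It is not the authors' implementation. It assumes the first round concatenates all passages into one prompt, and that an "unknown" reply triggers a second round that queries each passage separately and takes a majority vote over the answers. The `llm` callable, the prompt wording, and all function names are hypothetical.

```python
from collections import Counter
from typing import Callable, Sequence

UNKNOWN = "unknown"  # assumed sentinel the model is asked to emit when it cannot answer


def concat_answer(llm: Callable[[str], str], question: str, passages: Sequence[str]) -> str:
    """Round 1: put every retrieved passage into a single prompt."""
    context = "\n".join(f"Passage: {p}" for p in passages)
    prompt = f"{context}\nQuestion: {question}\nAnswer (reply 'unknown' if unsure):"
    return llm(prompt).strip()


def post_fusion_answer(llm: Callable[[str], str], question: str, passages: Sequence[str]) -> str:
    """Round 2: query each passage on its own, then fuse by majority vote."""
    answers = []
    for p in passages:
        prompt = f"Passage: {p}\nQuestion: {question}\nAnswer (reply 'unknown' if unsure):"
        reply = llm(prompt).strip()
        if reply.lower() != UNKNOWN:
            answers.append(reply)
    if not answers:
        return UNKNOWN
    # most_common breaks ties by first occurrence, i.e. retrieval rank order here
    return Counter(answers).most_common(1)[0][0]


def answer_with_fallback(llm: Callable[[str], str], question: str, passages: Sequence[str]) -> str:
    """Use concatenation first; fall back to post-fusion only on 'unknown'."""
    first = concat_answer(llm, question, passages)
    if first.lower() != UNKNOWN:
        return first
    return post_fusion_answer(llm, question, passages)
```

Under these assumptions the second round costs one extra LLM call per passage, but only for questions where the concatenated prompt fails. Exact string matching in the vote is a simplification; a real pipeline would likely normalize answers before counting them.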
