Controlling Risk of Retrieval-augmented Generation: A Counterfactual Prompting Framework
Lu Chen, Ruqing Zhang, Jiafeng Guo, Yixing Fan, Xueqi Cheng
TL;DR
This work tackles risk control in retrieval-augmented generation (RAG) by introducing RC-RAG, a task in which the system decides whether to keep a RAG output or abstain from answering, based on its predicted confidence given the retrieved results. It develops a counterfactual prompting framework with prompt generation, judgment, and fusion modules that simulates adverse retrieval scenarios (poor retrieval quality and poor usage of the retrieved results) to gauge answer reliability. A new RC-RAG benchmark (RC-TQ, RC-NQ) with four risk metrics (risk, carefulness, alignment, coverage) supports zero-shot evaluation and shows that the framework reduces risk and improves carefulness across two backbones and two datasets, with case studies illustrating its interpretability. The work offers a practical approach to risk-aware RAG deployment and opens avenues for further work on external-knowledge uncertainty and on more efficient prompt-based risk control.
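To make the four metrics named above concrete, here is a minimal sketch of how they could be computed from keep/abstain decisions. The definitions in the code are assumptions, since this summary does not state them: risk is taken as the share of kept answers that are wrong, carefulness as the share of wrong answers that were discarded, alignment as the share of decisions that match answer correctness, and coverage as the share of answers kept. The names `Outcome` and `risk_metrics` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Outcome:
    kept: bool     # True if the system kept the answer, False if it abstained
    correct: bool  # True if the generated answer matches the reference


def risk_metrics(outcomes: List[Outcome]) -> Dict[str, float]:
    """Compute four risk-related metrics under the assumed definitions above."""
    total = len(outcomes)
    kept_correct = sum(o.kept and o.correct for o in outcomes)
    kept_wrong = sum(o.kept and not o.correct for o in outcomes)
    dropped_wrong = sum((not o.kept) and (not o.correct) for o in outcomes)

    def ratio(numerator: int, denominator: int) -> float:
        # Return 0.0 instead of raising when a denominator is empty.
        return numerator / denominator if denominator else 0.0

    return {
        # Assumed: fraction of kept answers that are wrong.
        "risk": ratio(kept_wrong, kept_correct + kept_wrong),
        # Assumed: fraction of wrong answers that were discarded.
        "carefulness": ratio(dropped_wrong, kept_wrong + dropped_wrong),
        # Assumed: fraction of decisions that agree with correctness.
        "alignment": ratio(kept_correct + dropped_wrong, total),
        # Assumed: fraction of all answers that were kept.
        "coverage": ratio(kept_correct + kept_wrong, total),
    }
```

Under these assumed definitions, a system that abstains on everything has zero coverage, which is why risk and carefulness are best read alongside coverage and alignment rather than in isolation.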
Abstract
Retrieval-augmented generation (RAG) has emerged as a popular solution to mitigate the hallucination issues of large language models. However, existing studies on RAG seldom address the issue of predictive uncertainty, i.e., how likely it is that a RAG model's prediction is incorrect, resulting in uncontrollable risks in real-world applications. In this work, we emphasize the importance of risk control, ensuring that RAG models proactively refuse to answer questions with low confidence. Our research identifies two critical latent factors affecting RAG's confidence in its predictions: the quality of the retrieved results and the manner in which these results are utilized. To guide RAG models in assessing their own confidence based on these two latent factors, we develop a counterfactual prompting framework that induces the models to alter these factors and analyzes the effect on their answers. We also introduce a benchmarking procedure to collect answers with the option to abstain, facilitating a series of experiments. For evaluation, we introduce several risk-related metrics, and the experimental results demonstrate the effectiveness of our approach. Our code and benchmark dataset are available at https://github.com/ict-bigdatalab/RC-RAG.
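The idea of altering the two latent factors and observing the effect on the answer can be illustrated with a small sketch. This is not the authors' implementation: the prompt wording, the exact-match comparison, the rule "abstain if the answer changes under either counterfactual", and the names `QUALITY_PROMPT`, `USAGE_PROMPT`, and `controlled_answer` are all assumptions made for illustration. The `llm` argument is any callable that maps a prompt string to an answer string.

```python
from typing import Callable, List, Optional

# Hypothetical counterfactual prompts, one per latent factor.
QUALITY_PROMPT = (
    "Question: {question}\nRetrieved passages:\n{context}\n"
    "Previous answer: {answer}\n"
    "Suppose the retrieved passages are unreliable. Answer the question again."
)
USAGE_PROMPT = (
    "Question: {question}\nRetrieved passages:\n{context}\n"
    "Previous answer: {answer}\n"
    "Suppose you misread the passages. Answer the question again."
)


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def controlled_answer(
    question: str,
    passages: List[str],
    answer: str,
    llm: Callable[[str], str],
) -> Optional[str]:
    """Return the answer if it survives both counterfactuals, else None (abstain)."""
    context = "\n".join(passages)
    for template in (QUALITY_PROMPT, USAGE_PROMPT):
        prompt = template.format(question=question, context=context, answer=answer)
        regenerated = llm(prompt)
        # Assumed judgment rule: a changed answer signals low confidence.
        if normalize(regenerated) != normalize(answer):
            return None
    return answer
```

With a stub such as `llm = lambda prompt: "Paris"`, the call `controlled_answer("Capital of France?", ["Paris is the capital of France."], "Paris", llm)` keeps the answer, while a stub that changes its answer under either prompt makes the function abstain.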
