Controlling Risk of Retrieval-augmented Generation: A Counterfactual Prompting Framework

Lu Chen, Ruqing Zhang, Jiafeng Guo, Yixing Fan, Xueqi Cheng

TL;DR

This work tackles risk control in retrieval-augmented generation (RAG) by introducing RC-RAG, a task in which the system decides whether to keep a RAG output or abstain, based on its predicted confidence given the retrieved data. It develops a counterfactual prompting framework, comprising prompting generation, judgment, and fusion modules, that simulates adverse retrieval scenarios (poor retrieval quality and poor usage of retrieved results) to gauge answer reliability. A new RC-RAG benchmark (RC-TQ, RC-NQ) with four risk metrics (risk, carefulness, alignment, coverage) supports zero-shot evaluation. Experiments show that the framework reduces risk and improves carefulness across two backbones and two datasets, and case studies illustrate its interpretability. The work offers a practical approach to risk-aware RAG deployment and opens avenues for further study of external-knowledge uncertainty and of efficiency improvements in prompt-based risk control.
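The summary names four risk metrics but does not define them here. The following is a minimal sketch of how such metrics could be computed from keep/discard decisions, assuming definitions in the style of selective prediction: risk as the share of kept answers that are wrong, carefulness as the share of wrong answers that are discarded, alignment as the share of decisions that match correctness, and coverage as the share of answers kept. These definitions, the `Outcome` type, and the `risk_metrics` function are illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Outcome:
    correct: bool  # whether the RAG answer matches the reference answer
    kept: bool     # True if the risk controller kept it, False if it abstained


def risk_metrics(outcomes: List[Outcome]) -> Dict[str, float]:
    """Compute the four metrics under the ASSUMED definitions in the lead-in."""
    n = len(outcomes)
    n_kept = sum(1 for o in outcomes if o.kept)
    n_wrong = sum(1 for o in outcomes if not o.correct)
    kept_wrong = sum(1 for o in outcomes if o.kept and not o.correct)
    discarded_wrong = sum(1 for o in outcomes if not o.kept and not o.correct)
    # A decision is "aligned" when correct answers are kept and wrong ones dropped.
    aligned = sum(1 for o in outcomes if o.correct == o.kept)
    return {
        "risk": kept_wrong / n_kept if n_kept else 0.0,
        "carefulness": discarded_wrong / n_wrong if n_wrong else 0.0,
        "alignment": aligned / n if n else 0.0,
        "coverage": n_kept / n if n else 0.0,
    }
```

Under these assumed definitions, a controller that abstains on everything has zero risk and zero coverage, which is why the metrics are meant to be read together rather than individually.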

Abstract

Retrieval-augmented generation (RAG) has emerged as a popular solution to mitigate the hallucination issues of large language models. However, existing studies on RAG seldom address the issue of predictive uncertainty, i.e., how likely it is that a RAG model's prediction is incorrect, resulting in uncontrollable risks in real-world applications. In this work, we emphasize the importance of risk control, ensuring that RAG models proactively refuse to answer questions with low confidence. Our research identifies two critical latent factors affecting RAG's confidence in its predictions: the quality of the retrieved results and the manner in which these results are utilized. To guide RAG models in assessing their own confidence based on these two latent factors, we develop a counterfactual prompting framework that induces the models to alter these factors and analyzes the effect on their answers. We also introduce a benchmarking procedure to collect answers with the option to abstain, facilitating a series of experiments. For evaluation, we introduce several risk-related metrics and the experimental results demonstrate the effectiveness of our approach. Our code and benchmark dataset are available at https://github.com/ict-bigdatalab/RC-RAG.
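To make the idea of counterfactual prompting concrete, here is a minimal sketch of a keep-or-abstain decision. It challenges the initial answer once per latent factor (retrieval quality, usage of retrieved results) and keeps the answer only if it survives both challenges. The prompt wording, the exact-match comparison, the "keep only if stable under both" fusion rule, and the `generate` callable (a stand-in for any LLM call) are all assumptions made for illustration; the authors' actual prompts and modules are in their repository.

```python
from typing import Callable, Dict, List


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial variations compare equal."""
    return " ".join(answer.lower().split())


def keep_or_abstain(
    question: str,
    passages: List[str],
    generate: Callable[[str], str],  # hypothetical LLM wrapper: prompt -> answer
) -> Dict[str, object]:
    context = "\n".join(passages)
    base = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    initial = generate(base)

    # Hypothetical counterfactual challenges, one per latent factor.
    challenges = {
        "quality": "Assume the retrieved passages are of low quality.",
        "usage": "Assume the retrieved passages were used incorrectly.",
    }
    revised = {}
    for factor, challenge in challenges.items():
        prompt = f"{base} {initial}\n\n{challenge} Reconsider and give a final answer:"
        revised[factor] = generate(prompt)

    # Assumed fusion rule: keep only if the answer is unchanged under both challenges.
    stable = all(normalize(a) == normalize(initial) for a in revised.values())
    return {"answer": initial if stable else None, "keep": stable, "revised": revised}
```

In practice `generate` would wrap a model API, and exact string matching would likely be replaced by a more tolerant judgment step; both choices are left open in this sketch.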

Paper Structure (23 sections, 3 figures, 7 tables)

This paper contains 23 sections, 3 figures, 7 tables.

Figures (3)

  • Figure 1: Illustration of risk control for RAG. Given a question, a risk-controlled RAG model is expected to provide the correct answer if it has knowledge of the question, and otherwise to refuse to answer.
  • Figure 2: Overview of the counterfactual prompting framework for RAG, in which counterfactual (CF) prompts challenge the initial RAG answer with respect to the quality or usage of the retrieved results. The final judgment is derived from both aspects. See the method section for details.
  • Figure 3: Change in the risk-related metrics as the number of iterations increases.