Accommodate Knowledge Conflicts in Retrieval-augmented LLMs: Towards Robust Response Generation in the Wild

Jiatai Wang, Zhiwei Xu, Di Jin, Xuewen Yang, Tao Li

TL;DR

This work addresses knowledge conflicts in retrieval-augmented LLMs by framing the problem with information-theoretic uncertainty measures and introducing Swin-VIB, a sliding-window variational information bottleneck framework that adaptively selects external context. The core insight is that a larger information difference between conflicting and supplementary information, quantified as $\Delta I = I_c - I_s$, reduces the generation uncertainty $\mathbb{H}(O\mid R,Q)$ and so guides the model toward more reliable answers. Swin-VIB trains per-layer bottlenecks on windows of retrieved context and aggregates their predictions to keep only the windows with a high information difference, enabling robust response generation with minimal overhead. Experiments on ConflictQA, DRUID, and TruthfulQA show improved multiple-choice accuracy and open-ended QA exact match (EM), together with lower instance- and corpus-level uncertainty, supporting the proposed theory and its practical value for real-world RAG systems.
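The window-level filtering described above can be pictured with a minimal sketch. It is not the authors' implementation: the function names `sliding_windows` and `select_windows`, the `score_fn` callback, and the `threshold` parameter are all hypothetical, and the toy scorer merely stands in for the aggregated bottleneck predictions that Swin-VIB would supply.

```python
from typing import Callable, List, Sequence


def sliding_windows(tokens: Sequence[str], size: int, stride: int) -> List[List[str]]:
    """Split tokens into overlapping windows; the last window may be shorter."""
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    windows: List[List[str]] = []
    start = 0
    while start < len(tokens):
        windows.append(list(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # this window already reaches the end of the context
        start += stride
    return windows


def select_windows(
    tokens: Sequence[str],
    size: int,
    stride: int,
    score_fn: Callable[[List[str]], float],
    threshold: float,
) -> List[List[str]]:
    """Keep only windows whose score reaches the threshold.

    score_fn is a hypothetical stand-in for a learned per-window estimate
    of the information difference; it is an assumption of this sketch.
    """
    return [w for w in sliding_windows(tokens, size, stride) if score_fn(w) >= threshold]


# Toy scorer (assumption): fraction of tokens that belong to a keyword set.
KEYWORDS = {"d", "e"}


def toy_score(window: List[str]) -> float:
    return sum(tok in KEYWORDS for tok in window) / len(window)


context = "a b c d e".split()
kept = select_windows(context, size=3, stride=2, score_fn=toy_score, threshold=0.5)
```

In this sketch the scorer is deliberately kept behind a callback, so a trained model could replace `toy_score` without touching the windowing logic.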

Abstract

The proliferation of large language models (LLMs) has significantly advanced intelligent systems. Unfortunately, LLMs often face knowledge conflicts between internal memory and retrieved external information, arising from misinformation, biases, or outdated knowledge. These conflicts undermine response reliability and introduce uncertainty in decision-making. In this work, we analyze how LLMs navigate knowledge conflicts from an information-theoretic perspective and reveal that when conflicting and supplementary information exhibit significant differences, LLMs confidently resolve their preferences and alleviate the uncertainty during their response generation. When this difference is ambiguous, LLMs experience considerable uncertainty about their generation. Based on this insight, we propose Swin-VIB, a novel framework that integrates a pipeline of variational information bottleneck models to adapt the retrieved information difference, facilitating robust response generation of LLMs even in conflicting contexts. Extensive experiments confirm our theoretical analysis and demonstrate the performance of Swin-VIB. Notably, Swin-VIB outperforms all competitive baselines in terms of the accuracy of the multiple-choice task, while improving the EM values in the open-ended QA task by at least 11.14%.
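To make the notion of generation uncertainty concrete, the following sketch computes the Shannon entropy of a discrete distribution over candidate answers. The probabilities are invented for illustration and the helper name `entropy_bits` is hypothetical; the paper's own uncertainty definition may condition on more than this toy example does.

```python
import math
from typing import Sequence


def entropy_bits(probs: Sequence[float]) -> float:
    """Shannon entropy in bits of a discrete distribution.

    The input is normalized first, so unnormalized non-negative weights work too.
    """
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    total = sum(probs)
    if total <= 0:
        raise ValueError("at least one probability must be positive")
    # Zero-probability outcomes contribute nothing to the entropy.
    return -sum((p / total) * math.log2(p / total) for p in probs if p > 0)


# Illustrative numbers only (assumption): a model torn between two answers
# versus a model that clearly prefers one of them.
ambiguous = entropy_bits([0.5, 0.5])
confident = entropy_bits([0.9, 0.1])
```

A distribution that concentrates its mass on one answer yields a lower entropy than an evenly split one, which matches the abstract's description of a model that "confidently resolves" its preference.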

Paper Structure

This paper contains 51 sections, 34 equations, 15 figures, 8 tables, 1 algorithm.

Figures (15)

  • Figure 1: Illustration of knowledge conflict in RAG.
  • Figure 2: Relationship between uncertainty and the information difference.
  • Figure 3: An overview of response generation with Swin-VIB.
  • Figure 4: Bottleneck model structure and training objective.
  • Figure 5: Prompt templates in conflicting scenarios.
  • ...and 10 more figures

Theorems & Definitions (1)

  • Definition 1: Uncertainty of response generation