Tug-of-War Between Knowledge: Exploring and Resolving Knowledge Conflicts in Retrieval-Augmented Language Models

Zhuoran Jin, Pengfei Cao, Yubo Chen, Kang Liu, Xiaojian Jiang, Jiexin Xu, Qiuxia Li, Jun Zhao

TL;DR

This work investigates knowledge conflicts in retrieval-augmented language models (RALMs), defining an evaluation framework that measures correctness, faithfulness, and memorization across open-domain, entity-centric, and multi-hop QA settings. It reveals robust phenomena, including a Dunning-Kruger effect in capable models, availability bias toward common knowledge, majority-rule tendencies in evidence selection, and confirmation bias when external evidence aligns with internal memory. To address these conflicts, the authors introduce Conflict-Disentangle Contrastive Decoding (CD2), a confidence-calibration method that improves resilience to conflicting internal/external knowledge without retraining; it also leverages fact-aware instruction tuning to distinguish truthful from misleading evidence. Empirical results show CD2 significantly improves recall under conflict scenarios, including substantial gains when external evidence conflicts with or supports internal memory, suggesting practical value for deploying RALMs in dynamic information environments. The findings advance understanding of how to mitigate knowledge conflicts in RALMs and provide a concrete, scalable approach to improve trustworthiness and reliability in retrieval-augmented reasoning systems.

Abstract

Retrieval-augmented language models (RALMs) have demonstrated significant potential in refining and expanding their internal memory by retrieving evidence from external sources. However, RALMs will inevitably encounter knowledge conflicts when integrating their internal memory with external sources. Knowledge conflicts can ensnare RALMs in a tug-of-war between knowledge, limiting their practical applicability. In this paper, we focus on exploring and resolving knowledge conflicts in RALMs. First, we present an evaluation framework for assessing knowledge conflicts across various dimensions. Then, we investigate the behavior and preference of RALMs from the following two perspectives: (1) Conflicts between internal memory and external sources: We find that stronger RALMs exhibit the Dunning-Kruger effect, persistently favoring their faulty internal memory even when correct evidence is provided. In addition, RALMs exhibit an availability bias towards common knowledge; (2) Conflicts between truthful, irrelevant and misleading evidence: We reveal that RALMs follow the principle of majority rule, tending to trust evidence that appears more frequently. Moreover, we find that RALMs exhibit confirmation bias and are more willing to choose evidence that is consistent with their internal memory. To address the challenge of knowledge conflicts, we propose a method called Conflict-Disentangle Contrastive Decoding (CD2) to better calibrate the model's confidence. Experimental results demonstrate that CD2 can effectively resolve knowledge conflicts in RALMs.
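
The abstract names contrastive decoding as the basis of CD2 but does not give its formulation. As a rough illustration of the general idea only, the sketch below implements generic contrastive decoding for a single next-token choice: it picks the token that maximizes the gap between an "expert" and an "amateur" log-probability, restricted to tokens the expert itself finds plausible. The function names, the `alpha` threshold, and the toy logits are all assumptions for illustration; this is not CD2's actual scoring rule, and which two distributions CD2 contrasts is not stated here.

```python
import math


def log_softmax(logits):
    """Convert raw logits to log-probabilities in a numerically stable way."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def contrastive_next_token(expert_logits, amateur_logits, alpha=0.1):
    """Pick the next token index by generic contrastive decoding.

    Hypothetical helper: `expert_logits` and `amateur_logits` are two score
    vectors over the same vocabulary, and `alpha` (0 < alpha < 1) is an
    assumed plausibility threshold.
    """
    expert_lp = log_softmax(expert_logits)
    amateur_lp = log_softmax(amateur_logits)

    # Plausibility constraint: keep only tokens whose expert probability is
    # at least alpha times the expert's most likely token.
    cutoff = math.log(alpha) + max(expert_lp)

    best_index, best_score = None, -math.inf
    for index, (e, a) in enumerate(zip(expert_lp, amateur_lp)):
        if e < cutoff:
            continue
        score = e - a  # contrastive score: expert minus amateur
        if score > best_score:
            best_index, best_score = index, score
    return best_index


# Toy example: the expert slightly prefers token 0, but the amateur prefers
# token 0 much more strongly, so the contrast favors token 1.
choice = contrastive_next_token([2.0, 1.9, -5.0], [2.0, 0.0, 3.0])
print(choice)
```

The plausibility cutoff is what keeps the contrast from promoting tokens that the expert considers very unlikely merely because the amateur considers them even less likely.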

Paper Structure

This paper contains 27 sections, 2 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: An example of knowledge conflicts.
  • Figure 2: Confidence score distributions of different models under knowledge conflicts between internal memory and external sources on NQ. w/ C denotes providing conflicting evidence to the model. The wider the violin plot, the denser the data points. The larger the log likelihood, the higher the confidence score.
  • Figure 3: Recall under knowledge conflicts with various entity popularities on PopQA. GR denotes the gold references, CR denotes the conflicting references, and OM denotes the internal memory predictions.
  • Figure 4: Recall of LLaMA2 7B with various amounts of evidence and different conflicting ratios on NQ.
  • Figure 5: Frequency of choosing evidence aligning with or conflicting with internal memory on NQ.
  • ...and 2 more figures
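
Several of the figures report recall against different reference sets (gold references, conflicting references, and internal memory predictions). The exact matching rule is not given in this summary, so the following is a minimal sketch that assumes a simple normalized substring match; the `recall` function, the normalization, and the sample data are hypothetical and only illustrate how one output can be scored against more than one reference set.

```python
def normalize(text):
    """Lowercase and collapse whitespace so matching ignores trivial differences."""
    return " ".join(text.lower().split())


def recall(outputs, references):
    """Fraction of outputs that contain at least one of their reference answers.

    Assumed definition: `outputs` is a list of model answers and `references`
    is a parallel list of lists of acceptable answer strings.
    """
    if not outputs:
        return 0.0
    hits = 0
    for output, refs in zip(outputs, references):
        normalized_output = normalize(output)
        if any(normalize(ref) in normalized_output for ref in refs):
            hits += 1
    return hits / len(outputs)


# Made-up sample data, not taken from the paper.
outputs = ["The capital is Paris.", "It was written in 1999.", "Probably Berlin"]
gold_refs = [["Paris"], ["2001"], ["Berlin"]]
conflicting_refs = [["Lyon"], ["1999"], ["Munich"]]

gold_recall = recall(outputs, gold_refs)                # matches gold answers
conflicting_recall = recall(outputs, conflicting_refs)  # matches conflicting answers
print(gold_recall, conflicting_recall)
```

Scoring the same outputs against both reference sets makes it possible to see whether a model followed the correct answer or the conflicting evidence on each example.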