Discerning and Resolving Knowledge Conflicts through Adaptive Decoding with Contextual Information-Entropy Constraint

Xiaowei Yuan, Zhao Yang, Yequan Wang, Shengping Liu, Jun Zhao, Kang Liu

TL;DR

The paper tackles knowledge conflicts, which arise when external contextual knowledge clashes with the parametric knowledge of a large language model. It introduces COIECD, an adaptive decoding framework that detects conflicts via a contextual information-entropy constraint and resolves them with token-specific strategies guided by a contextual contrastive term. The approach builds on the Stable Entropy Hypothesis and the Locally Typical Set to discriminate conflicting tokens, and uses a softmax-based constraint to modulate decoding. This improves faithfulness on conflicting contexts while preserving performance on non-conflicting data. Empirically, COIECD shows robust improvements across realistic and synthetic datasets (NQ, SQuAD, StrategyQA, Counterfacts) and model families (LLaMA2, OPT, FLAN-T5), supported by ablations and hyperparameter analyses.

Abstract

Large language models internalize enormous parametric knowledge during pre-training. Concurrently, realistic applications necessitate external contextual knowledge to aid models on the underlying tasks. This raises a crucial dilemma known as knowledge conflicts, where the contextual knowledge clashes with the parametric knowledge. However, existing decoding works are specialized in resolving knowledge conflicts and could inadvertently deteriorate performance in the absence of conflicts. In this paper, we propose an adaptive decoding method, termed contextual information-entropy constraint decoding (COIECD), to discern whether knowledge conflicts occur and to resolve them. It can improve the model's faithfulness to conflicting context while maintaining high performance on non-conflicting contexts. Our experiments show that COIECD exhibits strong performance and robustness over knowledge conflicts in realistic datasets. Code is available.

Paper Structure (43 sections, 3 theorems, 24 equations, 9 figures, 14 tables)

This paper contains 43 sections, 3 theorems, 24 equations, 9 figures, 14 tables.

Key Result

Proposition 3.1

The information content of a random variable is quantified as its negative log-probability \cite{typicalsampling}. Let the information content of token $y_t$ be $I(y_t)=-\log p(y_t\mid \boldsymbol{x},\boldsymbol{c},\boldsymbol{y}_{<t})$. We define an information-entropy shift as: $I(y_t) - \mathcal{H}_{
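
As a rough illustration of the quantities named in Proposition 3.1, the sketch below computes a token's information content and its difference from a distribution's entropy. The source text cuts off before the entropy term is fully specified, so the choice of distribution for the entropy is an assumption: by default the sketch uses the same context-conditioned distribution, and the hypothetical `reference_probs` argument allows a different one. All function names are illustrative and not taken from the paper's code.

```python
import math


def information_content(probs, token_id):
    """I(y_t) = -log p(y_t | ...), in nats, for one token of a distribution."""
    return -math.log(probs[token_id])


def entropy(probs):
    """Shannon entropy (nats) of a next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def information_entropy_shift(context_probs, token_id, reference_probs=None):
    """Information content of the token minus an entropy term.

    Assumption: the entropy is taken over `context_probs` unless a
    separate `reference_probs` distribution is supplied.
    """
    ref = context_probs if reference_probs is None else reference_probs
    return information_content(context_probs, token_id) - entropy(ref)


if __name__ == "__main__":
    toy = [0.7, 0.2, 0.1]  # toy next-token distribution, not real model output
    for tok in range(len(toy)):
        print(tok, information_entropy_shift(toy, tok))
```

When the entropy is taken over the same distribution, the shift is negative for tokens that are more probable than typical and positive for less probable ones, and its expectation under that distribution is zero.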

Figures (9)

  • Figure 1: Illustration of a knowledge conflict. Because the model is biased towards its outdated parametric knowledge, it fails to ground its answer in the latest context, which conflicts with the LM's knowledge.
  • Figure 2: Illustration of conflicting and non-conflicting scenarios. Existing methods handle conflicts well but struggle with non-conflicting contexts. The table below shows the EM scores of existing conflict-solving methods and of regular decoding on data with varying conflict ratios. Numbers in brackets give the difference between Regular and the current method. More detailed analyses are in Appendix \ref{appendix:analysis}.
  • Figure 3: Above: Based on the contextual information-entropy constraint, tokens that fall into either the lower or upper violation zone of the constraint are typically associated with conflicts. Below: Distinct decoding strategies are employed for conflicting and non-conflicting tokens.
  • Figure 4: Realistic conflicts with Conf. data on NQ
  • Figure 5: Synthetic conflicts with Counterfacts
  • ...and 4 more figures
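
The two-stage idea in the Figure 3 caption (flag tokens that fall outside the constraint, then decode conflicting and non-conflicting tokens differently) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: the bounds `lower` and `upper`, the weight `alpha`, the use of the context-conditioned entropy, and the specific scoring rule (a base log-probability plus a contrastive term between the with-context and without-context distributions) are all hypothetical choices made for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


def adaptive_scores(logits_ctx, logits_noctx, lower, upper, alpha):
    """Return (log-probabilities after adaptive rescoring, conflict flags).

    Assumptions (not confirmed by the source text):
      * a token is flagged as conflicting when its information-entropy
        shift leaves the interval [lower, upper];
      * conflicting tokens start from the with-context log-probability,
        non-conflicting tokens from the without-context one;
      * both receive alpha times a contrastive term
        log p(with context) - log p(without context).
    """
    logp_ctx = log_softmax(logits_ctx)
    logp_no = log_softmax(logits_noctx)
    entropy_ctx = -sum(math.exp(lp) * lp for lp in logp_ctx)

    scores, flags = [], []
    for lc, ln in zip(logp_ctx, logp_no):
        shift = -lc - entropy_ctx            # information content minus entropy
        is_conflict = shift < lower or shift > upper
        contrast = lc - ln                   # contextual contrastive term
        base = lc if is_conflict else ln
        scores.append(base + alpha * contrast)
        flags.append(is_conflict)
    # Renormalize with a softmax so the result is a valid distribution.
    return log_softmax(scores), flags
```

With identical logits for both passes the contrastive term vanishes, so the output reduces to the ordinary distribution. That is the behavior one would want on non-conflicting inputs.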

Theorems & Definitions (6)

  • Proposition 3.1: Bound on information-entropy shift
  • Definition B.1: Locally Typical Set
  • Proposition C.2: Bound on Entropy Shift
  • proof
  • Proposition C.3: Bound on information-entropy shift
  • proof