
Diagnosing Retrieval Bias Under Multiple In-Context Knowledge Updates in Large Language Models

Boyu Qiao, Sean Guo, Xian Yang, Kun Li, Wei Zhou, Songlin Hu, Yunya Song

Abstract

LLMs are widely used in knowledge-intensive tasks where the same fact may be revised multiple times within the context. Unlike the one-shot updates or single conflicts studied in prior work, multi-update scenarios contain multiple historically valid versions that compete at retrieval, yet they remain underexplored. This challenge resembles the AB-AC interference paradigm in cognitive psychology: when the same cue A is successively associated with B and then C, the old and new associations compete during retrieval, leading to bias. Inspired by this, we introduce a Dynamic Knowledge Instance (DKI) evaluation framework that models multiple updates to the same fact as a cue paired with a sequence of updated values, and we assess models via endpoint probing of the earliest (initial) and latest (current) states. Across diverse LLMs, we observe that retrieval bias intensifies as the number of updates increases: earliest-state accuracy stays high while latest-state accuracy drops substantially. Diagnostic analyses of attention, hidden-state similarity, and output logits further reveal that these signals become flatter and only weakly discriminative on errors, providing little stable basis for identifying the latest update. Finally, cognitively inspired heuristic intervention strategies yield only modest gains and do not eliminate the bias. Our results reveal a persistent challenge in tracking and following knowledge updates in long contexts.
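
The DKI idea described in the abstract (one cue, an ordered sequence of values, and probes at the two endpoints) can be illustrated with a minimal sketch. The class name `DKI`, the prompt wording, and the example cue and values below are hypothetical and chosen only for illustration; they are not the paper's actual templates or data.

```python
from dataclasses import dataclass


@dataclass
class DKI:
    """Hypothetical container: one cue and its values, ordered oldest to newest."""
    cue: str
    values: list[str]

    def context(self) -> str:
        # One statement per update, in chronological order (assumed format).
        return "\n".join(
            f"Update {i}: {self.cue} is {v}."
            for i, v in enumerate(self.values, start=1)
        )

    def probes(self) -> dict[str, tuple[str, str]]:
        # Endpoint probing: ask only about the first and the last state.
        # Each entry maps a probe name to (question, gold answer).
        return {
            "earliest": (f"What was the earliest value of {self.cue}?", self.values[0]),
            "latest": (f"What is the latest value of {self.cue}?", self.values[-1]),
        }


# Toy instance with three updates to the same cue.
dki = DKI(cue="the project lead", values=["Alice", "Bob", "Carol"])
print(dki.context())
for name, (question, gold) in dki.probes().items():
    print(name, "|", question, "| gold:", gold)
```

In this sketch the intermediate value ("Bob") is never probed; it only acts as a competing candidate at retrieval time, which mirrors the interference setting the abstract describes.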

Paper Structure (33 sections, 10 equations, 15 figures, 3 tables)


Figures (15)

  • Figure 1: Overview of the DKI evaluation framework, internal-signal diagnostics, and cognitively inspired intervention strategies for multi-update knowledge retrieval.
  • Figure 2: Endpoint-probing performance on synthetic DKIs for the LLaMA-3.1 and Qwen-2.5/3 model families. Left/middle: earliest-state and latest-state accuracy. Right: earliest-latest accuracy gap (ELAG), defined as $Acc_{earliest} - Acc_{latest}$.
  • Figure 3: Performance on real-world DKIs.
  • Figure 4: Layer-wise and head-wise attention scores over candidate positions (x-axis) on wrong samples.
  • Figure 5: Match rate across layers (x-axis).
  • ...and 10 more figures
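
The earliest-latest accuracy gap from the Figure 2 caption is simply the difference of two accuracies. A minimal sketch follows, assuming case-insensitive exact match as the scoring rule (an assumption; the listing above does not specify the paper's matching criterion). The predictions are made-up toy data, not results from the paper.

```python
def accuracy(predictions: list[str], golds: list[str]) -> float:
    """Fraction of predictions that match the gold answer (assumed: exact match, case-insensitive)."""
    if len(predictions) != len(golds) or not golds:
        raise ValueError("predictions and golds must be non-empty and the same length")
    hits = sum(p.strip().lower() == g.strip().lower() for p, g in zip(predictions, golds))
    return hits / len(golds)


def elag(acc_earliest: float, acc_latest: float) -> float:
    """ELAG = Acc_earliest - Acc_latest, as in the Figure 2 caption."""
    return acc_earliest - acc_latest


# Toy data for four hypothetical DKIs (illustrative only).
gold_earliest = ["Alice", "Paris", "v1", "red"]
gold_latest = ["Carol", "Rome", "v3", "blue"]
pred_earliest = ["Alice", "Paris", "v1", "green"]  # 3 of 4 correct
pred_latest = ["Bob", "Rome", "v2", "red"]         # 1 of 4 correct

acc_e = accuracy(pred_earliest, gold_earliest)
acc_l = accuracy(pred_latest, gold_latest)
gap = elag(acc_e, acc_l)
print(acc_e, acc_l, gap)  # 0.75 0.25 0.5
```

A positive gap means the model answers the earliest-state probe more reliably than the latest-state probe, which is the direction of bias the abstract reports.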