Knowledge Integration Decay in Search-Augmented Reasoning of Large Language Models
Sangwon Yu, Ik-hwan Kim, Donghun Kang, Bongkyu Hwang, Junhwa Choi, Suk-hoon Jung, Seungki Hong, Taehee Lee, Sungroh Yoon
TL;DR
This work identifies Knowledge Integration Decay (KID) as a key bottleneck in search-augmented reasoning: the longer a model reasons before issuing a search, the less it integrates the retrieved evidence into what follows. It introduces Self-Anchored Knowledge Encoding (SAKE), a training-free inference-time strategy that anchors retrieved knowledge at both the beginning and the end of the reasoning process, preserving its semantic integrity while retaining essential reasoning signals. In experiments on multi-hop QA and knowledge-intensive tasks, SAKE consistently mitigates KID and improves performance across models and benchmarks, including on challenging bridge- and composition-based questions. The authors support these results with mechanistic analyses, ablations, and information-theoretic interpretations. They argue that resolving KID is essential for scalable, reliable agentic LLM reasoning, and they note a practical trade-off in context length as well as future opportunities for training-time solutions.
Abstract
Modern Large Language Models (LLMs) have demonstrated remarkable capabilities in complex tasks by employing search-augmented reasoning to incorporate external knowledge into long chains of thought. However, we identify a critical yet underexplored bottleneck in this paradigm, termed Knowledge Integration Decay (KID). Specifically, we observe that as the length of reasoning generated before search grows, models increasingly fail to integrate retrieved evidence into subsequent reasoning steps, limiting performance even when relevant information is available. To address this, we propose Self-Anchored Knowledge Encoding (SAKE), a training-free inference-time strategy designed to stabilize knowledge utilization. By anchoring retrieved knowledge at both the beginning and end of the reasoning process, SAKE prevents it from being overshadowed by prior context, thereby preserving its semantic integrity. Extensive experiments on multi-hop QA and complex reasoning benchmarks demonstrate that SAKE significantly mitigates KID and improves performance, offering a lightweight yet effective solution for knowledge integration in agentic LLMs.
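As a rough illustration of the anchoring idea described above, the sketch below assembles a context in which the retrieved evidence appears both before and after the reasoning generated so far. The abstract does not specify the prompt layout, so the function name `build_anchored_context`, the section labels, and the ordering of the question relative to the evidence are all assumptions made for illustration; this is not the authors' implementation.

```python
from typing import List


def build_anchored_context(
    question: str,
    prior_reasoning: str,
    retrieved_docs: List[str],
) -> str:
    """Hypothetical sketch: place retrieved evidence at both ends of the context.

    The layout here (labels, ordering, separators) is an assumption, chosen only
    to show evidence anchored before and after the pre-search reasoning.
    """
    # Number each retrieved passage so both copies are easy to cross-reference.
    evidence = "\n".join(
        f"[doc {i + 1}] {doc}" for i, doc in enumerate(retrieved_docs)
    )
    parts = [
        # Anchor 1: evidence at the very beginning of the context.
        "Retrieved knowledge:\n" + evidence,
        "Question: " + question,
        # The pre-search reasoning is kept so its signals are not lost.
        "Reasoning so far:\n" + prior_reasoning,
        # Anchor 2: evidence restated at the end, next to where generation resumes.
        "Retrieved knowledge (restated):\n" + evidence,
    ]
    return "\n\n".join(parts)


if __name__ == "__main__":
    context = build_anchored_context(
        question="Which country is the director of film X from?",
        prior_reasoning="First I need to find who directed film X.",
        retrieved_docs=["Film X was directed by person Y.", "Person Y was born in country Z."],
    )
    print(context)
```

Restating the evidence lengthens the context roughly in proportion to the size of the retrieved passages, which matches the context-length trade-off mentioned in the TL;DR.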
