Memory-Aware and Uncertainty-Guided Retrieval for Multi-Hop Question Answering
Yuelyu Ji, Rui Meng, Zhuochun Li, Daqing He
TL;DR
The paper addresses multi-hop QA by combining memory-aware retrieval with uncertainty-guided retrieval decisions. Its framework, MIND, consists of prompt-based entity extraction, Retrieval-Integrated Neural Decision-making (RIND) driven by token-level entropy and attention signals, and memory-aware filtering with four strategies (No Filtering, CoT, Confidence, Hybrid). Across four datasets, MIND reduces unnecessary retrievals by about 10–15% and improves final answer accuracy (EM/F1); ablations show that the Hybrid (CoT+Conf) filter offers the best balance. Dynamic thresholding outperforms fixed thresholds, which indicates that adaptive retrieval suits questions of varying complexity. The work advances efficient, consistent multi-hop reasoning and lays groundwork for extensions to conversational and cross-domain QA tasks.
Abstract
Multi-hop question answering (QA) requires models to retrieve and reason over multiple pieces of evidence. While Retrieval-Augmented Generation (RAG) has made progress in this area, existing methods often suffer from two key limitations: (1) fixed or overly frequent retrieval steps, and (2) ineffective use of previously retrieved knowledge. We propose MIND (Memory-Informed and INteractive Dynamic RAG), a framework that addresses these challenges through: (i) prompt-based entity extraction to identify reasoning-relevant elements, (ii) dynamic retrieval triggering based on token-level entropy and attention signals, and (iii) memory-aware filtering, which stores high-confidence facts across reasoning steps to enable consistent multi-hop generation.
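To make components (ii) and (iii) concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that a token's uncertainty is its Shannon entropy scaled by an attention weight, that retrieval fires when any token's score exceeds a threshold, and that memory is a plain dictionary of facts with confidence scores. The function names, the multiplicative combination, and the threshold values are all hypothetical.

```python
import math
from typing import Dict, Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a single next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def should_retrieve(
    step_probs: Sequence[Sequence[float]],
    attention_weights: Sequence[float],
    threshold: float,
) -> bool:
    """Return True if any token's attention-weighted entropy exceeds the threshold.

    Assumption: entropy and attention are combined by a simple product;
    the paper may combine these signals differently.
    """
    scores = [
        token_entropy(probs) * weight
        for probs, weight in zip(step_probs, attention_weights)
    ]
    return max(scores, default=0.0) > threshold


def update_memory(
    memory: Dict[str, float],
    candidate_facts: Dict[str, float],
    min_confidence: float,
) -> Dict[str, float]:
    """Keep only candidate facts whose confidence meets the cutoff (hypothetical filter)."""
    for fact, confidence in candidate_facts.items():
        if confidence >= min_confidence:
            # Retain the highest confidence seen so far for a repeated fact.
            memory[fact] = max(confidence, memory.get(fact, 0.0))
    return memory
```

In this sketch, a generation step with one near-uniform token distribution triggers retrieval, while a step made up only of peaked distributions does not. Low-confidence facts never enter the memory, so they cannot influence later reasoning hops.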
