NeuroPath: Neurobiology-Inspired Path Tracking and Reflection for Semantically Coherent Retrieval
Junchen Li, Rongzheng Wang, Yihong Huang, Qizhi Chen, Jiasheng Zhang, Shuang Liang
TL;DR
NeuroPath tackles the semantic incoherence and noise that plague graph-based Retrieval-Augmented Generation in multi-hop QA by adopting a neurobiology-inspired, LLM-driven semantic path tracking framework. It dynamically constructs goal-directed semantic paths on a knowledge graph and reinforces the results with a second-stage retrieval, achieving state-of-the-art recall on MuSiQue, 2WikiMultiHopQA, and HotpotQA while using fewer tokens than iterative baselines. The approach combines Static Indexing to build a semantically aware KG, Dynamic Path Tracking to prune and navigate paths toward the query goal, and Post-retrieval Completion to fill gaps in intermediate reasoning. Empirical results show strong gains across diverse LLMs and task complexities, underscoring the value of semantic coherence as a retrieval principle for complex, knowledge-intensive reasoning.
Abstract
Retrieval-augmented generation (RAG) greatly enhances the performance of large language models (LLMs) in knowledge-intensive tasks. However, naive RAG methods struggle with multi-hop question answering due to their limited capacity to capture complex dependencies across documents. Recent studies employ graph-based RAG to capture document connections, but these approaches often lose semantic coherence and introduce irrelevant noise during node matching and subgraph construction. To address these limitations, we propose NeuroPath, an LLM-driven semantic path tracking RAG framework inspired by the path navigational planning of place cells in neurobiology. It consists of two steps: Dynamic Path Tracking and Post-retrieval Completion. Dynamic Path Tracking performs goal-directed semantic path tracking and pruning over the constructed knowledge graph (KG), improving noise reduction and semantic coherence. Post-retrieval Completion further reinforces these benefits by conducting second-stage retrieval using intermediate reasoning and the original query to refine the query goal and complete missing information in the reasoning path. NeuroPath surpasses current state-of-the-art baselines on three multi-hop QA datasets, achieving average improvements of 16.3% on recall@2 and 13.5% on recall@5 over advanced graph-based RAG methods. Moreover, compared to existing iterative RAG methods, NeuroPath achieves higher accuracy and reduces token consumption by 22.8%. Finally, we demonstrate the robustness of NeuroPath across four smaller LLMs (Llama3.1, GLM4, Mistral0.3, and Gemma3), and further validate its scalability across tasks of varying complexity. Code is available at https://github.com/KennyCaty/NeuroPath.
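The two steps described above can be illustrated with a toy sketch. This is not the authors' implementation (see the repository for that). It assumes a keyword-overlap scorer as a stand-in for the LLM that judges and prunes paths, and every function name, the tiny KG, and the documents are hypothetical and chosen only for illustration.

```python
import re
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)
Path = List[Triple]


def tokens(text: str) -> set:
    """Lowercased word set; a crude proxy for semantic content."""
    return set(re.findall(r"\w+", text.lower()))


def path_text(path: Path) -> str:
    return " ".join(" ".join(triple) for triple in path)


def prune(query: str, paths: List[Path], beam: int) -> List[Path]:
    """ASSUMPTION: keyword overlap replaces the LLM's relevance judgment."""
    q = tokens(query)
    scored = [(len(q & tokens(path_text(p))), p) for p in paths]
    kept = [sp for sp in scored if sp[0] > 0]          # drop irrelevant paths
    kept.sort(key=lambda sp: -sp[0])                   # stable: ties keep order
    return [p for _, p in kept[:beam]]


def track_paths(kg: Dict[str, List[Triple]], seeds: List[str], query: str,
                max_hops: int = 2, beam: int = 2) -> List[Path]:
    """Goal-directed expansion: grow paths one hop at a time, pruning each hop."""
    paths = prune(query, [[t] for s in seeds for t in kg.get(s, [])], beam)
    for _ in range(max_hops - 1):
        expanded = []
        for path in paths:
            tail = path[-1][2]
            nexts = [t for t in kg.get(tail, []) if t not in path]
            if nexts:
                expanded.extend(path + [t] for t in nexts)
            else:
                expanded.append(path)                  # keep dead-end paths
        paths = prune(query, expanded, beam)
    return paths


def expand_query(query: str, paths: List[Path]) -> str:
    """Second-stage query: original question plus the tracked path facts."""
    return query + " " + "; ".join(path_text(p) for p in paths)


def retrieve(query: str, docs: List[str], k: int = 1) -> List[str]:
    """Toy retriever ranking documents by word overlap with the query."""
    q = tokens(query)
    return sorted(docs, key=lambda d: -len(q & tokens(d)))[:k]


# Hypothetical data for a two-hop question.
kg = {
    "Inception": [("Inception", "has director", "Christopher Nolan"),
                  ("Inception", "release year", "2010")],
    "Christopher Nolan": [("Christopher Nolan", "born in", "London")],
}
docs = ["Christopher Nolan was born in London, England.",
        "Inception is a 2010 science fiction film.",
        "Paris is the capital of France."]
query = "Where was the director of Inception born?"

paths = track_paths(kg, ["Inception"], query)
top_docs = retrieve(expand_query(query, paths[:1]), docs, k=1)
print(paths[0])
print(top_docs)
```

In this sketch the best-scoring path links the film to its director and then to the birthplace, and appending that path to the query steers the second retrieval toward the passage containing the missing fact. A real system would replace both the overlap scorer and the toy retriever with an LLM and a dense or sparse retriever.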
