NeuroPath: Neurobiology-Inspired Path Tracking and Reflection for Semantically Coherent Retrieval

Junchen Li, Rongzheng Wang, Yihong Huang, Qizhi Chen, Jiasheng Zhang, Shuang Liang

TL;DR

NeuroPath addresses the semantic incoherence and noise that affect graph-based Retrieval-Augmented Generation (RAG) in multi-hop QA with a neurobiology-inspired, LLM-driven semantic path tracking framework. It dynamically constructs goal-directed semantic paths on a knowledge graph (KG) and reinforces the result with a second-stage retrieval, achieving state-of-the-art recall on MuSiQue, 2WikiMultiHopQA, and HotpotQA while using fewer tokens than iterative baselines. The approach combines Static Indexing to build a semantically aware KG, Dynamic Path Tracking to prune and navigate paths toward the query goal, and Post-retrieval Completion to fill gaps in intermediate reasoning. The reported results show gains across several smaller LLMs and across tasks of varying complexity, supporting semantic coherence as a retrieval principle for complex, knowledge-intensive reasoning.

Abstract

Retrieval-augmented generation (RAG) greatly enhances the performance of large language models (LLMs) on knowledge-intensive tasks. However, naive RAG methods struggle with multi-hop question answering because of their limited capacity to capture complex dependencies across documents. Recent studies employ graph-based RAG to capture document connections, but these approaches often lose semantic coherence and introduce irrelevant noise during node matching and subgraph construction. To address these limitations, we propose NeuroPath, an LLM-driven semantic path tracking RAG framework inspired by the path navigational planning of place cells in neurobiology. It consists of two steps: Dynamic Path Tracking and Post-retrieval Completion. Dynamic Path Tracking performs goal-directed semantic path tracking and pruning over the constructed knowledge graph (KG), reducing noise and improving semantic coherence. Post-retrieval Completion further reinforces these benefits by conducting a second-stage retrieval that uses the intermediate reasoning and the original query to refine the query goal and complete missing information in the reasoning path. NeuroPath surpasses current state-of-the-art baselines on three multi-hop QA datasets, achieving average improvements of 16.3% on recall@2 and 13.5% on recall@5 over advanced graph-based RAG methods. Moreover, compared with existing iterative RAG methods, NeuroPath achieves higher accuracy and reduces token consumption by 22.8%. Finally, we demonstrate the robustness of NeuroPath across four smaller LLMs (Llama3.1, GLM4, Mistral0.3, and Gemma3), and further validate its scalability across tasks of varying complexity. Code is available at https://github.com/KennyCaty/NeuroPath.
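The abstract reports its gains as recall@2 and recall@5. As a rough illustration of what such a metric measures, here is a minimal sketch assuming the common definition used for multi-hop retrieval: the fraction of gold supporting documents that appear among the top-k retrieved documents. The function name `recall_at_k` and the document ids are hypothetical, and the paper's exact evaluation protocol may differ.

```python
def recall_at_k(retrieved, gold, k):
    """Fraction of gold supporting documents found in the top-k retrieved list.

    Assumed definition (not taken from the paper): |top_k ∩ gold| / |gold|.
    """
    gold_set = set(gold)
    if not gold_set:
        raise ValueError("gold must contain at least one document id")
    top_k = set(retrieved[:k])
    return len(top_k & gold_set) / len(gold_set)


# Toy ranking for a two-hop question with two gold supporting documents.
retrieved = ["doc_a", "doc_x", "doc_b", "doc_y", "doc_c"]
gold = ["doc_a", "doc_b"]

r2 = recall_at_k(retrieved, gold, 2)  # only doc_a is in the top 2
r5 = recall_at_k(retrieved, gold, 5)  # both gold documents are in the top 5
print(r2, r5)
```

In a multi-hop setting, recall@2 is a demanding metric because a question often needs two or more supporting documents, so every top slot must be filled correctly.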

Paper Structure

This paper contains 26 sections, 6 equations, 16 figures, 14 tables, 1 algorithm.

Figures (16)

  • Figure 1: Comparison between graph-based and path-based RAG methods for the query "Which company acquired the phone brand created by the Android founder?" (a) HippoRAG uses the PPR algorithm to propagate node importance but ignores edge semantics, increasing the risk of retrieving incorrect nodes such as "2008"; (b) LightRAG's subgraph construction tends to introduce considerable noise; (c) Our method leverages coherent semantic paths for goal-directed tracking, progressively eliminating noise and arriving at the correct answer, "Nothing".
  • Figure 2: Place cells mechanism. Place cells represent specific spatial locations. During navigation, they preplay upcoming sequences, and during rest, they replay these to support memory consolidation.
  • Figure 3: Overview of NeuroPath's workflow: (1) Static Indexing. An LLM extracts entities and relationships to build the KG, and a coreference set is built for each entity. (2) Dynamic Path Tracking. An LLM performs goal-directed path tracking, and the expansion requirements it produces are used for pruning. (3) Post-retrieval Completion. Documents along the path are collected, and the intermediate reasoning is used for a second-stage retrieval that completes missing information in the reasoning path.
  • Figure 4: Case studies. Comparison between NeuroPath and the graph-based baselines.
  • Figure 5: Representation forms of cognitive maps in physical and memory space, and their analogy to knowledge representation.
  • ...and 11 more figures
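
The three-stage workflow in the Figure 3 caption can be made concrete with a toy sketch on the Figure 1 query. This is not the authors' implementation: the triples, the corpus text, and every function name here (`build_index`, `toy_scorer`, `track_paths`, `complete`) are hypothetical, and a simple word-overlap scorer stands in for the LLM that NeuroPath uses to decide which edges to follow and which to prune.

```python
from collections import defaultdict

QUERY = "Which company acquired the phone brand created by the Android founder?"

# Illustrative toy data (not from the paper): (head, relation, tail, doc id).
TRIPLES = [
    ("Android", "founded by", "Andy Rubin", "d1"),
    ("Andy Rubin", "created", "Essential", "d2"),
    ("Essential", "acquired by", "Nothing", "d3"),
    ("Android", "released in", "2008", "d4"),
]
CORPUS = {
    "d1": "Android was founded by Andy Rubin.",
    "d2": "Andy Rubin created the phone brand Essential.",
    "d3": "Essential was acquired by Nothing.",
    "d4": "Android was released in 2008.",
    "d5": "Nothing is a consumer technology company based in London.",
}


def build_index(triples):
    """Static indexing: adjacency list keyed by head entity."""
    graph = defaultdict(list)
    for head, rel, tail, doc in triples:
        graph[head].append((rel, tail, doc))
    return graph


def toy_scorer(query, path, edge):
    """Stand-in for the LLM judgment: words shared by the relation and the query."""
    query_words = set(query.lower().replace("?", "").split())
    return len(set(edge[0].lower().split()) & query_words)


def track_paths(graph, query, seeds, scorer, max_hops=3, min_score=1):
    """Expand paths hop by hop, pruning edges scored below min_score."""
    frontier = [(seed, []) for seed in seeds]  # (current entity, edges so far)
    finished = []
    for _ in range(max_hops):
        next_frontier = []
        for entity, path in frontier:
            visited = {entity} | {edge[0] for edge in path}
            kept = [
                (rel, tail, doc)
                for rel, tail, doc in graph.get(entity, [])
                if tail not in visited
                and scorer(query, path, (rel, tail, doc)) >= min_score
            ]
            if not kept and path:
                finished.append(path)  # dead end: keep what was found
            for rel, tail, doc in kept:
                next_frontier.append((tail, path + [(entity, rel, tail, doc)]))
        frontier = next_frontier
    finished.extend(path for _, path in frontier if path)
    return finished


def complete(query, paths, corpus, top_k=1):
    """Second-stage retrieval: rank unseen docs against query + path text."""
    path_docs = []
    for path in paths:
        for _, _, _, doc in path:
            if doc not in path_docs:
                path_docs.append(doc)
    reasoning = " ".join(f"{h} {r} {t}" for path in paths for h, r, t, _ in path)
    expanded = set((query + " " + reasoning).lower().split())
    extra = sorted(
        (doc for doc in corpus if doc not in path_docs),
        key=lambda doc: len(expanded & set(corpus[doc].lower().split())),
        reverse=True,
    )[:top_k]
    return path_docs + extra


paths = track_paths(build_index(TRIPLES), QUERY, ["Android"], toy_scorer)
entities = [tail for _, _, tail, _ in paths[0]]
docs = complete(QUERY, paths, CORPUS)
print(entities, docs)
```

In this toy run the "released in" edge scores zero and is pruned, so the "2008" node that Figure 1 flags as a likely error never enters the path. The completion step then adds a document that mentions the final entity but is not attached to any edge on the path.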