IGMiRAG: Intuition-Guided Retrieval-Augmented Generation with Adaptive Mining of In-Depth Memory

Xingliang Hou, Yuyan Liu, Qi Sun, Haoxiu Wang, Hao Hu, Shaoyi Du, Zhiqiang Tian

TL;DR

IGMiRAG addresses memory fragmentation in retrieval-augmented generation by introducing a Hierarchical Heterogeneous Hypergraph that unifies multi-granular knowledge and encodes deductive pathways. A Retrieval-Strategy Parser derives intuitive query strategies, while a Dual-Focus Index mitigates cross-type semantic drift during retrieval. Intuitive Anchors Retrieval, via BM25 and DF-Retrieval with Reciprocal Rank Fusion, seeds the memory mining process, which is then carried out by a Preference-Aware Bidirectional Diffusion over the hierarchy with adaptive context windows. Across six benchmarks, IGMiRAG achieves consistent improvements in EM and F1 (average gains of 4.8% and 5.0%, respectively) and exhibits cost-effective token usage, especially on multi-hop tasks, demonstrating a practical, cognitively inspired approach to enhance both the efficiency and depth of memory-enabled generation.
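The fusion step mentioned above can be illustrated with a minimal sketch of Reciprocal Rank Fusion over two ranked lists. The function name, the vertex ids, and the constant `k=60` (a common default) are assumptions for illustration, not values taken from the paper.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list.

    Each item receives 1 / (k + rank) from every list it appears in.
    k=60 is a widely used default, assumed here rather than taken from the paper.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical vertex ids returned by the two recall channels.
bm25_ranking = ["v3", "v1", "v7"]
vector_ranking = ["v1", "v4", "v3"]

anchors = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
print(anchors)
```

Items ranked highly by both channels rise to the top, so the fused list can serve as a set of seed anchors without having to calibrate BM25 scores against vector similarities.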

Abstract

Retrieval-augmented generation (RAG) equips large language models (LLMs) with reliable knowledge memory. To strengthen cross-text associations, recent research integrates graphs and hypergraphs into RAG to capture pairwise and multi-entity relations as structured links. However, their misaligned memory organization necessitates costly, disjointed retrieval. To address these limitations, we propose IGMiRAG, a framework inspired by human intuition-guided reasoning. It constructs a hierarchical heterogeneous hypergraph to align multi-granular knowledge, incorporating deductive pathways to simulate realistic memory structures. During querying, IGMiRAG distills intuitive strategies via a question parser to control mining depth and memory window, and activates instantaneous memories as anchors using dual-focus retrieval. Mirroring human intuition, the framework guides retrieval resource allocation dynamically. Furthermore, we design a bidirectional diffusion algorithm that navigates deductive paths to mine in-depth memories, emulating human reasoning processes. Extensive evaluations indicate IGMiRAG outperforms the state-of-the-art baseline by 4.8% EM and 5.0% F1 overall, with token costs adapting to task complexity (average 6.3k+, minimum 3.0k+). This work presents a cost-effective RAG paradigm that improves both efficiency and effectiveness.
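For readers unfamiliar with the two reported metrics, the sketch below shows how exact match (EM) and token-level F1 are commonly computed for question answering. The normalization follows the widely used SQuAD-style convention; whether the paper applies exactly these rules is an assumption.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, strip punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction, gold):
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("The Eiffel Tower.", "eiffel tower"))
print(token_f1("Eiffel Tower in Paris", "Eiffel Tower"))
```

EM rewards only fully correct answers, while F1 gives partial credit for overlapping tokens, which is why the two are usually reported together.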

Paper Structure

This paper contains 36 sections, 10 equations, 13 figures, 4 tables.

Figures (13)

  • Figure 1: Stage 1 represents the formation of human memory, while Stages 2 to 4 represent the intuition-guided reasoning mechanism in human cognition. Upon encountering a query, we instantaneously assess and retrieve memory anchors, followed by deeper associative recall.
  • Figure 2: The framework of IGMiRAG. Indexing: (A) An LLM-based analyzer extracts multi-granular knowledge memories from each chunk. (B) All knowledge is organized into a hierarchical heterogeneous hypergraph (HHHG) persisted in HyperGraph-DB. The semantic descriptions of all units are embedded and indexed as a global-local dual-focus HNSW index (DF-Index), and a separate BM25 corpus is built from name fields. Retrieval: (C) An LLM-based Retrieval-Strategy Parser (RSP) distills strategies from user queries by simulating the human intuitive response. (D) Multi-channel recall, combining BM25 string matching with dual-focus vector retrieval, identifies high-quality seed vertices as intuitive memory anchors. (E) Preference-aware bidirectional diffusion traverses the HHHG to mine latent, in-depth memories. These units are then aggregated into a context window, whose size is dynamically scaled according to query complexity, before being fed to the LLM. Two examples of the indexing and retrieval process are provided in the appendix (Pipeline Example).
  • Figure 3: Efficiency Comparison. Subfigure $(a)$ compares token costs (k) across structure-enhanced methods, while subfigure $(b)$ compares all RAG methods on MuSiQue in terms of EM, Avg. Tokens, and Avg. Time.
  • Figure 4: Hyperparameter Sensitivity Analysis. Subfigure $(a)$ is a heatmap over combinations of $k_u$ and $k_c$, while subfigure $(b)$ compares results for different values of $k_b$.
  • Figure 5: The percentage of different depths on six benchmarks.
  • ...and 8 more figures
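
Steps D and E of the Figure 2 caption can be made more concrete with a toy sketch: scores spread from seed anchors both up and down a small hierarchy for a bounded number of hops, and the highest-scoring units are then packed into a token budget. This is not the authors' algorithm. The function names, the decay weights `up_weight` and `down_weight`, the max-score update rule, and the greedy packing are all assumptions chosen for illustration.

```python
def bidirectional_diffusion(parents, children, seeds, depth,
                            up_weight=0.6, down_weight=0.4):
    """Spread seed scores upward and downward for at most `depth` hops.

    Hypothetical rule: a neighbor receives score * weight and keeps the
    maximum it has seen. Unequal weights stand in for a direction preference.
    """
    scores = dict(seeds)
    frontier = dict(seeds)
    for _ in range(depth):
        next_frontier = {}
        for node, score in frontier.items():
            neighbors = [(p, up_weight) for p in parents.get(node, [])]
            neighbors += [(c, down_weight) for c in children.get(node, [])]
            for neighbor, weight in neighbors:
                candidate = score * weight
                if candidate > scores.get(neighbor, 0.0):
                    scores[neighbor] = candidate
                    next_frontier[neighbor] = candidate
        frontier = next_frontier
    return scores


def build_context(scores, token_counts, budget):
    """Greedily keep the highest-scoring units that fit in the token budget."""
    chosen, used = [], 0
    for node in sorted(scores, key=lambda n: (-scores[n], n)):
        cost = token_counts[node]
        if used + cost <= budget:
            chosen.append(node)
            used += cost
    return chosen


# Toy three-level hierarchy: topic -> events -> entities (all names hypothetical).
children = {"topic": ["event1", "event2"],
            "event1": ["entityA"],
            "event2": ["entityB"]}
parents = {}
for parent, kids in children.items():
    for kid in kids:
        parents.setdefault(kid, []).append(parent)

shallow = bidirectional_diffusion(parents, children, {"entityA": 1.0}, depth=2)
deeper = bidirectional_diffusion(parents, children, {"entityA": 1.0}, depth=3)

token_counts = {"entityA": 40, "event1": 120, "topic": 300}
context = build_context(shallow, token_counts, budget=200)
print(sorted(shallow), sorted(deeper), context)
```

In this toy setup, raising `depth` from 2 to 3 lets the diffusion reach the sibling `event2`, which mirrors the idea that a parsed strategy can trade retrieval cost for deeper recall. The `budget` argument plays the role of the adaptive context window.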