Cog-RAG: Cognitive-Inspired Dual-Hypergraph with Theme Alignment Retrieval-Augmented Generation

Hao Hu, Yifan Feng, Ruoxue Li, Rundong Xue, Xingliang Hou, Zhiqiang Tian, Yue Gao, Shaoyi Du

TL;DR

RAG systems that rely on flat, chunk-level matching tend to retrieve fragmented content and often miss both global thematic coherence and high-order relations among entities. Cog-RAG introduces a dual-hypergraph architecture: a theme hypergraph for macro-level, global structure and an entity hypergraph for micro-level, high-order relations. It couples these with a cognitive-inspired two-stage retrieval that first activates theme-based context and then recalls entity-level details. Empirical results across five diverse datasets and multiple LLMs show that Cog-RAG outperforms state-of-the-art baselines, with ablations confirming the contributions of the theme and entity hypergraphs and of the two-stage retrieval. By aligning retrieval top-down, from global themes to local details, Cog-RAG enhances semantic coherence and reduces information gaps in knowledge-intensive generation tasks.

Abstract

Retrieval-Augmented Generation (RAG) enhances the response quality and domain-specific performance of large language models (LLMs) by incorporating external knowledge to combat hallucinations. In recent research, graph structures have been integrated into RAG to better capture semantic relations between entities. However, these graph-based approaches primarily focus on low-order pairwise entity relations, which limits their ability to capture high-order associations among multiple entities. Hypergraph-enhanced approaches address this limitation by modeling multi-entity interactions via hyperedges, but they are typically constrained to inter-chunk entity-level representations, overlooking the global thematic organization and alignment across chunks. Drawing inspiration from the top-down cognitive process of human reasoning, we propose a theme-aligned dual-hypergraph RAG framework (Cog-RAG) that uses a theme hypergraph to capture inter-chunk thematic structure and an entity hypergraph to model high-order semantic relations. Furthermore, we design a cognitive-inspired two-stage retrieval strategy that first activates query-relevant thematic content from the theme hypergraph, and then guides fine-grained recall and diffusion in the entity hypergraph, achieving semantic alignment and consistent generation from global themes to local details. Our extensive experiments demonstrate that Cog-RAG significantly outperforms existing state-of-the-art baseline approaches.
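
The two-stage, theme-then-entity retrieval described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes, purely for illustration, that theme hyperedges group chunk ids, that entity hyperedges group entity names, and that relevance is a simple token-overlap count standing in for whatever similarity measure the paper uses. All names here (`Hypergraph`, `overlap`, `two_stage_retrieve`, `chunk_entities`) and the sample data are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Hypergraph:
    """Minimal hypergraph: each hyperedge links any number of vertices."""
    # hyperedge id -> (set of vertex ids, free-text description)
    edges: dict = field(default_factory=dict)

    def add_edge(self, edge_id, vertices, text):
        self.edges[edge_id] = (set(vertices), text)

    def incident_edges(self, vertex):
        """Ids of all hyperedges that contain the given vertex."""
        return [e for e, (vs, _) in self.edges.items() if vertex in vs]


def overlap(query, text):
    """Toy relevance: count of shared lowercase tokens (assumed stand-in)."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def two_stage_retrieve(query, theme_hg, entity_hg, chunk_entities, top_k=1):
    # Stage 1: activate the theme hyperedges most relevant to the query.
    ranked = sorted(theme_hg.edges,
                    key=lambda e: overlap(query, theme_hg.edges[e][1]),
                    reverse=True)
    themes = ranked[:top_k]
    # Chunks grouped under the activated themes.
    chunks = set().union(*(theme_hg.edges[t][0] for t in themes))
    # Stage 2: recall the entities of those chunks, then diffuse one hop
    # along entity hyperedges to collect multi-entity relations.
    seeds = set()
    for chunk in chunks:
        seeds |= chunk_entities.get(chunk, set())
    relations = sorted({e for v in seeds for e in entity_hg.incident_edges(v)})
    return themes, sorted(seeds), relations


# Hypothetical sample data: two themes over three chunks.
theme_hg = Hypergraph()
theme_hg.add_edge("t_stroke", ["c1", "c2"], "stroke diagnosis and treatment")
theme_hg.add_edge("t_sleep", ["c3"], "sleep disorders and insomnia")

# Entity hyperedges may connect more than two entities at once.
entity_hg = Hypergraph()
entity_hg.add_edge("r1", ["aspirin", "stroke", "platelets"], "aspirin, platelets, stroke risk")
entity_hg.add_edge("r2", ["insomnia", "melatonin"], "melatonin for insomnia")
entity_hg.add_edge("r3", ["stroke", "mri", "ct"], "imaging for stroke")

chunk_entities = {"c1": {"stroke", "aspirin"}, "c2": {"mri"}, "c3": {"insomnia"}}

themes, seeds, relations = two_stage_retrieve(
    "how is stroke treatment chosen", theme_hg, entity_hg, chunk_entities)
print(themes, seeds, relations)
```

In this sketch the query activates only the stroke theme, so the sleep-related hyperedge `r2` is never reached. That mirrors the top-down idea of letting global themes constrain which local details are recalled.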

Paper Structure

This paper contains 40 sections, 16 equations, 6 figures, 4 tables.

Figures (6)

  • Figure 1: Knowledge modeling of graph, hypergraph, and our theme-enhanced RAG.
  • Figure 2: The overall framework of Cog-RAG.
  • Figure 3: Test results by scoring. (a) compares results on five datasets; (b) shows results on the neurology dataset across six dimensions; (c) shows evaluation results with different LLMs.
  • Figure 4: Entity Hypergraph Visualization.
  • Figure 5: Comparison results on different metrics.
  • ...and 1 more figure