Cog-RAG: Cognitive-Inspired Dual-Hypergraph with Theme Alignment Retrieval-Augmented Generation
Hao Hu, Yifan Feng, Ruoxue Li, Rundong Xue, Xingliang Hou, Zhiqiang Tian, Yue Gao, Shaoyi Du
TL;DR
RAG systems that rely on flat, chunk-level matching retrieve fragmented content and often miss global thematic coherence and high-order inter-entity relations. Cog-RAG introduces a dual-hypergraph architecture: a theme hypergraph for macro, global structure and an entity hypergraph for micro, high-order relations. It couples these with a cognitive-inspired two-stage retrieval that first activates theme-based context and then recalls entity-level details. Empirical results across five diverse datasets and multiple LLMs show that Cog-RAG outperforms state-of-the-art baselines, with ablations confirming the contributions of the theme and entity hypergraphs and of the two-stage retrieval. By enabling top-down thematic alignment from global themes to local details, Cog-RAG enhances semantic coherence and reduces information gaps in knowledge-intensive generation tasks.
Abstract
Retrieval-Augmented Generation (RAG) enhances the response quality and domain-specific performance of large language models (LLMs) by incorporating external knowledge to combat hallucinations. Recent work has integrated graph structures into RAG to better capture semantic relations between entities. However, these methods focus primarily on low-order pairwise entity relations, which limits their ability to model high-order associations among multiple entities. Hypergraph-enhanced approaches address this limitation by modeling multi-entity interactions via hyperedges, but they are typically constrained to inter-chunk entity-level representations, overlooking the global thematic organization and alignment across chunks. Drawing inspiration from the top-down cognitive process of human reasoning, we propose a theme-aligned dual-hypergraph RAG framework (Cog-RAG) that uses a theme hypergraph to capture inter-chunk thematic structure and an entity hypergraph to model high-order semantic relations. Furthermore, we design a cognitive-inspired two-stage retrieval strategy that first activates query-relevant thematic content from the theme hypergraph and then guides fine-grained recall and diffusion in the entity hypergraph, achieving semantic alignment and consistent generation from global themes to local details. Our extensive experiments demonstrate that Cog-RAG significantly outperforms existing state-of-the-art baseline approaches.
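To make the two-stage idea concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a theme hypergraph stored as a mapping from theme labels to the chunk ids they connect, an entity hypergraph stored as a mapping from relation labels to the entities they connect, and a word-overlap score standing in for whatever similarity measure the real system uses. All names (`score`, `two_stage_retrieve`, `theme_edges`, `chunk_entities`, `entity_edges`) and the sample data are hypothetical.

```python
def score(query: str, label: str) -> int:
    """Toy relevance: count of shared lowercase words (assumed stand-in for a real similarity)."""
    return len(set(query.lower().split()) & set(label.lower().split()))


def two_stage_retrieve(
    query: str,
    theme_edges: dict[str, set[str]],     # theme label -> chunk ids (assumed layout)
    chunk_entities: dict[str, set[str]],  # chunk id -> entities mentioned in it
    entity_edges: dict[str, set[str]],    # relation label -> entities it connects
    top_k: int = 1,
):
    # Stage 1: activate the themes most relevant to the query.
    ranked = sorted(theme_edges, key=lambda t: score(query, t), reverse=True)
    themes = [t for t in ranked[:top_k] if score(query, t) > 0]
    chunks = set().union(*(theme_edges[t] for t in themes))

    # Entities found in the activated chunks seed the second stage.
    seeds = set().union(*(chunk_entities.get(c, set()) for c in chunks))

    # Stage 2: one-hop diffusion over entity hyperedges that touch any seed.
    relations = sorted(r for r, members in entity_edges.items() if members & seeds)
    entities = set(seeds).union(*(entity_edges[r] for r in relations))
    return themes, sorted(chunks), relations, sorted(entities)


# Hypothetical toy data, for illustration only.
theme_edges = {"hypergraph retrieval": {"c1", "c2"}, "model training": {"c3"}}
chunk_entities = {"c1": {"hyperedge", "entity"}, "c2": {"theme"}, "c3": {"optimizer"}}
entity_edges = {
    "hyperedge links entity and query": {"hyperedge", "entity", "query"},
    "optimizer updates weights": {"optimizer", "weights"},
}

result = two_stage_retrieve(
    "how does hypergraph retrieval work", theme_edges, chunk_entities, entity_edges
)
print(result)
```

In this sketch a single hyperedge can connect more than two entities, which is the property that distinguishes a hypergraph from a pairwise graph. The theme stage narrows the search before any entity-level expansion happens, mirroring the top-down order described in the abstract.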
