ArchRAG: Attributed Community-based Hierarchical Retrieval-Augmented Generation
Shu Wang, Yixiang Fang, Yingli Zhou, Xilin Liu, Yuchi Ma
TL;DR
ArchRAG introduces a graph-based RAG framework that leverages attributed communities (ACs) and a novel hierarchical index (C-HNSW) to address three problems in graph-based RAG: low community quality, single-granularity retrieval, and high token costs. The offline phase builds a knowledge graph from the corpus, applies LLM-driven hierarchical clustering to form ACs, and constructs a multi-layer C-HNSW index for efficient retrieval. The online phase performs hierarchical search across the levels of the index and uses adaptive filtering-based generation to integrate the retrieved content, which lets it handle both abstract and specific QA tasks. Empirical results show that ArchRAG achieves state-of-the-art accuracy on several QA benchmarks while substantially reducing token usage, demonstrating practical benefits in efficiency and reliability for graph-based RAG systems.
Abstract
Retrieval-Augmented Generation (RAG) has proven effective at integrating external knowledge into large language models (LLMs) for question-answering (QA) tasks. State-of-the-art RAG approaches often use graph data as the external knowledge source, since graphs capture rich semantic information and the link relationships between entities. However, existing graph-based RAG approaches cannot accurately identify the relevant information in the graph, and they consume large numbers of tokens during online retrieval. To address these issues, we introduce a novel graph-based RAG approach, called Attributed Community-based Hierarchical RAG (ArchRAG), which augments the question using attributed communities and introduces a novel LLM-based hierarchical clustering method. To retrieve the most relevant information from the graph for a question, we build a novel hierarchical index structure for the attributed communities and develop an effective online retrieval method. Experimental results demonstrate that ArchRAG outperforms existing methods in both accuracy and token cost.
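
To make the idea of multi-level retrieval concrete, the following is a minimal sketch of searching a hierarchy of communities one level at a time. It is an illustration under stated assumptions, not the ArchRAG implementation: the `Node` class, the `hierarchical_search` function, the toy two-dimensional embeddings, and the item names are all hypothetical, and a brute-force cosine ranking stands in for the C-HNSW index, whose construction and search procedure are not described in this section.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Node:
    """One indexed item: an entity (level 0) or a community summary (level >= 1).

    Hypothetical structure for illustration only.
    """
    name: str
    level: int
    embedding: List[float]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hierarchical_search(index: List[Node], query: List[float], k: int = 1) -> Dict[int, List[str]]:
    """Return the names of the top-k nodes at each level, most abstract level first.

    Brute-force ranking is used here as a stand-in for an approximate
    nearest-neighbour index; this is an assumption of the sketch.
    """
    results: Dict[int, List[str]] = {}
    for level in sorted({n.level for n in index}, reverse=True):
        candidates = [n for n in index if n.level == level]
        candidates.sort(key=lambda n: cosine(n.embedding, query), reverse=True)
        results[level] = [n.name for n in candidates[:k]]
    return results


# Toy index: two entities, two mid-level communities, one top-level community.
index = [
    Node("entity:A", 0, [1.0, 0.0]),
    Node("entity:B", 0, [0.0, 1.0]),
    Node("community:A-cluster", 1, [0.9, 0.1]),
    Node("community:B-cluster", 1, [0.1, 0.9]),
    Node("community:root", 2, [0.5, 0.5]),
]

hits = hierarchical_search(index, query=[1.0, 0.2], k=1)
```

With this toy data, `hits` holds one result per level, so a downstream generation step could draw on both a coarse community summary and a specific entity. How the retrieved items are filtered and combined into an answer is left out of the sketch.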
