HopRAG: Multi-Hop Reasoning for Logic-Aware Retrieval-Augmented Generation
Hao Liu, Zhengren Wang, Xi Chen, Zhiyu Li, Feiyu Xiong, Qinhan Yu, Wentao Zhang
TL;DR
HopRAG introduces a logic-aware retrieval paradigm built on a graph-structured index in which passages are connected via pseudo-queries. Its retrieve-reason-prune pipeline uses reasoning-guided graph traversal to expand the retrieval scope beyond lexical or semantic similarity, enabling multi-hop QA across documents. On MuSiQue, 2WikiMultiHopQA, and HotpotQA, HopRAG improves answer accuracy and retrieval F1 compared to strong baselines, and ablations clarify the impact of top_k, n_hop, and the traversal model. The results show that integrating logical relations into retrieval can improve the quality of the evidence supplied for knowledge-intensive tasks, and they leave room for scaling to broader domains and for efficiency optimizations.
Abstract
Retrieval-Augmented Generation (RAG) systems often struggle with imperfect retrieval, as traditional retrievers focus on lexical or semantic similarity rather than logical relevance. To address this, we propose HopRAG, a novel RAG framework that augments retrieval with logical reasoning through graph-structured knowledge exploration. During indexing, HopRAG constructs a passage graph, with text chunks as vertices and logical connections established via LLM-generated pseudo-queries as edges. During retrieval, it employs a retrieve-reason-prune mechanism: starting with lexically or semantically similar passages, the system explores multi-hop neighbors guided by pseudo-queries and LLM reasoning to identify truly relevant ones. Experiments on multiple multi-hop benchmarks demonstrate that HopRAG's retrieve-reason-prune mechanism can expand the retrieval scope based on logical connections and improve final answer quality.
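The retrieve-reason-prune idea can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the authors' implementation: the passages and pseudo-queries are invented, word overlap stands in for both the retriever and the LLM reasoning step, and the names `PassageGraph`, `overlap`, and `retrieve_reason_prune` are hypothetical. The pruning rule (rank by visit count, then by similarity) is likewise only one plausible choice.

```python
from collections import Counter
from dataclasses import dataclass, field


def tokens(text: str) -> set[str]:
    """Lowercased word set with basic punctuation stripped."""
    return {w.strip(".,?!").lower() for w in text.split() if w.strip(".,?!")}


def overlap(a: str, b: str) -> float:
    """Jaccard word overlap: a crude stand-in for a retriever or an LLM judge."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta and tb else 0.0


@dataclass
class PassageGraph:
    """Passages are vertices; each edge carries a pseudo-query answered by its target."""
    passages: dict[str, str]
    edges: dict[str, list[tuple[str, str]]] = field(default_factory=dict)

    def add_edge(self, src: str, pseudo_query: str, dst: str) -> None:
        self.edges.setdefault(src, []).append((pseudo_query, dst))


def retrieve_reason_prune(graph: PassageGraph, query: str,
                          top_k: int = 2, n_hop: int = 2) -> list[str]:
    # Retrieve: seed the traversal with the top_k most similar passages.
    ranked = sorted(graph.passages,
                    key=lambda pid: overlap(query, graph.passages[pid]),
                    reverse=True)
    frontier = ranked[:top_k]
    visits = Counter(frontier)

    # Reason: from each frontier passage, follow the edge whose pseudo-query
    # best matches the question (assumption: an LLM would make this choice).
    for _ in range(n_hop):
        next_frontier = []
        for pid in frontier:
            out_edges = graph.edges.get(pid, [])
            if not out_edges:
                continue
            _, dst = max(out_edges, key=lambda edge: overlap(query, edge[0]))
            visits[dst] += 1
            next_frontier.append(dst)
        frontier = next_frontier

    # Prune: keep the top_k passages, ranked by visit count, then similarity.
    kept = sorted(visits,
                  key=lambda pid: (visits[pid], overlap(query, graph.passages[pid])),
                  reverse=True)
    return kept[:top_k]


# Invented toy corpus: p3 is a lexically similar distractor, p2 holds the answer.
graph = PassageGraph(passages={
    "p1": "The film Skyline was directed by Ana Ruiz.",
    "p2": "Ana Ruiz was born in Porto.",
    "p3": "The film Skyline was praised for the soundtrack of the film.",
})
graph.add_edge("p1", "Where was Ana Ruiz born?", "p2")
graph.add_edge("p1", "What reception did the soundtrack get?", "p3")
graph.add_edge("p3", "Who directed the film Skyline?", "p1")

query = "Where was the director of the film Skyline born?"
result = retrieve_reason_prune(graph, query, top_k=2, n_hop=2)
print(result)  # ['p1', 'p2']
```

In this toy run, similarity alone seeds the search with p3 and p1 and misses p2, the passage that actually contains the birthplace. Traversing the pseudo-query edges reaches p2, and pruning drops the distractor p3.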
