HopRAG: Multi-Hop Reasoning for Logic-Aware Retrieval-Augmented Generation

Hao Liu, Zhengren Wang, Xi Chen, Zhiyu Li, Feiyu Xiong, Qinhan Yu, Wentao Zhang

TL;DR

HopRAG introduces a logic-aware retrieval paradigm by building a graph-structured index of passages connected via pseudo-queries. A retrieve-reason-prune pipeline uses reasoning-guided graph traversal to expand the retrieval scope beyond lexical or semantic similarity, enabling multi-hop QA across documents. Empirical results on MuSiQue, 2WikiMultiHopQA, and HotpotQA show that HopRAG improves answer accuracy and retrieval F1 over strong baselines, with ablations clarifying the impact of top_k, n_hop, and the traversal model. The approach shows that integrating logical relations into retrieval can improve the quality of the context supplied for knowledge-intensive tasks, with room for scaling to broader domains and for efficiency optimizations.

Abstract

Retrieval-Augmented Generation (RAG) systems often struggle with imperfect retrieval, as traditional retrievers focus on lexical or semantic similarity rather than logical relevance. To address this, we propose HopRAG, a novel RAG framework that augments retrieval with logical reasoning through graph-structured knowledge exploration. During indexing, HopRAG constructs a passage graph, with text chunks as vertices and logical connections established via LLM-generated pseudo-queries as edges. During retrieval, it employs a retrieve-reason-prune mechanism: starting with lexically or semantically similar passages, the system explores multi-hop neighbors guided by pseudo-queries and LLM reasoning to identify truly relevant ones. Experiments on multiple multi-hop benchmarks demonstrate that HopRAG's retrieve-reason-prune mechanism can expand the retrieval scope based on logical connections and improve final answer quality.
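
To make the retrieve-reason-prune idea concrete, here is a minimal Python sketch of a passage graph and a traversal over it. It is an illustration under stated assumptions, not the authors' implementation: the passages and pseudo-queries are hand-written toy data, `similarity` is a simple Jaccard overlap standing in for a real lexical or dense retriever, `choose_edge` stands in for LLM reasoning, and the pruning score is a hypothetical placeholder rather than the paper's metric. All function names are invented for this example.

```python
from collections import Counter


def tokens(text):
    """Lowercased word set; a toy stand-in for a real retriever's representation."""
    return set(text.lower().replace("?", " ").replace(".", " ").split())


def similarity(a, b):
    """Jaccard overlap between two strings (illustrative only)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


# Vertices: passage id -> text chunk (toy data).
passages = {
    "p1": "Alice Smith directed the film Blue Harbor.",
    "p2": "Blue Harbor was filmed in the city of Portview.",
    "p3": "Portview is the capital of the Norland region.",
    "p4": "Bananas are rich in potassium.",
}

# Directed edges: source id -> list of (pseudo_query, target id).
# In HopRAG the pseudo-queries are LLM-generated; here they are hand-written.
edges = {
    "p1": [("Where was Blue Harbor filmed?", "p2")],
    "p2": [("Which region is Portview the capital of?", "p3")],
    "p3": [],
    "p4": [],
}


def choose_edge(query, out_edges):
    """Assumed stand-in for LLM reasoning: pick the edge whose
    pseudo-query overlaps most with the user query."""
    if not out_edges:
        return None
    return max(out_edges, key=lambda edge: similarity(query, edge[0]))


def retrieve_reason_prune(query, top_k=1, n_hop=2, keep=3):
    # Retrieve: seed the search with the top_k most similar passages.
    ranked = sorted(passages, key=lambda p: similarity(query, passages[p]), reverse=True)
    frontier = ranked[:top_k]
    visits = Counter(frontier)

    # Reason: from each frontier passage, hop along one chosen edge, n_hop times.
    for _ in range(n_hop):
        next_frontier = []
        for pid in frontier:
            edge = choose_edge(query, edges[pid])
            if edge is not None:
                next_frontier.append(edge[1])
                visits[edge[1]] += 1
        frontier = next_frontier

    # Prune: keep the best-scoring visited passages. The score below is a
    # hypothetical placeholder (mean of similarity and visit share).
    total = sum(visits.values())

    def score(pid):
        return 0.5 * (similarity(query, passages[pid]) + visits[pid] / total)

    return sorted(visits, key=score, reverse=True)[:keep]


result = retrieve_reason_prune("In which region was the film directed by Alice Smith filmed?")
print(result)
```

In this toy setup only `p1` is similar enough to be retrieved directly with `top_k=1`; the other two supporting passages are reached by hopping along pseudo-query edges, and the unrelated `p4` is never visited.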

Paper Structure

This paper contains 37 sections, 8 equations, 8 figures, 8 tables, 1 algorithm.

Figures (8)

  • Figure 1: (a) Precision, recall and F1 score of BGE dense retrievers on MuSiQue, 2WikiMultiHopQA and HotpotQA with different $top_k$ parameters, revealing a severe imperfect retrieval phenomenon. In our settings, recall saturates at 0.45. (b) We categorize retrieved passages as relevant, indirectly relevant or irrelevant according to their logical relevance to the query. The relevant passages are exactly the supporting facts; indirectly relevant passages can hop to the supporting facts via HopRAG, while irrelevant passages cannot. A large proportion of retrieved passages are indirectly relevant.
  • Figure 2: Demonstration of hopping between passages. For the user query, the BGE dense retriever can return only one of the three supporting facts within the $top_k$ budget. However, lexically or semantically similar passages complement each other. Hopping between passages, with questions as pathways, improves retrieval accuracy and completeness.
  • Figure 3: The workflow of HopRAG. Left: At indexing time, we first use Query Simulation to generate pseudo-queries for each passage and then apply Edge Merging to connect passages with directed logical edges. Right: At retrieval time, we employ a Retrieve-Reason-Prune pipeline. We first retrieve through purely similarity-based retrieval, then run reasoning-augmented graph traversal to explore the neighborhood, and finally prune the search with a novel metric, Helpfulness, that considers both textual similarity and logical importance.
  • Figure 4: Prompt for generating in-coming questions.
  • Figure 5: Prompt for generating out-coming questions.
  • ...and 3 more figures
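
The retrieval precision, recall and F1 referred to in the Figure 1 caption can be computed per query with the standard set-based definitions. The sketch below assumes passages are identified by hashable ids and that the supporting facts for a query are known; the function name `retrieval_prf` and the example ids are hypothetical.

```python
def retrieval_prf(retrieved, supporting):
    """Set-based precision, recall and F1 of retrieved passage ids
    against the ids of the supporting facts for one query."""
    retrieved_set, supporting_set = set(retrieved), set(supporting)
    hits = len(retrieved_set & supporting_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(supporting_set) if supporting_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# A case like the one described for Figure 2: only one of three
# supporting facts is returned within a budget of four passages.
precision, recall, f1 = retrieval_prf(["p1", "p4", "p7", "p9"], ["p1", "p2", "p3"])
print(precision, recall, f1)
```

With one hit out of four retrieved passages and three supporting facts, precision is 1/4 and recall is 1/3, which shows how a similarity-only retriever can score low on both when the remaining supporting facts are only indirectly reachable.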