Graph-R1: Towards Agentic GraphRAG Framework via End-to-end Reinforcement Learning

Haoran Luo, Haihong E, Guanting Chen, Qika Lin, Yikai Guo, Fangzhi Xu, Zemin Kuang, Meina Song, Xiaobao Wu, Yifan Zhu, Luu Anh Tuan

TL;DR

Graph-R1 introduces an agentic GraphRAG framework trained end-to-end with reinforcement learning to perform multi-turn reasoning over a lightweight knowledge hypergraph. By integrating lightweight knowledge construction, dual-path graph retrieval, and an outcome-directed reward, the approach tightly couples structured knowledge with language generation. Empirical results across six FlashRAG benchmarks show improved reasoning accuracy, retrieval efficiency, and generation quality, with strong generalization to out-of-domain settings. The work highlights the value of end-to-end RL in aligning graph-aware retrieval with coherent, faithful answers, and outlines future paths for efficiency and multi-modal extensions.

Abstract

Retrieval-Augmented Generation (RAG) mitigates hallucination in LLMs by incorporating external knowledge, but relies on chunk-based retrieval that lacks structural semantics. GraphRAG methods improve RAG by modeling knowledge as entity-relation graphs, but still face challenges in high construction cost, fixed one-time retrieval, and reliance on long-context reasoning and prompt design. To address these challenges, we propose Graph-R1, an agentic GraphRAG framework via end-to-end reinforcement learning (RL). It introduces lightweight knowledge hypergraph construction, models retrieval as a multi-turn agent-environment interaction, and optimizes the agent process via an end-to-end reward mechanism. Experiments on standard RAG datasets show that Graph-R1 outperforms traditional GraphRAG and RL-enhanced RAG methods in reasoning accuracy, retrieval efficiency, and generation quality.
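
The multi-turn agent-environment interaction described in the abstract can be illustrated with a small toy loop. The sketch below is not the authors' implementation: the `Hyperedge` class, the word-overlap `retrieve` function, the hard-coded `scripted_policy` (a stand-in for the RL-trained LLM agent), and the use of token-level F1 as the outcome score are all assumptions made for illustration. It only shows the shape of the process: the agent either issues a query against the knowledge hypergraph or commits to an answer, and the final answer is scored once at the end.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Hyperedge:
    """Hypothetical n-ary fact: a text snippet plus the entities it connects."""
    text: str
    entities: frozenset


def _tokens(s: str) -> set:
    """Lowercase word set with basic punctuation stripped."""
    for ch in "?.,":
        s = s.replace(ch, " ")
    return set(s.lower().split())


def retrieve(query: str, graph: List[Hyperedge], k: int = 1) -> List[Hyperedge]:
    """Toy two-path scoring (an assumption, not the paper's retriever):
    entity path = entities fully named in the query,
    hyperedge path = word overlap between the query and the fact text."""
    q = _tokens(query)

    def score(edge: Hyperedge) -> int:
        entity_hits = sum(1 for ent in edge.entities if _tokens(ent) <= q)
        text_hits = len(q & _tokens(edge.text))
        return entity_hits + text_hits

    ranked = sorted(graph, key=score, reverse=True)
    return [e for e in ranked[:k] if score(e) > 0]


Action = Tuple[str, str]  # ("query", text) or ("answer", text)
Policy = Callable[[str, List[str]], Action]


def run_episode(question: str, graph: List[Hyperedge], policy: Policy,
                max_turns: int = 3) -> Tuple[str, List[Action]]:
    """Alternate between agent actions and environment retrieval."""
    context: List[str] = []
    trajectory: List[Action] = []
    for _ in range(max_turns):
        kind, payload = policy(question, context)
        trajectory.append((kind, payload))
        if kind == "answer":
            return payload, trajectory
        # The environment answers a query with retrieved facts.
        for edge in retrieve(payload, graph):
            if edge.text not in context:
                context.append(edge.text)
    return "", trajectory  # turn budget exhausted without an answer


def token_f1(prediction: str, gold: str) -> float:
    """Token-level F1, used here as an assumed outcome score."""
    p, g = prediction.lower().split(), gold.lower().split()
    overlap = sum((Counter(p) & Counter(g)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(p), overlap / len(g)
    return 2 * precision * recall / (precision + recall)


GRAPH = [
    Hyperedge("Marie Curie was born in Warsaw.", frozenset({"Marie Curie", "Warsaw"})),
    Hyperedge("Warsaw is the capital of Poland.", frozenset({"Warsaw", "Poland"})),
    Hyperedge("Paris is the capital of France.", frozenset({"Paris", "France"})),
]
QUESTION = "Which country's capital was Marie Curie born in?"


def scripted_policy(question: str, context: List[str]) -> Action:
    """Hard-coded stand-in for the trained agent (illustration only)."""
    if not context:
        return ("query", question)
    if len(context) == 1:
        return ("query", "Warsaw capital of which country")
    return ("answer", "Poland")


if __name__ == "__main__":
    answer, trajectory = run_episode(QUESTION, GRAPH, scripted_policy)
    print(trajectory)
    print(answer, token_f1(answer, "Poland"))
```

In this sketch the second query depends on what the first retrieval returned, which is the difference between multi-turn interaction and fixed one-time retrieval. In an RL setting, the scalar from `token_f1` would serve as the episode-level signal for updating the policy; the training step itself is omitted here.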

Paper Structure

This paper contains 31 sections, 36 equations, 11 figures, 5 tables, 1 algorithm.

Figures (11)

  • Figure 1: An illustration of Graph-R1.
  • Figure 2: Comparison of F1 scores across RAG benchmarks. Using a graph as the knowledge environment enables RL to achieve a higher performance ceiling compared to chunk-based knowledge.
  • Figure 3: Overview of the Graph-R1 framework: an RL-enhanced reasoning trajectory over knowledge hypergraph, where the agent iteratively decides to think, query, retrieve knowledge, and answer.
  • Figure 4: Step-wise F1 score on HotpotQA based on Qwen2.5 (1.5B, 3B, 7B), where Graph-R1 outperforms baselines and GPT-4o-mini variants (NaiveGeneration, StandardRAG, HyperGraphRAG).
  • Figure 5: (a) Ablation study of Graph-R1. (b-f) Performance comparison across different kinds of knowledge representations, RAG datasets, model parameters, Qwen versions, and RL algorithms.
  • ...and 6 more figures

Theorems & Definitions (6)

  • proof (6 entries, untitled)