Efficient and Transferable Agentic Knowledge Graph RAG via Reinforcement Learning

Jinyeop Song, Song Wang, Julian Shun, Yada Zhu

TL;DR

KG-R1 tackles the cost and transferability limitations of modular KG-RAG pipelines by introducing a single-agent, end-to-end RL framework that interacts with a lightweight, schema-agnostic KG retrieval server. Using a 3B-parameter LLM, KG-R1 achieves competitive KGQA accuracy while reducing generation tokens and inference cost, and it demonstrates plug-and-play cross-KG transfer by swapping the backend KG without retraining. The approach relies on a GRPO-style RL objective that combines per-turn and global rewards, along with group-relative credit assignment, to train the agent to retrieve and reason over KG evidence across turns. This yields a practical KG-RAG solution with strong generalization and deployment potential in knowledge-intensive settings.
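The group-relative credit assignment mentioned above can be illustrated with a minimal sketch. It assumes the standard GRPO recipe of normalizing each rollout's reward against the other rollouts sampled for the same question. The way per-turn and global rewards are mixed here (`trajectory_reward` and its `turn_weight` parameter) is a hypothetical simplification for illustration, not the exact objective used by KG-R1.

```python
from statistics import mean, pstdev


def trajectory_reward(turn_rewards, global_reward, turn_weight=0.5):
    """Hypothetical mix: global outcome reward plus a weighted mean per-turn reward."""
    per_turn = mean(turn_rewards) if turn_rewards else 0.0
    return global_reward + turn_weight * per_turn


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of rollouts for the same question."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation over the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four illustrative rollouts for one question: (per-turn rewards, global reward).
group = [
    trajectory_reward([1.0, 1.0], 1.0),  # valid retrievals, correct answer
    trajectory_reward([1.0, 0.0], 0.0),  # one bad turn, wrong answer
    trajectory_reward([0.0, 0.0], 0.0),  # all turns failed, wrong answer
    trajectory_reward([1.0, 1.0], 0.0),  # valid retrievals, wrong answer
]
advantages = group_relative_advantages(group)
```

Rollouts that score above their group's mean receive a positive advantage and the rest a negative one, so no separate value network is needed to provide a baseline.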

Abstract

Knowledge-graph retrieval-augmented generation (KG-RAG) couples large language models (LLMs) with structured, verifiable knowledge graphs (KGs) to reduce hallucinations and expose reasoning traces. However, many KG-RAG systems compose multiple LLM modules (e.g., planning, reasoning, and responding), inflating inference cost and binding behavior to a specific target KG. To address this, we introduce KG-R1, an agentic KG-RAG framework trained through reinforcement learning (RL). KG-R1 uses a single agent that interacts with KGs as its environment, learning to retrieve at each step and to incorporate the retrieved information into its reasoning and generation. The process is optimized through end-to-end RL. In controlled experiments across Knowledge-Graph Question Answering (KGQA) benchmarks, our method demonstrates both efficiency and transferability: using Qwen-2.5-3B, KG-R1 improves answer accuracy with fewer generation tokens than prior multi-module workflow methods that use larger foundation or fine-tuned models. Furthermore, KG-R1 is plug-and-play: after training, it maintains strong accuracy on new KGs without modification. These properties make KG-R1 a promising KG-RAG framework for real-world deployment. Our code is publicly available at https://github.com/Jinyeop3110/KG-R1.

Paper Structure

This paper contains 42 sections, 6 theorems, 13 equations, 16 figures, 6 tables, 1 algorithm.

Key Result

Proposition 3.1

For any reasoning path $Z:\ e_0 \xrightarrow{r_1}\cdots\xrightarrow{r_\ell} e_\ell$ in $G$, there exists an action sequence in $\mathcal{U}_{\text{ret}}$ of length at most $\ell{+}1$ whose output includes $e_\ell$.
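As an informal illustration of this proposition, the sketch below replays a reasoning path on a toy graph using one retrieval action per hop. The `ToyKG` class, the `get_tail_entities` action name, and the placeholder entities and relations are all assumptions made for this example; the actual contents of $\mathcal{U}_{\text{ret}}$ are defined in the paper and are not reproduced here.

```python
from collections import defaultdict


class ToyKG:
    """Tiny in-memory KG exposing a single hypothetical retrieval action."""

    def __init__(self, triples):
        self._tails = defaultdict(set)
        for head, relation, tail in triples:
            self._tails[(head, relation)].add(tail)

    def get_tail_entities(self, head, relation):
        # Hypothetical action: return all t such that (head, relation, t) is in the KG.
        return set(self._tails.get((head, relation), set()))


def replay_path(kg, start, hops):
    """Follow e_0 -r_1-> ... -r_l-> e_l, issuing one action per hop.

    `hops` is a list of (relation, expected_entity) pairs.
    Returns (last_output, number_of_actions).
    """
    current, output, n_actions = start, {start}, 0
    for relation, expected in hops:
        output = kg.get_tail_entities(current, relation)
        n_actions += 1
        if expected not in output:
            raise ValueError(f"hop ({current}, {relation}) does not reach {expected}")
        current = expected
    return output, n_actions


# Placeholder entities and relations, chosen only for illustration.
kg = ToyKG([
    ("e0", "r1", "e1"),
    ("e0", "r1", "x1"),
    ("e1", "r2", "e2"),
    ("e2", "r3", "e3"),
])
hops = [("r1", "e1"), ("r2", "e2"), ("r3", "e3")]
last_output, n_actions = replay_path(kg, "e0", hops)
```

For this path of length $\ell = 3$, the sketch issues three actions and the final output contains $e_3$, which is within the $\ell{+}1$ bound stated in the proposition.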

Figures (16)

  • Figure 1: Overview of KG-R1, a multi-turn agentic framework for KG-RAG trained with reinforcement learning. The framework enables cost-efficient inference and demonstrates strong cross-KG transferability.
  • Figure 2: Prior multi-module methods are costly and do not transfer well across KGs. Left: mean end-to-end generated tokens per query on WebQSP (Yih et al., 2016). Right: average $\mathrm{Hit@1}$ over five out-of-training KGQA datasets (see Sec. \ref{sec:exp2}). KG-R1 achieves both low token cost and strong cross-KG transferability.
  • Figure 3: KG-R1 framework: a single LLM agent runs a multi-turn generation–execution loop with a schema-agnostic KG retrieval server and then responds with the final answer.
  • Figure 4: F1 score over KG-R1 training on WebQSP and CWQ for Qwen2.5-3B-it. Training (blue) and validation (red).
  • Figure 8: KG-R1 error types with actual server error messages.
  • ...and 11 more figures

Theorems & Definitions (6)

  • Proposition 3.1: Retrieval Action Set Completeness
  • Proposition 3.2: Schema-Free Transferability
  • Proposition A.1: Finite-Horizon Bound
  • Proposition A.2: Completeness (Integrity)
  • Proposition A.3: Schema-Free Transferability
  • Proposition A.4: Minimality of $\mathcal{U}_{\text{ret}}$