Agentic-KGR: Co-evolutionary Knowledge Graph Construction through Multi-Agent Reinforcement Learning

Jing Li, Zhijie Sun, Zhicheng Zhou, Suming Qiu, Junjie Huang, Haijia Sun, Linyuan Qiu

TL;DR

This work tackles the limitations of static knowledge bases in knowledge-enhanced LLMs by introducing Agentic-KGR, a co-evolution framework in which language models and knowledge graphs evolve together through multi-round reinforcement learning. It integrates three core innovations: dynamic ontology expansion, which grows the KG structure during training; retrieval-augmented memory, which supports bidirectional adaptation between neural representations and knowledge structures; and learnable multi-scale prompt compression, which preserves essential information while reducing computation. The approach formalizes a co-evolution operator and a dual reward with adaptive mixing, coupled with a GraphRAG readout and a KG update mechanism that ground reasoning in the evolving knowledge. Empirically, Agentic-KGR yields substantial improvements in KG extraction and downstream QA, with notable gains when the dynamically built KGs are used within GraphRAG, underscoring the potential of self-improving, domain-adaptive knowledge systems.

Abstract

Current knowledge-enhanced large language models (LLMs) rely on static, pre-constructed knowledge bases that suffer from coverage gaps and temporal obsolescence, limiting their effectiveness in dynamic information environments. We present Agentic-KGR, a novel framework enabling co-evolution between LLMs and knowledge graphs (KGs) through multi-round reinforcement learning (RL). Our approach introduces three key innovations: (1) a dynamic schema expansion mechanism that systematically extends graph ontologies beyond pre-defined boundaries during training; (2) a retrieval-augmented memory system enabling synergistic co-evolution between model parameters and knowledge structures through continuous optimization; (3) a learnable multi-scale prompt compression approach that preserves critical information while reducing computational complexity through adaptive sequence optimization. Experimental results demonstrate substantial improvements over supervised baselines and single-round RL approaches in knowledge extraction tasks. When integrated with GraphRAG, our method achieves superior performance in downstream QA tasks, with significant gains in both accuracy and knowledge coverage compared to existing methods.

Paper Structure

This paper contains 48 sections, 5 theorems, 46 equations, 8 figures, 2 tables.

Key Result

Theorem A.1

Let $\pi_H(a\mid s)\equiv \pi_\theta(a\mid \mathbf{H}(s))$ and $\pi_Z(a\mid s)\equiv \pi_\theta(a\mid \mathbf{Z}(s))$ with $\mathbf{Z}=\phi(\mathbf{H})$. Under $\gamma\in(0,1)$, the performance gap satisfies: […] where $\varepsilon_{\mathrm{obs}} \triangleq \mathbb{E}_s \|\mathbf{H}(s)-\psi(\phi(\mathbf{H}(s)))\|_2 \le \varepsilon_{\mathrm{rec}}$ and $\epsilon_\pi \triangleq \mathbb{E}_s\, \mathrm{KL}(\cdots)$
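The observation-error term $\varepsilon_{\mathrm{obs}}$ in Theorem A.1 is an expected reconstruction error, so it can be estimated by averaging over sampled states. The following is a minimal sketch under stated assumptions, not the paper's compressor: `compress` (standing in for $\phi$) simply keeps the first `k` coordinates, `reconstruct` (standing in for $\psi$) zero-pads back to the original dimension, and each $\mathbf{H}(s)$ is treated as a flat vector. All function names and the synthetic Gaussian states are hypothetical and chosen only for illustration.

```python
import math
import random


def l2_norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def compress(h, k):
    """Hypothetical phi: keep only the first k coordinates of h."""
    return h[:k]


def reconstruct(z, d):
    """Hypothetical psi: zero-pad z back to dimension d."""
    return z + [0.0] * (d - len(z))


def observation_error(states, k):
    """Monte Carlo estimate of eps_obs = E_s ||H(s) - psi(phi(H(s)))||_2."""
    total = 0.0
    for h in states:
        h_hat = reconstruct(compress(h, k), len(h))
        total += l2_norm([a - b for a, b in zip(h, h_hat)])
    return total / len(states)


# Synthetic stand-ins for H(s); real hidden states would come from the model.
random.seed(0)
states = [[random.gauss(0.0, 1.0) for _ in range(8)] for _ in range(100)]

eps_full = observation_error(states, 8)  # nothing dropped, so no error
eps_half = observation_error(states, 4)  # half the coordinates dropped
```

With this toy truncation, keeping fewer coordinates can only increase the estimate, which mirrors the role $\varepsilon_{\mathrm{obs}}$ plays in the theorem: it measures how much of $\mathbf{H}(s)$ the compressed representation $\mathbf{Z}$ fails to recover.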

Figures (8)

  • Figure 1: Multi-round interactive knowledge discovery in a product QA scenario.
  • Figure 2: Overall Architecture of Agentic-KGR Framework.
  • Figure 3: Training reward variation for RL and Agentic-KGR methods across training steps.
  • Figure 4: Response length evolution during RL and Agentic-KGR training across different model scales.
  • Figure 5: Graph density analysis and tool invocation frequency distribution.
  • ...and 3 more figures

Theorems & Definitions (20)

  • Definition 1: Differentiable Subgraph Retrieval Distribution
  • Definition 2: GraphRAG Readout Operator
  • Definition 3: KG Update Operator
  • Definition 4: Environmental Reward Components
  • Definition 5: Learnable Multi-Scale Compression
  • Theorem A.1: Performance Degradation Bound under Compression
  • Remark A.1
  • Proof: Proof of Theorem A.1
  • Theorem A.2: Policy Improvement with Trust Region under Compression
  • Remark A.2
  • ...and 10 more