Knowledge Graph Reasoning with Self-supervised Reinforcement Learning

Ying Ma, Owen Burns, Mingqiu Wang, Gang Li, Nan Du, Laurent El Shafey, Liqiang Wang, Izhak Shafran, Hagen Soltau

TL;DR

The paper tackles large action spaces in knowledge graph reasoning by introducing SSRL, which warms up a policy with self-generated labels and then refines it via reinforcement learning. A BFS-based label generation module increases supervision density and helps mitigate distributional mismatch in static KG environments. Empirically, SSRL improves over strong baselines on four large KG benchmarks, achieving state-of-the-art Hits@k and MRR, and can serve as a plug-in for existing RL architectures like MINERVA and MultiHopKG. The approach offers a practical, interpretable method for scalable KG reasoning with enhanced exploration and more informative reasoning paths.

Abstract

Reinforcement learning (RL) is an effective method for finding reasoning pathways in incomplete knowledge graphs (KGs). To overcome the challenges of a large action space, we propose a self-supervised pretraining method that warms up the policy network before the RL training stage. To alleviate the distributional mismatch issue in general self-supervised RL (SSRL), the agent in our supervised learning (SL) stage selects actions according to the policy network and learns from generated labels; this self-generation of labels is the intuition behind the name "self-supervised". With this training framework, the information density of the SL objective increases and the agent is prevented from getting stuck on early rewarded paths. Our SSRL method improves the performance of RL by pairing it with the wide coverage achieved by SL during pretraining, since the breadth of the SL objective makes it infeasible to train an agent with SL alone. We show that our SSRL model meets or exceeds current state-of-the-art results on all Hits@k and mean reciprocal rank (MRR) metrics on four large benchmark KG datasets. The SSRL method can be used as a plug-in for any RL architecture for a KG reasoning (KGR) task. We adopt two RL architectures, MINERVA and MultiHopKG, as our baseline RL models and show experimentally that our SSRL model consistently outperforms both baselines on all four KG reasoning tasks. Full code for the paper is available at https://github.com/owenonline/Knowledge-Graph-Reasoning-with-Self-supervised-Reinforcement-Learning.
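
For readers unfamiliar with the two evaluation metrics named in the abstract, the sketch below shows how Hits@k and MRR are conventionally computed from the 1-based rank of the correct entity for each query. The function names and the example ranks are illustrative assumptions, not taken from the paper or its code release.

```python
from typing import Sequence


def hits_at_k(ranks: Sequence[int], k: int) -> float:
    """Fraction of queries whose correct entity is ranked within the top k.

    `ranks` holds the 1-based rank of the correct entity for each query.
    """
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank over all queries (1-based ranks)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Made-up ranks for four hypothetical queries, for illustration only.
example_ranks = [1, 2, 4, 10]
print("Hits@1:", hits_at_k(example_ranks, 1))
print("Hits@3:", hits_at_k(example_ranks, 3))
print("MRR:", mean_reciprocal_rank(example_ranks))
```

Both metrics lie in (0, 1], and higher is better; MRR rewards placing the correct entity near the top even when it is not ranked first.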

Paper Structure (25 sections, 8 equations, 8 figures, 8 tables)

Figures (8)

  • Figure 1: System architecture of SSRL
  • Figure 2: Seeding label generation process. (a) 3-hop neighborhood of the start entity $e_2$; the red dashed line represents the missing link that must be inferred. (b) Step 1: remove the link between $e_2$ and $e_5$ and the self-loops of all nodes except those in $E_{\mathrm{all}}$. (c) First round of traversal. The nodes visited before are added to the set $\mathcal{M}$. (d) Marking nodes on correct paths as red. (e) and (f) Generating labels for the remaining nodes in $\mathcal{C}$.
  • Figure 3: Reasoning paths discovered by the SSRL and RL-only agents respectively for 2 queries from the test set. A green box around the end entity indicates that the agent found the target entity exactly, a yellow box indicates a close match, and a red box indicates an incorrect entity.
  • Figure 4: Learning curves (accuracy vs. number of training batches) with different SL pretraining steps followed by RL.
  • Figure 5: MRR evaluation on to-many and to-one queries.
  • ...and 3 more figures
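
The BFS-based label generation summarized in the Figure 2 caption can be approximated by the following sketch. It is a simplified illustration under stated assumptions, not the authors' implementation: the graph is assumed to be a dictionary of outgoing `(relation, next_entity)` edges, the names `generate_labels`, `Graph`, and `Edge` are hypothetical, and the self-loop handling and the sets $\mathcal{M}$ and $\mathcal{C}$ from the caption are replaced by a hop-budget check. The sketch hides the query link, runs a reverse BFS from the target to get each entity's minimum hop distance, and then labels an action as correct when it can still reach the target within the remaining hops.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]          # (relation, next_entity); hypothetical type alias
Graph = Dict[str, List[Edge]]   # entity -> outgoing edges; hypothetical type alias


def generate_labels(graph: Graph, start: str, query_rel: str,
                    target: str, max_hops: int) -> Dict[str, Set[Edge]]:
    """Return, per entity, the actions that can still reach `target` in time."""
    # Hide the direct query link so the agent must infer it from other paths.
    pruned: Graph = {
        e: [(r, n) for (r, n) in edges
            if not (e == start and r == query_rel and n == target)]
        for e, edges in graph.items()
    }

    # Reverse BFS from the target: minimum number of hops to reach it.
    reverse: Dict[str, List[str]] = {}
    for e, edges in pruned.items():
        for _, n in edges:
            reverse.setdefault(n, []).append(e)
    dist = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue
        for prev in reverse.get(node, []):
            if prev not in dist:
                dist[prev] = dist[node] + 1
                queue.append(prev)

    # Forward BFS from the start: label actions that fit the hop budget.
    # Simplification: each entity is labeled at its shallowest depth only.
    labels: Dict[str, Set[Edge]] = {}
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        d = depth[node]
        if d >= max_hops:
            continue
        good = {(r, n) for (r, n) in pruned.get(node, [])
                if n in dist and dist[n] <= max_hops - d - 1}
        if good:
            labels[node] = good
        for _, n in pruned.get(node, []):
            if n not in depth:
                depth[n] = d + 1
                queue.append(n)
    return labels


# Toy graph loosely echoing the caption's entities; edges are made up.
toy_graph: Graph = {
    "e2": [("r_q", "e5"), ("r1", "e1"), ("r2", "e3")],
    "e1": [("r3", "e5")],
    "e3": [("r4", "e4")],
    "e4": [("r5", "e5")],
    "e5": [],
}
print(generate_labels(toy_graph, "e2", "r_q", "e5", max_hops=2))
```

Because every action on any sufficiently short path is labeled, a single query yields supervision at several entities rather than one terminal reward, which is the sense in which such labels are denser than the RL signal.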