ReasoningLM: Enabling Structural Subgraph Reasoning in Pre-trained Language Models for Question Answering over Knowledge Graph

Jinhao Jiang; Kun Zhou; Wayne Xin Zhao; Yaliang Li; Ji-Rong Wen

TL;DR

ReasoningLM is a single pre-trained language model (PLM) that performs both natural-language understanding and structural subgraph reasoning for question answering over knowledge graphs (KGQA) by embedding GNN-like propagation within a Transformer. A BFS-based subgraph serialization combined with a constrained, subgraph-aware self-attention mechanism enables tight question–subgraph interaction inside one model. The approach is complemented by adaptation tuning on 20,000 subgraphs with synthesized questions and by parameter-efficient fine-tuning with adapters, yielding state-of-the-art results with far fewer updated parameters and less training data. Across WebQSP, CWQ, and MetaQA, ReasoningLM outperforms the baselines by a large margin, which supports the practicality of integrating graph-structure reasoning directly into PLMs for KGQA.

Abstract

Question Answering over Knowledge Graph (KGQA) aims to find answer entities for a natural language question from a large-scale Knowledge Graph (KG). To better perform reasoning on the KG, recent work typically adopts a pre-trained language model (PLM) to model the question and a graph neural network (GNN) based module to perform multi-hop reasoning on the KG. Despite their effectiveness, the PLM and GNN are not closely integrated because of the divergence in model architecture, which limits knowledge sharing and fine-grained feature interactions. To address this, we aim to simplify the above two-module approach and develop a more capable PLM that can directly support subgraph reasoning for KGQA, namely ReasoningLM. In our approach, we propose a subgraph-aware self-attention mechanism that imitates the GNN to perform structured reasoning, and we adopt an adaptation tuning strategy that adapts the model parameters using 20,000 subgraphs with synthesized questions. After adaptation, the PLM can be fine-tuned on downstream tasks in a parameter-efficient manner. Experiments show that ReasoningLM surpasses state-of-the-art models by a large margin, even with fewer updated parameters and less training data. Our code and data are publicly available at https://github.com/RUCAIBox/ReasoningLM.
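
The BFS-based serialization and the constrained self-attention named above can be illustrated with a small sketch. This page does not give the exact ordering or masking rules, so everything below is an assumption made for illustration: the function names `bfs_serialize` and `build_attention_mask` are hypothetical, each entity and relation is treated as one token, and the visibility rules (question tokens see everything, triple members see each other, repeated mentions of an entity see each other) are one plausible reading rather than the authors' implementation.

```python
from collections import deque


def bfs_serialize(triples, topic_entity):
    """Order (head, relation, tail) triples by BFS from the topic entity.

    Assumption: edges are followed head -> tail only, and triples that are
    unreachable from the topic entity are dropped.
    """
    adjacency = {}
    for head, relation, tail in triples:
        adjacency.setdefault(head, []).append((relation, tail))

    visited = {topic_entity}
    queue = deque([topic_entity])
    ordered = []
    while queue:
        head = queue.popleft()
        for relation, tail in adjacency.get(head, []):
            ordered.append((head, relation, tail))
            if tail not in visited:
                visited.add(tail)
                queue.append(tail)
    return ordered


def build_attention_mask(question_tokens, ordered_triples):
    """Return (tokens, mask) where mask[i][j] is True if i may attend to j."""
    tokens = list(question_tokens)
    n_question = len(tokens)

    # Record where each serialized triple starts in the token sequence.
    starts = []
    for head, relation, tail in ordered_triples:
        starts.append(len(tokens))
        tokens.extend([head, relation, tail])

    n = len(tokens)
    mask = [[False] * n for _ in range(n)]

    # Assumed rule 1: question tokens and subgraph tokens see each other.
    for i in range(n):
        for j in range(n):
            if i < n_question or j < n_question:
                mask[i][j] = True

    # Assumed rule 2: the three tokens of one triple see each other.
    for start in starts:
        for i in range(start, start + 3):
            for j in range(start, start + 3):
                mask[i][j] = True

    # Assumed rule 3: repeated mentions of the same entity see each other,
    # which lets information flow between triples that share an entity.
    positions_by_entity = {}
    for start in starts:
        for pos in (start, start + 2):
            positions_by_entity.setdefault(tokens[pos], []).append(pos)
    for positions in positions_by_entity.values():
        for i in positions:
            for j in positions:
                mask[i][j] = True

    return tokens, mask


# Toy subgraph with made-up entity and relation names.
example_triples = [("A", "r1", "B"), ("B", "r2", "C"), ("A", "r3", "D")]
ordered = bfs_serialize(example_triples, "A")
tokens, mask = build_attention_mask(["q1", "q2"], ordered)
```

In this toy example the two relations `r1` and `r3` sit in different triples and cannot attend to each other directly, while the two mentions of `A` can. Stacking several attention layers under such a mask passes information along shared entities hop by hop, which is the sense in which a masked Transformer can mimic GNN-style propagation.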

Paper Structure (23 sections, 6 equations, 2 figures, 7 tables)

Introduction
Related Work
Preliminary
Approach
Overview
Adapting PLM for Subgraph Reasoning
BFS-based Subgraph Serialization
Subgraph-aware Self-Attention
Adaptation Tuning
Tuning Data Construction
Answer Entity Prediction
Efficient Fine-tuning
Experiments
Experimental Setup
Implementation Details
...and 8 more sections

Figures (2)

Figure 1: Illustration of answer entity reasoning over a subgraph according to the question, using ReasoningLM with our proposed subgraph serialization and subgraph-aware self-attention.
Figure 2: Left: Hits@1 scores of ReasoningLM on WebQSP and CWQ after adaptation tuning with varying numbers of samples. Right: Hits@1 scores of ReasoningLM compared with two strong baselines (NSM and UniKGQA) on CWQ when fine-tuning with varying numbers of samples.
