GNN2R: Weakly-Supervised Rationale-Providing Question Answering over Knowledge Graphs

Ruijie Wang, Luca Rossetto, Michael Cochez, Abraham Bernstein

TL;DR

KGQA often lacks explainability and efficiency. GNN2R combines a GNN-based coarse reasoning module with a language-model–driven explicit reasoning module to retrieve both final answers and verifiable reasoning subgraphs under weak supervision. Across multiple benchmarks, GNN2R consistently outperforms state-of-the-art baselines in accuracy and subgraph quality, with substantial gains from the Step-II refinement, while remaining efficient. This approach delivers an explainable, scalable QA pipeline for knowledge graphs, with public code and models available.

Abstract

Despite the rapid progress of large language models (LLMs), knowledge graph-based question answering (KGQA) remains essential for producing verifiable and hallucination-resistant answers in many real-world settings where answer trustworthiness and computational efficiency are highly valued. However, most existing KGQA methods provide only final answers in the form of KG entities. Without explicit explanations, ideally in the form of an intermediate reasoning process over relevant KG triples, the QA results are difficult to inspect and interpret. Moreover, this limitation prevents the rich and verifiable knowledge encoded in KGs, which is a key advantage of KGQA over LLMs, from being fully leveraged. Addressing this issue remains highly challenging due to the lack of annotated intermediate reasoning processes and the requirement of high efficiency in KGQA. In this paper, we propose a novel Graph Neural Network-based Two-Step Reasoning method (GNN2R) that can efficiently retrieve both final answers and corresponding reasoning subgraphs as verifiable rationales, using only weak supervision from widely available final-answer annotations. We extensively evaluated GNN2R and demonstrated that it substantially outperforms existing state-of-the-art KGQA methods in terms of effectiveness, efficiency, and the quality of generated explanations. The complete code and pre-trained models are available at https://github.com/ruijie-wang-uzh/GNN2R.

Paper Structure

This paper contains 16 sections, 9 equations, 8 figures, 9 tables, and 2 algorithms.

Figures (8)

  • Figure 1: A KGQA example, where question-relevant entities and relations are highlighted in red and blue. The reasoning subgraph is distinguished by a light-orange background, featuring two reasoning paths in orange and purple.
  • Figure 2: The GNN-based encoding of the given question and KG in Step-I.
  • Figure 3: An overview of Step-II, assuming Tim Burton is extracted as a candidate answer regarding the example question and KG in Figure 1.
  • Figure 4: The visualization of the embedding space after each of the three GNN layers for answering the question "what is the name of the place of birth of offspring of Henriette Adelaide of Savoy's husband?" The non-answer entities in the KG, the general question embedding, and the answer entity are represented as orange crosses, blue stars, and green triangles, respectively.
  • Figure 5: The performance of GNN2R (Hits@1 and F1, in percent) on PQL-2hop, PQL-3hop, and WQSP when 25%, 50%, 75%, and 95% of the training questions are directly removed or randomly shuffled.
  • ...and 3 more figures
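
The caption of Figure 5 reports Hits@1 and F1. As a minimal sketch, assuming the definitions commonly used in KGQA evaluation (the paper's exact evaluation script is not shown here), the two per-question metrics can be computed as follows. The function names `hits_at_1` and `f1_score` and the example entities are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Set


def hits_at_1(ranked_answers: Sequence[str], gold: Set[str]) -> float:
    """Return 1.0 if the top-ranked predicted entity is a gold answer, else 0.0."""
    if not ranked_answers:
        return 0.0
    return 1.0 if ranked_answers[0] in gold else 0.0


def f1_score(predicted: Set[str], gold: Set[str]) -> float:
    """Harmonic mean of precision and recall between predicted and gold answer sets."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical single-question example (entity names are illustrative only).
ranked = ["Tim Burton", "Johnny Depp"]
gold_answers = {"Tim Burton"}
print(hits_at_1(ranked, gold_answers))      # top-ranked entity is a gold answer
print(f1_score(set(ranked), gold_answers))  # precision 0.5, recall 1.0
```

Benchmark-level scores are typically obtained by averaging these per-question values over the test set and reporting them as percentages.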