UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question Answering Over Knowledge Graph
Jinhao Jiang, Kun Zhou, Wayne Xin Zhao, Ji-Rong Wen
TL;DR
This paper tackles multi-hop KGQA by unifying retrieval and reasoning in a single architecture built from a PLM-based semantic matching module and a matching information propagation module. It introduces abstract subgraphs to bridge retrieval and reasoning, and trains in two stages: contrastive pre-training for question-relation matching, followed by retrieval and reasoning fine-tuning with parameter transfer. Experiments on MetaQA, WebQSP, and CWQ show gains over state-of-the-art baselines, especially on WebQSP and CWQ, and ablations confirm the value of pre-training and cross-stage initialization. The code is publicly available.
Abstract
Multi-hop Question Answering over Knowledge Graph (KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question on a large-scale Knowledge Graph (KG). To cope with the vast search space, existing work usually adopts a two-stage approach: it first retrieves a relatively small subgraph related to the question and then reasons over that subgraph to find the answer entities accurately. Although these two stages are highly related, previous work employs very different technical solutions for the retrieval and reasoning models, neglecting how closely related the two tasks are in essence. In this paper, we propose UniKGQA, a novel approach to the multi-hop KGQA task that unifies retrieval and reasoning in both model architecture and parameter learning. For model architecture, UniKGQA consists of a semantic matching module based on a pre-trained language model (PLM) for question-relation semantic matching, and a matching information propagation module that propagates the matching information along the directed edges of the KG. For parameter learning, we design a shared pre-training task based on question-relation matching for both the retrieval and reasoning models, and then propose retrieval- and reasoning-oriented fine-tuning strategies. Compared with previous studies, our approach is more unified, tightly relating the retrieval and reasoning stages. Extensive experiments on three benchmark datasets demonstrate the effectiveness of our method on the multi-hop KGQA task. Our code and data are publicly available at https://github.com/RUCAIBox/UniKGQA.
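To make the idea of propagating question-relation matching information along directed edges concrete, the following is a minimal toy sketch, not the UniKGQA implementation. It assumes the matching scores are already given as hand-set numbers (in the paper they come from a PLM-based module), that there is one score table per hop, and that entity scores are simply multiplied along edges and normalized. The function name `propagate_matching`, the example triples, and all score values are hypothetical.

```python
from collections import defaultdict


def propagate_matching(triples, topic_entities, hop_relation_scores):
    """Toy propagation of question-relation matching scores over a KG.

    triples: list of directed (head, relation, tail) edges.
    topic_entities: entities mentioned in the question (start with score 1.0).
    hop_relation_scores: one dict per hop mapping relation -> matching score.
        These hand-set numbers are an assumption standing in for the output
        of a learned semantic matching module.
    Returns a dict mapping each reachable entity to its normalized score.
    """
    scores = {entity: 1.0 for entity in topic_entities}
    for relation_scores in hop_relation_scores:
        next_scores = defaultdict(float)
        for head, relation, tail in triples:
            if head in scores:
                # Pass the head's score to the tail, weighted by how well
                # the relation matches the question at this hop.
                next_scores[tail] += scores[head] * relation_scores.get(relation, 0.0)
        total = sum(next_scores.values())
        # Normalize so the scores at each hop sum to 1 (empty if nothing matched).
        scores = {e: s / total for e, s in next_scores.items()} if total > 0 else {}
    return scores


# Hypothetical 2-hop question: "Where was the director of Inception born?"
triples = [
    ("Inception", "directed_by", "Christopher Nolan"),
    ("Inception", "starred", "Leonardo DiCaprio"),
    ("Christopher Nolan", "born_in", "London"),
    ("Leonardo DiCaprio", "born_in", "Los Angeles"),
]
hop_relation_scores = [
    {"directed_by": 0.9, "starred": 0.1},  # hop 1: question mentions "director"
    {"born_in": 0.9},                      # hop 2: question mentions "born"
]
entity_scores = propagate_matching(triples, ["Inception"], hop_relation_scores)
best_answer = max(entity_scores, key=entity_scores.get)
print(best_answer, entity_scores)
```

In this toy setting, the same routine could serve both stages the abstract describes: thresholding the scores would keep a small subgraph (retrieval), while taking the top-scoring entity would give an answer (reasoning). How UniKGQA actually shares the architecture and parameters across the two stages is specified in the paper, not by this sketch.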
