Beyond Completion: A Foundation Model for General Knowledge Graph Reasoning

Yin Hua, Zhiqiang Liu, Mingyang Chen, Zheng Fang, Chi Man Wong, Lingxiao Li, Chi Man Vong, Huajun Chen, Wen Zhang

TL;DR

MERRY is a unified foundation model for general KG reasoning that bridges textual and structural modalities through a multi-perspective Conditional Message Passing (CMP) encoding framework and dynamic fusion mechanisms. It combines a query-conditioned relation graph with a global textual encoding to produce task-adaptive representations, and adds a flexible edge scoring strategy to adapt to different downstream tasks. Evaluated on 28 datasets, the approach shows strong zero-shot generalization on inductive KGC and outperforms existing baselines on KGQA in most scenarios, indicating robust cross-task transfer. Because offline LLM text encoding is decoupled from online CMP graph updates, MERRY remains efficient and scalable while retaining its reasoning capability, though challenges remain in CMP depth, data completeness, and large-scale graphs.

Abstract

In natural language processing (NLP) and computer vision (CV), the successful application of foundation models across diverse tasks has demonstrated their remarkable potential. However, despite the rich structural and textual information embedded in knowledge graphs (KGs), existing research on foundation models for KGs has primarily focused on their structural aspects, with most efforts restricted to in-KG tasks (e.g., knowledge graph completion, KGC). This limitation has hindered progress in addressing more challenging out-of-KG tasks. In this paper, we introduce MERRY, a foundation model for general knowledge graph reasoning, and investigate its performance across two task categories: in-KG reasoning tasks (e.g., KGC) and out-of-KG tasks (e.g., KG question answering, KGQA). We utilize not only the structural information but also the textual information in KGs. Specifically, we propose a multi-perspective Conditional Message Passing (CMP) encoding architecture to bridge the gap between textual and structural modalities, enabling their seamless integration. Additionally, we introduce a dynamic residual fusion module to selectively retain relevant textual information and a flexible edge scoring mechanism to adapt to diverse downstream tasks. Comprehensive evaluations on 28 datasets demonstrate that MERRY outperforms existing baselines in most scenarios, showcasing strong reasoning capabilities within KGs and excellent generalization to out-of-KG tasks such as KGQA.

Paper Structure

This paper contains 44 sections, 15 equations, 3 figures, 6 tables.

Figures (3)

  • Figure 1: Overview of the MERRY Framework. (A) All tasks, including KGC and KGQA, are unified under a standardized query representation. (B) The data processing pipeline comprises two main components: (1) relation graph construction to model meta-relations, and (2) edge scoring to assign task-specific weights to edges. (C) The MERRY architecture processes these graphs through QCMP, GCMP, and a multi-perspective dynamic fusion module. In the decoder, the query node is represented as the $Query$ embedding, while candidate nodes serve as $Key$ embeddings, outputting a probability distribution over all candidates.
  • Figure 2: Ablation study results.
  • Figure 3: Performance of different GCMP layers in KGC and different numbers of shots in KGQA.
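
The Figure 1 caption describes a decoder in which the query node acts as a $Query$ embedding and candidate nodes act as $Key$ embeddings, producing a probability distribution over all candidates. The sketch below illustrates one common way such a decoder could be realized: a scaled dot product between the query and each key, followed by a softmax. The scoring function, the scaling factor, the function name `candidate_distribution`, and the toy vectors are all assumptions made for illustration; the summary above does not specify how MERRY computes these scores.

```python
import math


def candidate_distribution(query, keys):
    """Return a softmax distribution over candidates.

    Assumption: scores are scaled dot products between the query
    embedding and each candidate (key) embedding. This is a generic
    attention-style scorer, not necessarily the one used by MERRY.
    """
    if not keys:
        raise ValueError("need at least one candidate")
    dim = len(query)
    # One score per candidate: <query, key> / sqrt(dim).
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
        for key in keys
    ]
    # Subtract the maximum score for numerical stability before exponentiating.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: three hypothetical candidate embeddings in two dimensions.
query = [1.0, 0.0]
keys = [[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0]]
probs = candidate_distribution(query, keys)
```

In this toy setup the first candidate points in the same direction as the query, so it receives the largest probability, and the candidate pointing in the opposite direction receives the smallest.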