Towards Foundation Models for Knowledge Graph Reasoning
Mikhail Galkin, Xinyu Yuan, Hesham Mostafa, Jian Tang, Zhaocheng Zhu
TL;DR
This work tackles the challenge of transferring knowledge graph reasoning across graphs with arbitrary entity and relation vocabularies by proposing ULTRA, a foundation-model-like approach. ULTRA constructs a graph of relations and learns relation representations conditioned on their interaction patterns, enabling zero-shot generalization to unseen graphs and fine-tuning for downstream tasks. Empirical results on 57 diverse KGs show that a single pre-trained ULTRA model often matches or exceeds state-of-the-art supervised baselines in zero-shot inference, with fine-tuning delivering further gains (an average improvement of around 10% in MRR). The approach demonstrates the potential of transferable, feature-free graph representations for KG reasoning and paves the way for more scalable, cross-graph knowledge integration in domains ranging from biology to culture.
Abstract
Foundation models in language and vision can run inference on any textual and visual inputs thanks to transferable representations such as a vocabulary of tokens in language. Knowledge graphs (KGs) have different entity and relation vocabularies that generally do not overlap. The key challenge in designing foundation models on KGs is to learn transferable representations that enable inference on any graph with arbitrary entity and relation vocabularies. In this work, we make a step towards such foundation models and present ULTRA, an approach for learning universal and transferable graph representations. ULTRA builds relational representations as a function conditioned on their interactions. Such a conditioning strategy allows a pre-trained ULTRA model to inductively generalize to any unseen KG with any relation vocabulary and to be fine-tuned on any graph. Conducting link prediction experiments on 57 different KGs, we find that the zero-shot inductive inference performance of a single pre-trained ULTRA model on unseen graphs of various sizes is often on par with or better than strong baselines trained on specific graphs. Fine-tuning further boosts the performance.
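To make the idea of relation interactions concrete, the following is a minimal sketch of how a graph of relations could be derived from raw triples. It assumes that an "interaction" means two relations sharing an entity in one of four ways (head-to-head, tail-to-tail, head-to-tail, tail-to-head); this reading, the function name `build_relation_graph`, and the toy triples are illustrative assumptions and are not specified in the text above. The learned, conditional part of the model is not shown.

```python
from collections import defaultdict
from itertools import product


def build_relation_graph(triples):
    """Return a set of (rel_a, interaction, rel_b) edges between relations.

    Hypothetical helper: the four interaction labels are an assumption
    about what "interactions" means, not taken from the abstract.
    """
    heads = defaultdict(set)  # entity -> relations in which it is the head
    tails = defaultdict(set)  # entity -> relations in which it is the tail
    for h, r, t in triples:
        heads[h].add(r)
        tails[t].add(r)

    edges = set()
    for entity in set(heads) | set(tails):
        hs = heads.get(entity, set())
        ts = tails.get(entity, set())
        # Two distinct relations that share a head entity.
        for a, b in product(hs, hs):
            if a != b:
                edges.add((a, "h2h", b))
        # Two distinct relations that share a tail entity.
        for a, b in product(ts, ts):
            if a != b:
                edges.add((a, "t2t", b))
        # The head of relation a is the tail of relation b, and the reverse view.
        for a, b in product(hs, ts):
            edges.add((a, "h2t", b))
            edges.add((b, "t2h", a))
    return edges


# Toy graph; entity and relation names are made up for illustration.
triples = [
    ("alice", "works_at", "acme"),
    ("bob", "works_at", "acme"),
    ("acme", "located_in", "berlin"),
]
relation_edges = build_relation_graph(triples)
```

The resulting edges depend only on how relations co-occur around entities, not on the relation names themselves, which is the property that lets the same structure be computed for a graph with an entirely different vocabulary.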
