OpenGraph: Towards Open Graph Foundation Models
Lianghao Xia, Ben Kao, Chao Huang
TL;DR
This work tackles zero-shot graph learning by introducing OpenGraph, a framework that combines a unified graph tokenizer, a scalable graph transformer, and LLM-based data augmentation to generalize across unseen graphs. The tokenizer uses smoothed high-order adjacency and topology-aware projection to convert arbitrary graphs into fixed-size token sequences, while the transformer employs token sampling and anchor-based self-attention for scalability. LLM-driven node and edge generation, together with graph topology injection, produces diverse synthetic graphs on which the model is pre-trained to improve cross-domain generalization. Evaluations on eight real-world datasets demonstrate strong zero-shot performance. Analyses highlight the importance of the tokenizer design, pre-training data, and sampling strategies, and the work acknowledges limitations around heterogeneity and explainability.
Abstract
Graph learning has become essential in various domains, including recommendation systems and social network analysis. Graph Neural Networks (GNNs) have emerged as promising techniques for encoding structural information and improving performance in tasks like link prediction and node classification. However, a key challenge remains: the difficulty of generalizing to unseen graph data with different properties. In this work, we propose a novel graph foundation model, called OpenGraph, to address this challenge. Our approach tackles several technical obstacles. Firstly, we enhance data augmentation using a large language model (LLM) to overcome data scarcity in real-world scenarios. Secondly, we introduce a unified graph tokenizer that enables the model to generalize effectively to diverse graph data, even when it encounters graph properties not seen during training. Thirdly, our scalable graph transformer captures node-wise dependencies within the global topological context. Extensive experiments validate the effectiveness of our framework. Because OpenGraph adapts to new graph characteristics and comprehends diverse graphs, it achieves remarkable zero-shot graph learning performance across various settings. We release the model implementation at https://github.com/HKUDS/OpenGraph.
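To make the tokenizer idea concrete, the following is a minimal sketch of how a smoothed high-order adjacency matrix might be projected into fixed-size node tokens. It is not the released implementation: the function names, the `order` and `dim` parameters, the symmetric normalization with self-loops, and the use of a truncated SVD as the projection step are all assumptions made for illustration.

```python
import numpy as np

def smoothed_high_order_adjacency(adj, order=2):
    """Sum the first `order` powers of the normalized adjacency (assumed smoothing)."""
    n = adj.shape[0]
    a = adj + np.eye(n)                      # self-loops keep every degree positive
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    norm = a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    smoothed = np.zeros_like(norm)
    power = np.eye(n)
    for _ in range(order):
        power = power @ norm                 # next power of the normalized adjacency
        smoothed += power
    return smoothed

def topology_aware_tokens(adj, dim=4, order=2):
    """Project each node to a `dim`-sized token via truncated SVD (assumed projection)."""
    smoothed = smoothed_high_order_adjacency(adj, order)
    u, s, _ = np.linalg.svd(smoothed, full_matrices=False)
    k = min(dim, len(s))
    tokens = np.zeros((adj.shape[0], dim))   # zero-pad when the graph has fewer than `dim` nodes
    tokens[:, :k] = u[:, :k] * np.sqrt(s[:k])
    return tokens

# Two toy graphs with different node counts (hypothetical data).
path5 = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)   # path on 5 nodes
triangle = np.ones((3, 3)) - np.eye(3)                      # triangle on 3 nodes

tokens_path = topology_aware_tokens(path5)
tokens_tri = topology_aware_tokens(triangle)
print(tokens_path.shape, tokens_tri.shape)  # (5, 4) (3, 4)
```

Both graphs yield tokens of the same width regardless of node count, which is the property that lets a single transformer consume graphs it was not trained on.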
