UniGraph: Learning a Unified Cross-Domain Foundation Model for Text-Attributed Graphs

Yufei He, Yuan Sui, Xiaoxin He, Bryan Hooi

TL;DR

This work tackles cross-domain graph learning by treating text as a unifying channel through Text-Attributed Graphs (TAGs) and introducing UniGraph, a foundation model trained with a cascaded LM-GNN backbone and Masked Graph Modeling objectives. It advances a universal task-unification framework using Anchor Nodes and Personalized PageRank sampling, a Graph Siamese Masked Autoencoder for self-supervised pretraining, and graph instruction tuning to enable zero-shot predictions via natural language. Across 11 TAG datasets spanning 5 domains, UniGraph demonstrates strong self-supervised representation learning, few-shot in-context transfer, and zero-shot transfer, often matching or surpassing supervised targets and prior cross-domain methods. The combination of end-to-end cross-domain learning, instruction-tuned zero-shot capability, and scalable pretraining positions UniGraph as a practical and impactful foundation model for TAGs with broad applicability in real-world graph learning tasks.
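
To make the Anchor Node and Personalized PageRank (PPR) sampling idea concrete, here is a minimal sketch rather than the authors' implementation. It assumes that the subgraph around an anchor node is formed by keeping the top-k nodes ranked by PPR score. The function names, the restart probability `alpha`, the iteration count, and the top-k rule are all illustrative assumptions not stated in this summary.

```python
def personalized_pagerank(adj, anchor, alpha=0.15, iters=50):
    """Approximate PPR scores for `anchor` by power iteration.

    adj:   dict mapping node -> list of neighbours (undirected edges
           are listed in both directions).
    alpha: restart probability (hypothetical default).
    """
    scores = {node: 0.0 for node in adj}
    scores[anchor] = 1.0
    for _ in range(iters):
        nxt = {node: 0.0 for node in adj}
        for node, mass in scores.items():
            neighbours = adj[node]
            if not neighbours:
                # Dangling node: send its walking mass back to the anchor.
                nxt[anchor] += (1 - alpha) * mass
                continue
            share = (1 - alpha) * mass / len(neighbours)
            for nb in neighbours:
                nxt[nb] += share
        # Total mass is 1, so the restart step contributes exactly alpha.
        nxt[anchor] += alpha
        scores = nxt
    return scores


def sample_subgraph(adj, anchor, k=2, alpha=0.15):
    """Return the anchor plus its k highest-PPR nodes and the induced edges.

    Assumes node ids are mutually comparable (e.g. integers).
    """
    scores = personalized_pagerank(adj, anchor, alpha)
    ranked = sorted((n for n in adj if n != anchor),
                    key=lambda n: (-scores[n], n))
    nodes = [anchor] + ranked[:k]
    keep = set(nodes)
    edges = sorted((u, v) for u in keep for v in adj[u]
                   if v in keep and u < v)
    return nodes, edges


# Toy path graph 0-1-2-3-4 with node 0 as the anchor.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
nodes, edges = sample_subgraph(path, anchor=0, k=2)
```

Under this assumption, node-level, edge-level, and graph-level tasks could each be reduced to predictions on one or more anchor-centred subgraphs, which is the sense in which the sampling step unifies tasks.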

Abstract

Foundation models like ChatGPT and GPT-4 have revolutionized artificial intelligence, exhibiting remarkable abilities to generalize across a wide array of tasks and applications beyond their initial training objectives. However, graph learning has predominantly focused on single-graph models, tailored to specific tasks or datasets, lacking the ability to transfer learned knowledge to different domains. This limitation stems from the inherent complexity and diversity of graph structures, along with the different feature and label spaces specific to graph data. In this paper, we recognize text as an effective unifying medium and employ Text-Attributed Graphs (TAGs) to leverage this potential. We present our UniGraph framework, designed to learn a foundation model for TAGs, which is capable of generalizing to unseen graphs and tasks across diverse domains. Unlike single-graph models that use pre-computed node features of varying dimensions as input, our approach leverages textual features for unifying node representations, even for graphs such as molecular graphs that do not naturally have textual features. We propose a novel cascaded architecture of Language Models (LMs) and Graph Neural Networks (GNNs) as backbone networks. Additionally, we propose the first pre-training algorithm specifically designed for large-scale self-supervised learning on TAGs, based on Masked Graph Modeling. We introduce graph instruction tuning using Large Language Models (LLMs) to enable zero-shot prediction ability. Our comprehensive experiments across various graph learning tasks and domains demonstrate the model's effectiveness in self-supervised representation learning on unseen graphs, few-shot in-context transfer, and zero-shot transfer, even surpassing or matching the performance of GNNs that have undergone supervised training on target datasets.

Paper Structure

This paper contains 21 sections, 9 equations, 1 figure, and 12 tables.

Figures (1)

  • Figure 1: Overview of the UniGraph framework. 1) In pre-training, we employ a self-supervised approach, leveraging TAGs to unify diverse graph data. This phase involves a cascaded architecture combining LMs and GNNs. We propose Graph Siamese Masked Autoencoders as the training architecture, which learn to reconstruct the masked text of each node using the text of its neighbors. 2) In few-shot transfer, the pre-trained model can make predictions with minimal data by comparing the embeddings of the query and support graphs. 3) Zero-shot transfer is achieved through graph instruction tuning with LLMs, enabling the model to understand category labels in natural language and make predictions on unseen graphs without any graph-specific training.
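
The few-shot step in the caption (comparing query and support embeddings) can be illustrated with a small sketch. The caption does not say how the comparison is done, so the class-mean prototypes and cosine similarity used here are assumptions, and the embeddings are made-up toy vectors standing in for the output of a frozen pre-trained encoder.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def few_shot_predict(support, query):
    """Assign `query` to the class whose prototype is most similar.

    support: dict mapping label -> list of support embeddings.
    query:   embedding of the query graph (or anchor-node subgraph).
    """
    prototypes = {label: mean_vector(vecs) for label, vecs in support.items()}
    return max(prototypes, key=lambda label: cosine(prototypes[label], query))


# Toy 2-way, 2-shot episode with hypothetical 2-d embeddings.
support = {
    "class_a": [[1.0, 0.0], [0.9, 0.1]],
    "class_b": [[0.0, 1.0], [0.1, 0.9]],
}
prediction = few_shot_predict(support, [0.8, 0.2])
```

No parameters are updated in this sketch: the support examples only define the prototypes, which is one common reading of "in-context" few-shot transfer.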

Theorems & Definitions (1)

  • Definition 1