GFT: Graph Foundation Model with Transferable Tree Vocabulary

Zehong Wang, Zheyuan Zhang, Nitesh V Chawla, Chuxu Zhang, Yanfang Ye

TL;DR

This paper proposes GFT (Graph Foundation model with transferable Tree vocabulary), a cross-task, cross-domain graph foundation model that improves model generalization and reduces the risk of negative transfer.

Abstract

Foundation models have succeeded in applications such as ChatGPT, and graph data is ubiquitous, so Graph Foundation Models (GFMs) could have far-reaching impact in areas such as scientific research, social network analysis, drug discovery, and e-commerce. Despite significant progress in pre-trained graph neural networks, no GFM yet achieves the desired performance across various graph-learning tasks. Building a GFM may rely on a vocabulary that encodes transferable patterns shared among different tasks and domains. Unlike for images and text, how to define such transferable patterns for graphs remains an open question. In this paper, we aim to bridge this gap by rethinking the transferable patterns on graphs as computation trees, i.e., tree structures derived from the message-passing process. Based on this insight, we propose a cross-task, cross-domain graph foundation model named GFT, short for Graph Foundation model with transferable Tree vocabulary. By treating computation trees as tokens within the transferable vocabulary, GFT improves model generalization and reduces the risk of negative transfer. Theoretical analyses and extensive experiments demonstrate the transferability of computation trees and show the effectiveness of GFT across diverse tasks and domains in graph learning. The open-source code and data are available at https://github.com/Zehong-Wang/GFT.
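
To make the notion of a computation tree concrete, the following minimal sketch unrolls message passing around a root node into a tree. It is an illustration under stated assumptions, not the authors' implementation: the adjacency-dictionary representation, the function names `computation_tree` and `tree_size`, and the toy graph are all hypothetical.

```python
from typing import Dict, List, Tuple

# A tree is (node_id, [child_subtrees]). This representation is an assumption
# chosen for readability; it is not taken from the GFT code base.
Tree = Tuple[int, list]


def computation_tree(adj: Dict[int, List[int]], root: int, num_layers: int) -> Tree:
    """Unroll `num_layers` rounds of message passing around `root` into a tree.

    Each layer expands every node into its graph neighbors, so a node can
    appear several times in the tree (including a node's own parent).
    """
    if num_layers == 0:
        return (root, [])
    children = [computation_tree(adj, nbr, num_layers - 1) for nbr in adj[root]]
    return (root, children)


def tree_size(tree: Tree) -> int:
    """Count the nodes of the unrolled tree (repeated graph nodes count again)."""
    _, children = tree
    return 1 + sum(tree_size(child) for child in children)


# Toy graph: triangle 0-1-2 with a pendant node 3 attached to node 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
tree = computation_tree(adj, root=0, num_layers=2)
print(tree)
print(tree_size(tree))
```

In this sketch the tree rooted at node 0 has children 1 and 2, and each child is expanded once more into its own neighbors, which mirrors the receptive field a 2-layer message-passing network would see at node 0.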

Paper Structure

This paper contains 68 sections, 4 theorems, 30 equations, 12 figures, 24 tables.

Key Result

Theorem 2.2

Given two $L$-layer computation trees ${\mathcal{T}}_{v_1}, {\mathcal{T}}_{v_2}$ derived from the graph ${\mathcal{G}}$ and a GNN encoder $\phi$, the Euclidean distance between the tree embeddings $\Delta \triangleq \| \phi({\mathcal{T}}_{v_1}) - \phi({\mathcal{T}}_{v_2}) \|_2$ is bounded as follows where $\Delta_{v_1, v_2, j}^{L-1}$ represents the distance between the $(L-1)$-layer subtrees of th
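
As a rough illustration of the quantity $\Delta$ in the theorem, the sketch below computes the Euclidean distance between two tree embeddings under a toy encoder. Everything here is an assumption made for illustration: the encoder (root feature plus the mean of child embeddings, with no learned weights) stands in for $\phi$, and the names `encode` and `tree_distance`, the graph, and the features are hypothetical. It does not reproduce the bound stated in the theorem.

```python
import math
from typing import Dict, List


def encode(adj: Dict[int, List[int]], feats: Dict[int, List[float]],
           root: int, num_layers: int) -> List[float]:
    """Toy stand-in for phi: root feature + mean of the children's embeddings.

    The recursion follows the computation tree rooted at `root`; the children
    are embedded with (num_layers - 1) layers.
    """
    h = list(feats[root])
    if num_layers == 0 or not adj[root]:
        return h
    children = [encode(adj, feats, nbr, num_layers - 1) for nbr in adj[root]]
    dim = len(h)
    mean = [sum(c[d] for c in children) / len(children) for d in range(dim)]
    return [h[d] + mean[d] for d in range(dim)]


def tree_distance(adj: Dict[int, List[int]], feats: Dict[int, List[float]],
                  v1: int, v2: int, num_layers: int) -> float:
    """Delta = || phi(T_v1) - phi(T_v2) ||_2 for the toy encoder above."""
    e1 = encode(adj, feats, v1, num_layers)
    e2 = encode(adj, feats, v2, num_layers)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(e1, e2)))


# Toy graph: triangle 0-1-2 with a pendant node 3 attached to node 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
feats = {0: [1.0, 0.0], 1: [1.0, 0.0], 2: [0.0, 1.0], 3: [0.0, 1.0]}

same = tree_distance(adj, feats, 0, 1, num_layers=2)   # symmetric nodes
diff = tree_distance(adj, feats, 0, 3, num_layers=1)   # dissimilar trees
print(same, diff)
```

Nodes 0 and 1 play symmetric roles in this graph and carry the same feature, so their computation trees match and the toy distance is zero, while nodes 0 and 3 have different trees and a positive distance.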

Figures (12)

  • Figure 1: Graph tasks (top) and the corresponding computation trees (bottom). A virtual node can be added at the top to connect all task-relevant nodes, unifying different tasks as the tree-level task.
  • Figure 2: Synthetic graphs composed of two basic blocks. More blocks can scale up the graph sizes.
  • Figure 3: Transfer performance on synthetic graphs with $\mathcal{G}_1$ as the target graph. Higher tree similarity correlates with enhanced transferability.
  • Figure 4: During pre-training, GFT encodes general knowledge from a graph database into a tree vocabulary through tree reconstruction. In fine-tuning, the learned tree vocabulary is applied to unify graph-related tasks as tree classification, adapting the general knowledge to specific tasks.
  • Figure 5: Negative transfer gap on Cora in node classification.
  • ...and 7 more figures
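
The virtual-node idea in the Figure 1 caption can be sketched as follows. This is a minimal sketch under assumptions, not the paper's procedure: the virtual node id `-1`, the one-way edges from the virtual node to the task-relevant nodes, and the helper names `add_virtual_root` and `computation_tree` are hypothetical choices.

```python
from typing import Dict, List, Sequence, Tuple

Tree = Tuple[int, list]


def computation_tree(adj: Dict[int, List[int]], root: int, num_layers: int) -> Tree:
    """Unroll `num_layers` rounds of neighbor expansion around `root`."""
    if num_layers == 0:
        return (root, [])
    return (root, [computation_tree(adj, n, num_layers - 1) for n in adj[root]])


def add_virtual_root(adj: Dict[int, List[int]], task_nodes: Sequence[int],
                     virtual_id: int = -1) -> Dict[int, List[int]]:
    """Return a copy of `adj` with a virtual node pointing at the task nodes.

    Assumption: edges go only from the virtual node to the task nodes, so the
    virtual node appears once, at the top of the resulting tree.
    """
    new_adj = {node: list(nbrs) for node, nbrs in adj.items()}
    new_adj[virtual_id] = list(task_nodes)
    return new_adj


# Toy graph: triangle 0-1-2 with a pendant node 3 attached to node 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}

# Node-level task: one task node. Link-level task: the two endpoints.
# Graph-level task: every node in the graph.
node_task = add_virtual_root(adj, [0])
link_task = add_virtual_root(adj, [0, 3])
graph_task = add_virtual_root(adj, sorted(adj))

link_tree = computation_tree(link_task, root=-1, num_layers=2)
print(link_tree)
```

With this construction the three task types differ only in which nodes hang below the virtual root, so each one reduces to a prediction on a single tree.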

Theorems & Definitions (11)

  • Definition 2.1: Computation Trees [chuang2022tree]
  • Theorem 2.2: Transferability of Computation Tree
  • proof
  • Remark 2.3
  • Theorem 3.1
  • Remark 3.2
  • Theorem D.1: Transferability of Computation Tree
  • proof
  • Theorem D.2
  • proof
  • ...and 1 more