GraphCLIP: Enhancing Transferability in Graph Foundation Models for Text-Attributed Graphs

Yun Zhu, Haizhou Shi, Xiaotang Wang, Yongchao Liu, Yaoke Wang, Boci Peng, Chuntao Hong, Siliang Tang

TL;DR

Text-Attributed Graphs (TAGs) face labeling bottlenecks and weak cross-domain transfer. GraphCLIP proposes a self-supervised graph-summary pretraining framework that pairs LLM-generated graph summaries with graph encodings, augmented by invariant learning to boost cross-domain generalization. It also introduces graph-prompt tuning for efficient few-shot adaptation. Across diverse TAG tasks, GraphCLIP demonstrates strong zero-shot and few-shot performance and broad applicability to downstream tasks.

Abstract

Recently, research on Text-Attributed Graphs (TAGs) has gained significant attention due to the prevalence of free-text node features in real-world applications and the advancements in Large Language Models (LLMs) that bolster TAG methodologies. However, current TAG approaches face two primary challenges: (i) Heavy reliance on label information and (ii) Limited cross-domain zero/few-shot transferability. These issues constrain the scaling of both data and model size, owing to high labor costs and scaling laws, complicating the development of graph foundation models with strong transferability. In this work, we propose the GraphCLIP framework to address these challenges by learning graph foundation models with strong cross-domain zero/few-shot transferability through a self-supervised contrastive graph-summary pretraining method. Specifically, we generate and curate large-scale graph-summary pair data with the assistance of LLMs, and introduce a novel graph-summary pretraining method, combined with invariant learning, to enhance graph foundation models with strong cross-domain zero-shot transferability. For few-shot learning, we propose a novel graph prompt tuning technique aligned with our pretraining objective to mitigate catastrophic forgetting and minimize learning costs. Extensive experiments show the superiority of GraphCLIP in both zero-shot and few-shot settings, while evaluations across various downstream tasks confirm the versatility of GraphCLIP. Our code is available at: https://github.com/ZhuYun97/GraphCLIP
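The contrastive graph-summary objective described above can be illustrated with a minimal sketch. It assumes a CLIP-style symmetric InfoNCE loss over a batch of paired graph and summary embeddings; the abstract does not give the exact formulation, so the function names, the temperature value, and the label-text matching used for zero-shot prediction are illustrative assumptions rather than the authors' implementation. Random vectors stand in for the outputs of a graph encoder and a text encoder.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit Euclidean length."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def cross_entropy_diag(logits: np.ndarray) -> float:
    """Mean cross-entropy where the correct class for row i is column i."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))


def graph_summary_contrastive_loss(graph_emb, summary_emb, temperature=0.1):
    """Symmetric InfoNCE loss (assumed form): pair i is the positive for
    row i, and every other item in the batch acts as a negative."""
    g = l2_normalize(graph_emb)
    s = l2_normalize(summary_emb)
    logits = g @ s.T / temperature  # cosine similarities scaled by temperature
    # Average the graph-to-summary and summary-to-graph directions.
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits.T))


def zero_shot_predict(graph_emb, label_text_emb):
    """Assumed CLIP-style zero-shot rule: choose the label whose text
    embedding has the highest cosine similarity with the graph embedding."""
    sims = l2_normalize(graph_emb) @ l2_normalize(label_text_emb).T
    return sims.argmax(axis=1)


# Toy stand-ins for encoder outputs (hypothetical sizes: batch 8, dim 16).
rng = np.random.default_rng(0)
graph_emb = rng.normal(size=(8, 16))
aligned_summaries = graph_emb + 0.01 * rng.normal(size=(8, 16))
unrelated_summaries = rng.normal(size=(8, 16))

loss_aligned = graph_summary_contrastive_loss(graph_emb, aligned_summaries)
loss_unrelated = graph_summary_contrastive_loss(graph_emb, unrelated_summaries)
predictions = zero_shot_predict(graph_emb, aligned_summaries)
```

In this toy setting, summaries that closely match their graphs give a lower loss than unrelated ones, which is the behavior a contrastive pretraining objective rewards.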

Paper Structure

This paper contains 48 sections, 2 theorems, 19 equations, 4 figures, 8 tables.

Key Result

Proposition B.1

In a binary classification setting, let $(Z_1, Z_2)$ be normally distributed as $\mathcal{N}(0, I_2)$. Assign $Y=1$ if $Z_1 \geq 0$. For data augmentation, $Z_2$ is scaled by a normal distribution: This creates a set of transformation-induced domains $\mathcal{B}=\{\mathcal{G}_m: \mathcal{G}_m=(Z_1, m \cdot Z_2) \, | \, m \in \mathbb{R}\}$. For any $\zeta \ge 0$, there is a representation $g$ and
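The data-generating setup of Proposition B.1 can be simulated in a few lines. The sketch covers only what is stated in the excerpt: $(Z_1, Z_2) \sim \mathcal{N}(0, I_2)$, the label rule $Y=1$ if $Z_1 \geq 0$, and the domains $\mathcal{G}_m=(Z_1, m \cdot Z_2)$. The proposition's conclusion is truncated in this excerpt and is not reproduced. The sample size, the chosen values of $m$, and the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000  # hypothetical sample size

# (Z1, Z2) ~ N(0, I_2); the label depends only on Z1.
z = rng.standard_normal(size=(n, 2))
y = (z[:, 0] >= 0).astype(int)


def make_domain(z: np.ndarray, m: float) -> np.ndarray:
    """Return samples from the domain G_m = (Z1, m * Z2)."""
    return np.column_stack([z[:, 0], m * z[:, 1]])


scales = [0.5, 1.0, 4.0]  # hypothetical values of m
domains = {m: make_domain(z, m) for m in scales}


def predict_from_z1(x: np.ndarray) -> np.ndarray:
    """Classifier that reads only the first coordinate, which no domain alters."""
    return (x[:, 0] >= 0).astype(int)


# Accuracy of the Z1-only classifier and spread of the scaled coordinate, per domain.
accuracy = {m: float(np.mean(predict_from_z1(x) == y)) for m, x in domains.items()}
z2_std = {m: float(x[:, 1].std()) for m, x in domains.items()}
```

Because the label is a function of $Z_1$ alone, a predictor built on $Z_1$ behaves identically in every domain, while the spread of the second coordinate grows with $|m|$. This is the kind of domain-dependent nuisance feature that invariant learning aims to ignore.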

Figures (4)

  • Figure 1: Three main categories of TAG methods.
  • Figure 2: Our proposed GraphCLIP Framework: (a) represents the self-supervised pretraining method we designed, (b) denotes zero-shot learning of GraphCLIP, and (c) refers to our graph prompt tuning method on target data.
  • Figure 3: Node classification of different graph prompt tuning techniques under few-shot setting.
  • Figure 4: Analyzing the impact of source data on the performance of target datasets.

Theorems & Definitions (5)

  • Definition 1: Contrastive Loss [cpc]
  • Definition 2: Invariant Learning [irm]
  • Definition 3: Invariant Alignment Loss [mario]
  • Proposition B.1
  • Theorem C.1: Upper Bound on Variation Across Different Domains [arcl]