HetGPT: Harnessing the Power of Prompt Tuning in Pre-Trained Heterogeneous Graph Neural Networks

Yihong Ma, Ning Yan, Jiayu Li, Masood Mortazavi, Nitesh V. Chawla

TL;DR

HetGPT is a post-training prompting framework for heterogeneous graph neural networks (HGNNs) that addresses negative transfer and label scarcity in semi-supervised node classification. By combining a virtual class prompt and a heterogeneous feature prompt with multi-view neighborhood aggregation, HetGPT reframes downstream tasks to align with contrastive pre-training objectives while keeping the HGNN backbone frozen. Empirical results on ACM, DBLP, and IMDB show consistent improvements over state-of-the-art HGNNs and pre-training baselines, with notable gains in few-shot settings and faster convergence than fine-tuning. The approach offers practical benefits for leveraging large pre-trained HGNNs in real-world web-scale graphs and opens avenues for extending prompting techniques to other graph tasks and to class-imbalance scenarios.

Abstract

Graphs have emerged as a natural choice to represent and analyze the intricate patterns and rich information of the Web, enabling applications such as online page classification and social recommendation. The prevailing "pre-train, fine-tune" paradigm has been widely adopted in graph machine learning tasks, particularly in scenarios with limited labeled nodes. However, this approach often exhibits a misalignment between the training objectives of pretext tasks and those of downstream tasks. This gap can result in the "negative transfer" problem, wherein the knowledge gained from pre-training adversely affects performance in the downstream tasks. The surge in prompt-based learning within Natural Language Processing (NLP) suggests the potential of adapting a "pre-train, prompt" paradigm to graphs as an alternative. However, existing graph prompting techniques are tailored to homogeneous graphs, neglecting the inherent heterogeneity of Web graphs. To bridge this gap, we propose HetGPT, a general post-training prompting framework to improve the predictive performance of pre-trained heterogeneous graph neural networks (HGNNs). The key is the design of a novel prompting function that integrates a virtual class prompt and a heterogeneous feature prompt, with the aim to reformulate downstream tasks to mirror pretext tasks. Moreover, HetGPT introduces a multi-view neighborhood aggregation mechanism, capturing the complex neighborhood structure in heterogeneous graphs. Extensive experiments on three benchmark datasets demonstrate HetGPT's capability to enhance the performance of state-of-the-art HGNNs on semi-supervised node classification.
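The reformulation described in the abstract, in which node classification is cast as the same kind of similarity comparison used in contrastive pre-training, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`class_prompt_loss`, `predict`), the temperature `tau`, the use of cosine similarity, and the toy data are all assumptions made for illustration. Each node token is pulled toward the class token of its own label (the positive) and pushed away from the remaining class tokens (the negatives).

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def class_prompt_loss(node_tokens, class_tokens, labels, tau=0.5):
    """InfoNCE-style loss between node tokens and class tokens.

    node_tokens:  (N, d) array, one token per labeled node (hypothetical input).
    class_tokens: (C, d) array, one learnable token per class (hypothetical input).
    labels:       (N,) integer class ids in [0, C).
    tau:          temperature; the value 0.5 is an arbitrary assumption.
    """
    z = l2_normalize(node_tokens)
    q = l2_normalize(class_tokens)
    logits = z @ q.T / tau                               # (N, C) scaled similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The positive for node i is the class token of labels[i];
    # all other class tokens act as negatives through the denominator.
    return -log_prob[np.arange(len(labels)), labels].mean()


def predict(node_tokens, class_tokens):
    """Assign each node to the class whose token is most similar to its node token."""
    sims = l2_normalize(node_tokens) @ l2_normalize(class_tokens).T
    return sims.argmax(axis=1)


# Toy data: three classes, four labeled nodes whose tokens lie near their class token.
rng = np.random.default_rng(0)
class_tokens = rng.normal(size=(3, 8))
labels = np.array([0, 2, 1, 0])
node_tokens = class_tokens[labels] + 0.01 * rng.normal(size=(4, 8))

loss_true = class_prompt_loss(node_tokens, class_tokens, labels)
loss_wrong = class_prompt_loss(node_tokens, class_tokens, (labels + 1) % 3)
predictions = predict(node_tokens, class_tokens)
```

In an actual prompt-tuning run only the prompt parameters (here, `class_tokens`) would be updated by gradient descent while the pre-trained encoder that produces `node_tokens` stays frozen; the sketch only evaluates the loss and the nearest-class-token prediction rule.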

Paper Structure

This paper contains 30 sections, 21 equations, 5 figures, and 2 tables.

Figures (5)

  • Figure 1: Overview of the HetGPT architecture. First, an HGNN is pre-trained alongside a contrastive head using a contrastive learning objective, after which their parameters are frozen. A heterogeneous feature prompt is then injected into the input graph's feature space, and the prompted node features are processed by the pre-trained HGNN to produce the prompted node embeddings. Next, a multi-view neighborhood aggregation mechanism captures both local and global heterogeneous neighborhood information of the target node, generating a node token. Finally, pairwise similarity comparisons are performed between this node token and the class tokens derived from the virtual class prompt, using the same contrastive learning objective as in pre-training. As an illustrative example of node classification with HetGPT, consider a target node $P_2$ associated with class $1$: its positive samples during prompt tuning are constructed using the class token of class $1$, while its negative samples are drawn from the class tokens of classes $2$ and $3$ (i.e., all remaining classes).
  • Figure 2: Ablation study of HetGPT on ACM and IMDB.
  • Figure 3: Performance of HetGPT with different numbers of basis feature vectors on ACM, DBLP, and IMDB.
  • Figure 4: Comparison of training losses over epochs between HetGPT and its fine-tuning counterpart on DBLP and IMDB.
  • Figure 5: Visualization of the learned node tokens and class tokens in virtual class prompt on ACM and DBLP.
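
The feature-prompt step in Figure 1, together with the "basis feature vectors" varied in Figure 3, can be sketched as follows. This is a hedged illustration rather than the paper's exact formulation: the captions state only that a heterogeneous feature prompt is injected into the input feature space and that it involves a number of basis feature vectors. The softmax attention weighting, the function name `apply_feature_prompt`, the node-type names, and the dimensions below are assumptions. Each node type gets its own small set of learnable basis vectors, and each node's prompted feature is its original feature plus a weighted combination of the basis vectors for its type.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def apply_feature_prompt(features, basis):
    """Add a type-specific prompt to every node feature.

    features: dict mapping node type -> (N_t, d_t) feature matrix.
    basis:    dict mapping node type -> (K, d_t) learnable basis vectors.
    Returns a new dict of prompted features; the inputs are left unchanged.
    """
    prompted = {}
    for ntype, x in features.items():
        b = basis[ntype]
        # Assumed weighting: per-node attention over the K basis vectors.
        weights = softmax(x @ b.T, axis=1)   # (N_t, K), each row sums to 1
        prompted[ntype] = x + weights @ b    # same shape as x
    return prompted


# Hypothetical two-type graph with different feature dimensions per type.
rng = np.random.default_rng(0)
features = {
    "paper": rng.normal(size=(5, 4)),
    "author": rng.normal(size=(3, 6)),
}
num_basis = 2  # the quantity varied in Figure 3; the value here is arbitrary
basis = {t: 0.1 * rng.normal(size=(num_basis, x.shape[1])) for t, x in features.items()}

prompted = apply_feature_prompt(features, basis)
```

The prompted features would then be passed through the frozen pre-trained HGNN. Because the prompt is purely additive, the encoder's input dimensions are unchanged and only the basis vectors need to be trained.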

Theorems & Definitions (5)

  • Definition 1: Heterogeneous graph
  • Definition 2: Network schema
  • Definition 3: Metapath
  • Definition 4: Semi-supervised node classification
  • Definition 5: Pre-train, fine-tune