PSP: Pre-Training and Structure Prompt Tuning for Graph Neural Networks

Qingqing Ge, Zeyuan Zhao, Yiding Liu, Anfeng Cheng, Xiang Li, Shuaiqiang Wang, Dawei Yin

TL;DR

PSP narrows the gap between graph pre-training and downstream prompting with two components: dual-view contrastive pre-training, which encodes node attributes and graph structure separately, and structure prompt tuning, which connects class prototypes to the graph through learnable edges. The pre-trained encoders stay frozen; only a small set of structure-based parameters is learned, so that prototypes can aggregate information from the large amount of unlabeled data. This improves prototype quality for few-shot node and graph classification on both homophilous and heterophilous graphs. Across 11 datasets, PSP outperforms the baselines, with notable gains in few-shot settings, and ablations confirm the contribution of both the structure prompt and the dual-view design. Code is available at the authors' repository.

Abstract

Graph Neural Networks (GNNs) are powerful at learning the semantics of graph data. Recently, a new "pre-train and prompt" paradigm has shown promising results in adapting GNNs to various tasks with less supervised data. The success of this paradigm can be attributed to the more consistent objectives of pre-training and task-oriented prompt tuning, which allow pre-trained knowledge to be transferred effectively to downstream tasks. Most existing methods are based on the class prototype vector framework. However, in few-shot scenarios, where little labeled data is available, class prototype vectors are difficult to construct or learn accurately. Meanwhile, the structure information of the graph is usually exploited during pre-training to learn node representations, but neglected in the prompt tuning stage, where it could help learn more accurate prototype vectors. In addition, existing methods generally ignore the impact of heterophilous neighborhoods on node representations and are therefore not suitable for heterophilous graphs. To bridge these gaps, we propose a novel pre-training and structure prompt tuning framework for GNNs, namely PSP, which consistently exploits structure information in both the pre-training and prompt tuning stages. In particular, PSP 1) employs dual-view contrastive learning to align the latent semantic spaces of node attributes and graph structure, and 2) incorporates structure information into the prompted graph to construct more accurate prototype vectors and elicit more pre-trained knowledge during prompt tuning. We conduct extensive experiments on node classification and graph classification tasks to evaluate the effectiveness of PSP. We show that PSP leads to superior performance in few-shot scenarios on both homophilous and heterophilous graphs. The implemented code is available at https://github.com/gqq1210/PSP.
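
To make the idea of aligning an attribute view with a structure view more concrete, the following is a minimal sketch of a contrastive objective between two sets of node embeddings. It is not the authors' implementation: the abstract does not specify the loss, so an InfoNCE-style loss with cosine similarity is assumed here, and the function names `cosine` and `dual_view_info_nce`, the `temperature` parameter, and the toy embeddings are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def dual_view_info_nce(attr_emb, struct_emb, temperature=0.5):
    """Mean InfoNCE-style loss (an assumed choice, not taken from the paper).

    attr_emb[i] and struct_emb[i] are two views of the same node i and form
    the positive pair; every other structure embedding acts as a negative.
    """
    n = len(attr_emb)
    total = 0.0
    for i in range(n):
        logits = [cosine(attr_emb[i], struct_emb[j]) / temperature
                  for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / n


# Toy example: two nodes with 2-dimensional embeddings in each view.
attr_view = [[1.0, 0.0], [0.0, 1.0]]
aligned_struct_view = [[1.0, 0.0], [0.0, 1.0]]
swapped_struct_view = [[0.0, 1.0], [1.0, 0.0]]

aligned_loss = dual_view_info_nce(attr_view, aligned_struct_view)
swapped_loss = dual_view_info_nce(attr_view, swapped_struct_view)
```

In this toy setup the loss is lower when each node's two views agree than when they are swapped between nodes, which is the behavior a view-alignment objective is meant to reward.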

Paper Structure

This paper contains 20 sections, 8 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: The construction of class prototype vectors. The colored areas contain the labeled nodes used for training. Circles represent nodes, and triangles represent class prototype vectors for the node classification task. Solid black lines denote the original edges in the graph, and gray dashed lines denote the new weighted edges. For each node, the red or green dashed line denotes its edge with the largest weight to a class prototype vector.
  • Figure 2: Overall framework of PSP. Top: pre-training. Middle: prompt tuning for node classification. Bottom: prompt tuning for graph classification.
  • Figure 3: The ablation study on training paradigm.
  • Figure 4: (a)(b): varying the number of shots for node classification. (c)(d): varying the number of shots for graph classification.
  • Figure 5: The weights of added edges between nodes and class prototype vectors before and after prompt tuning.
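
The captions of Figures 1 and 5 describe weighted edges between nodes and class prototype vectors. As a rough illustration only, the sketch below builds each prototype as a weighted average of node embeddings and assigns a node to the most similar prototype. Several details are assumptions not stated in this summary: the edge scores are fixed numbers here (in PSP they are learned during prompt tuning), softmax normalization of the scores and cosine similarity for prediction are illustrative choices, and the names `build_prototypes` and `predict` are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw edge scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def build_prototypes(node_emb, edge_scores):
    """edge_scores[c][i] is the score of the added edge between prototype c
    and node i (assumed fixed here; learnable in the described method)."""
    dim = len(node_emb[0])
    prototypes = []
    for scores in edge_scores:
        weights = softmax(scores)
        prototypes.append([
            sum(weights[i] * node_emb[i][d] for i in range(len(node_emb)))
            for d in range(dim)
        ])
    return prototypes


def predict(node_vec, prototypes):
    """Return the index of the prototype most similar to node_vec (cosine)."""
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.sqrt(sum(a * a for a in u)) *
                      math.sqrt(sum(b * b for b in v)))
    sims = [cosine(node_vec, p) for p in prototypes]
    return max(range(len(prototypes)), key=lambda c: sims[c])


# Toy example: four nodes in two loose clusters, two class prototypes.
node_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
edge_scores = [[5.0, 5.0, -5.0, -5.0],   # prototype 0 attends to nodes 0, 1
               [-5.0, -5.0, 5.0, 5.0]]   # prototype 1 attends to nodes 2, 3
prototypes = build_prototypes(node_embeddings, edge_scores)
label_a = predict([1.0, 0.2], prototypes)
label_b = predict([0.1, 1.0], prototypes)
```

Because the prototypes draw on every node connected by a weighted edge rather than only on the few labeled ones, unlabeled nodes can also shape them, which is the motivation the TL;DR gives for the structure prompt.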