RELIEF: Reinforcement Learning Empowered Graph Feature Prompt Tuning

Jiapeng Zhu; Zichen Ding; Jianxiang Yu; Jiaqi Tan; Xiang Li; Weining Qian

TL;DR

RELIEF introduces a reinforcement learning framework for graph feature prompt tuning, treating the insertion of node-level prompts as a sequential decision problem over a hybrid discrete-continuous action space. Using hybrid-action PPO and a policy generalization technique (LEEP), RELIEF learns which nodes to prompt and what lightweight prompt to attach to each, improving the downstream performance of pre-trained GNNs across multiple pre-training strategies in few-shot settings. On graph and node classification tasks it outperforms fine-tuning and existing prompt-based methods in accuracy and data efficiency, and the impact of the prompts is quantified with two metrics, PCR and APM. Because the approach does not depend on a specific pre-training strategy, it is applicable across graph domains.

Abstract

The advent of the "pre-train, prompt" paradigm has recently extended its generalization ability and data efficiency to graph representation learning, following its achievements in Natural Language Processing (NLP). Initial graph prompt tuning approaches tailored specialized prompting functions for Graph Neural Network (GNN) models pre-trained with specific strategies, such as edge prediction, thus limiting their applicability. In contrast, another pioneering line of research has explored universal prompting via adding prompts to the input graph's feature space, thereby removing the reliance on specific pre-training strategies. However, the necessity to add feature prompts to all nodes remains an open question. Motivated by findings from prompt tuning research in the NLP domain, which suggest that highly capable pre-trained models need less conditioning signal to achieve desired behaviors, we advocate for strategically incorporating necessary and lightweight feature prompts to certain graph nodes to enhance downstream task performance. This introduces a combinatorial optimization problem, requiring a policy to decide 1) which nodes to prompt and 2) what specific feature prompts to attach. We then address the problem by framing the prompt incorporation process as a sequential decision-making problem and propose our method, RELIEF, which employs Reinforcement Learning (RL) to optimize it. At each step, the RL agent selects a node (discrete action) and determines the prompt content (continuous action), aiming to maximize cumulative performance gain. Extensive experiments on graph and node-level tasks with various pre-training strategies in few-shot scenarios demonstrate that our RELIEF outperforms fine-tuning and other prompt-based approaches in classification performance and data efficiency. The code is available at https://github.com/JasonZhujp/RELIEF.
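The decision loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`apply_hybrid_action`, `rollout`), the random placeholder policy, the `scale` parameter, and the loss-decrease reward are all assumptions made for illustration. It only shows the shape of one episode, in which each step picks a node (discrete action), adds a prompt vector to that node's features (continuous action), and receives the resulting performance gain as reward.

```python
import numpy as np


def apply_hybrid_action(prompt, node, delta):
    """Return a copy of the prompt matrix with `delta` added to row `node`."""
    updated = prompt.copy()
    updated[node] += delta
    return updated


def rollout(features, num_steps, rng, loss_fn, scale=0.1):
    """Run one episode with a random placeholder policy (hypothetical).

    features: (num_nodes, dim) node feature matrix of one graph.
    loss_fn:  maps a prompted feature matrix to a scalar loss; it stands in
              for the frozen GNN plus projection head (an assumption).
    Returns the prompted features, the prompt matrix, and per-step rewards.
    """
    num_nodes, dim = features.shape
    prompt = np.zeros((num_nodes, dim))
    prev_loss = loss_fn(features + prompt)
    rewards = []
    for _ in range(num_steps):
        node = int(rng.integers(num_nodes))      # discrete action: which node
        delta = rng.normal(0.0, scale, size=dim)  # continuous action: prompt content
        prompt = apply_hybrid_action(prompt, node, delta)
        loss = loss_fn(features + prompt)
        rewards.append(prev_loss - loss)          # assumed reward: loss decrease
        prev_loss = loss
    return features + prompt, prompt, rewards
```

In this sketch the rewards telescope, so their sum equals the total loss reduction over the episode, which is one simple way to read "cumulative performance gain". A trained agent would replace the random choices of `node` and `delta` with outputs of learned discrete and continuous actors.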

Paper Structure

This paper contains 54 sections, 17 equations, 6 figures, 8 tables.

Introduction
Preliminaries
Fine-tuning and Graph Prompt Tuning.
Graph Feature Prompt Tuning.
RL with Hybrid Action Space.
Method
Incorporating Feature Prompts as MDP
Action Space.
State Transition.
Reward Function.
Policy Network Architecture
Discrete Actor
Continuous Actor
Critic
Overall Framework of RELIEF
...and 39 more sections

Figures (6)

Figure 1: A Comparison of Tuning Methods. Fine-tuning (upper left) updates the parameters of the pre-trained GNN model. Pre-training-dependent prompt tuning (right) freezes the GNN model and requires designing specialized prompt templates aligned with pre-training strategies, whereas feature prompt tuning (lower left) is applicable to any pre-training strategy.
Figure 2: RELIEF pipeline. The policy network (upper) and the projection head (lower) are trained alternately. Feature prompts are incorporated during the agent sampling and prompt addition processes via discrete and continuous actors.
Figure 3: Tuning process of RELIEF on BACE. (a) and (b) present the ROC-AUC and average reward curves, respectively, for the training, validation, and testing sets across 5 random seeds. (c) and (d) illustrate the reward distributions for the training and testing sets as the tuning epochs progress, with white vertical lines indicating the average rewards for each distribution.
Figure 4: Performance of feature prompting methods with training data scaling up, compared to full-shot FT.
Figure 5: Effectiveness of discrete and continuous policies.
...and 1 more figure
