GraphTheft: Quantifying Privacy Risks in Graph Prompt Learning

Jiani Zhu, Xi Lin, Yuxin Qi, Qinghua Mao

TL;DR

This study provides the first evaluation of privacy leakage in Graph Prompt Learning (GPL) across three attacker capabilities: black-box attacks when GPL is offered as a service, and scenarios where node embeddings and prompt representations are accessible to third parties.

Abstract

Graph Prompt Learning (GPL) is an innovative approach to graph representation learning that enables task-specific adaptation by fine-tuning prompts without altering the underlying pre-trained model. Despite its growing prominence, the privacy risks inherent in GPL remain unexplored. In this study, we provide the first evaluation of privacy leakage in GPL across three attacker capabilities: black-box attacks when GPL is offered as a service, and scenarios where node embeddings and prompt representations are accessible to third parties. We assess GPL's privacy vulnerabilities through Attribute Inference Attacks (AIAs) and Link Inference Attacks (LIAs), finding that under every capability, attackers can effectively infer the properties and relationships of sensitive nodes, with inference success rates as high as 98% on some datasets. Importantly, while targeted inference attacks on specific prompts (e.g., GPF-plus) maintain high success rates, our analysis suggests that prompt tuning in GPL does not significantly elevate privacy risks compared to traditional GNNs. To mitigate these risks, we explore defense mechanisms and find that Laplacian noise perturbation can substantially reduce inference success, though balancing privacy protection with model performance remains challenging. This work highlights critical privacy risks in GPL, offering new insights and foundational directions for future privacy-preserving strategies in graph learning.
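
The Laplacian noise perturbation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say where the noise is applied or how its scale is chosen, so the code below simply assumes that independent Laplace-distributed noise is added to every coordinate of a node embedding before it is shared. The names laplace_perturb, mean_abs_shift, and scale are hypothetical.

```python
import random


def laplace_perturb(embeddings, scale, seed=None):
    """Add independent Laplace(0, scale) noise to every embedding coordinate.

    Assumption: embeddings is a list of equal-length lists of floats, and the
    noise is applied per coordinate. The source does not specify either detail.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    rng = random.Random(seed)
    rate = 1.0 / scale
    # The difference of two i.i.d. Exponential(rate) draws follows Laplace(0, scale).
    return [
        [x + rng.expovariate(rate) - rng.expovariate(rate) for x in row]
        for row in embeddings
    ]


def mean_abs_shift(original, noisy):
    """Average absolute change per coordinate, a rough proxy for utility loss."""
    total = sum(abs(a - b) for r, s in zip(original, noisy) for a, b in zip(r, s))
    count = sum(len(r) for r in original)
    return total / count


if __name__ == "__main__":
    emb = [[0.1, -0.4, 0.7], [0.3, 0.2, -0.5]]  # toy embeddings, not real data
    for b in (0.05, 0.5):
        noisy = laplace_perturb(emb, scale=b, seed=0)
        print(f"scale={b}: mean |shift| = {mean_abs_shift(emb, noisy):.3f}")
```

A larger scale moves the embeddings further from their original values, which is the trade-off the abstract describes: more noise should make inference harder but also degrades the representation used for the downstream task.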

Paper Structure

This paper contains 28 sections, 10 equations, 20 figures, 6 tables.

Figures (20)

  • Figure 1: An overview of inference attacks on GPL. Graph prompting follows the 'pre-train, prompt' paradigm: the pre-trained model is frozen and only the prompts are adjusted for the downstream task. We assume two attack scenarios: (1) when GPL is offered as a service, an attacker can launch a black-box attack; (2) the attacker launches inference attacks directly against the node embeddings and prompts shared with third parties.
  • Figure 2: Schematic overview of inference attacks in GPL.
  • Figure 3: Connectivity Matrix between Classes in Four Datasets
  • Figure 4: Attack performance of different attack models on node posteriors
  • Figure 5: Performance of five GPL models under different pretraining methods
  • ...and 15 more figures