Privacy Auditing of Multi-domain Graph Pre-trained Model under Membership Inference Attacks

Jiayi Luo, Qingyun Sun, Yuecen Wei, Haonan Yuan, Xingcheng Fu, Jianxin Li

TL;DR

This work investigates the privacy risks of multi-domain graph pre-trained models under membership inference attacks (MIAs). It finds that existing MIAs are ineffective against these models because broad cross-domain generalization reduces overfitting and embedding-based outputs carry only weak membership signals. The authors propose MGP-MIA, a three-part framework that combines membership signal amplification via selective unlearning, incremental shadow model construction, and similarity-based inference to infer node membership. Experiments on link-prediction-based and contrastive-learning-based pre-trained models across five datasets show that MGP-MIA significantly outperforms the baselines, exposing practical privacy vulnerabilities in graph foundation models and underscoring the need for privacy-aware design.

Abstract

Multi-domain graph pre-training has emerged as a pivotal technique in developing graph foundation models. While it greatly improves the generalization of graph neural networks, its privacy risks under membership inference attacks (MIAs), which aim to identify whether a specific instance was used in training (a member), remain largely unexplored. Conducting effective MIAs against multi-domain graph pre-trained models is challenging for three reasons: (i) Enhanced Generalization Capability: Multi-domain pre-training reduces the overfitting characteristics commonly exploited by MIAs. (ii) Unrepresentative Shadow Datasets: The diversity of the training graphs makes it hard to obtain reliable shadow graphs. (iii) Weakened Membership Signals: Embedding-based outputs offer less informative cues than logits for MIAs. To tackle these challenges, we propose MGP-MIA, a novel framework for Membership Inference Attacks against Multi-domain Graph Pre-trained models. Specifically, we first propose a membership signal amplification mechanism that amplifies the overfitting characteristics of target models via machine unlearning. We then design an incremental shadow model construction mechanism that builds a reliable shadow model from limited shadow graphs via incremental learning. Finally, we introduce a similarity-based inference mechanism that identifies members based on their similarity to positive and negative samples. Extensive experiments demonstrate the effectiveness of the proposed MGP-MIA and reveal the privacy risks of multi-domain graph pre-training.
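
To make the similarity-based inference step more concrete, here is a minimal Python sketch of how attack features might be derived from a node embedding and its positive and negative counterparts. The abstract does not say how those counterparts are constructed, which similarity measure is used, or what the attack model looks like, so everything here is an assumption for illustration: cosine similarity is a stand-in, and the names cosine_similarity and attack_features are hypothetical rather than taken from the paper.

```python
import math
from typing import List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def attack_features(
    anchor: Sequence[float],
    positive: Sequence[float],
    negatives: Sequence[Sequence[float]],
) -> List[float]:
    """Hypothetical attack features for one candidate node.

    anchor:    embedding of the candidate node from the (assumed) target model
    positive:  embedding of its assumed positive counterpart
    negatives: embeddings of its assumed negative counterparts

    Returns [similarity to positive, mean similarity to negatives, their gap].
    """
    if not negatives:
        raise ValueError("at least one negative sample is required")
    pos_sim = cosine_similarity(anchor, positive)
    neg_sims = [cosine_similarity(anchor, n) for n in negatives]
    neg_mean = sum(neg_sims) / len(neg_sims)
    return [pos_sim, neg_mean, pos_sim - neg_mean]


# Toy embeddings, invented purely for illustration.
features = attack_features(
    anchor=[1.0, 0.0],
    positive=[1.0, 0.0],
    negatives=[[0.0, 1.0], [0.0, -1.0]],
)
print(features)
```

In an attack of this kind, such feature vectors would be computed on a shadow model, where membership is known, and used to train a binary attack classifier. That training step is omitted here because the summary gives no details about it.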

Paper Structure

This paper contains 33 sections, 7 equations, 8 figures, 3 tables, 1 algorithm.

Figures (8)

  • Figure 1: A comparison between graph MIAs against Traditional GNNs and Multi-domain Pre-trained GNNs.
  • Figure 2: Separability analysis of output node embeddings.
  • Figure 3: Robustness analysis of output node embeddings.
  • Figure 4: Overview of MGP-MIA. The membership signal amplification mechanism first leverages machine unlearning to enhance overfitting. The shadow model is then built from the unlearned model using incremental learning. Finally, the attack features are derived from similarities between each sample and its positive and negative counterparts to train the attack model.
  • Figure 5: The ablation study of MGP-MIA.
  • ...and 3 more figures