Towards Effective, Stealthy, and Persistent Backdoor Attacks Targeting Graph Foundation Models

Jiayi Luo, Qingyun Sun, Lingjuan Lyu, Ziwei Zhang, Haonan Yuan, Xingcheng Fu, Jianxin Li

TL;DR

Graph Foundation Models (GFMs) enable broad transfer across tasks, but a model compromised during pre-training can carry a backdoor into every downstream deployment. The paper presents GFM-BA, a backdoor attack built from three modules: label-free trigger association via prototype embeddings, a node-adaptive trigger generator, and persistent anchoring of the backdoor to fine-tuning-insensitive parameters, so that targeted manipulation survives downstream adaptation. Across five datasets and three victim GFMs, GFM-BA achieves higher attack effectiveness than the baselines in both targeted and non-targeted settings while maintaining clean performance, resisting purification, and persisting under fine-tuning. The work exposes a critical security vulnerability in GFMs and motivates defenses for pre-trained graph models and their downstream deployments.

Abstract

Graph Foundation Models (GFMs) are pre-trained on diverse source domains and adapted to unseen targets, enabling broad generalization for graph machine learning. Although GFMs have attracted considerable attention recently, their vulnerability to backdoor attacks remains largely underexplored. A compromised GFM can introduce backdoor behaviors into downstream applications, posing serious security risks. However, launching backdoor attacks against GFMs is non-trivial due to three key challenges. (1) Effectiveness: Attackers lack knowledge of the downstream task during pre-training, which makes it hard to ensure that triggers reliably induce misclassification into the desired classes. (2) Stealthiness: Node features vary across domains, which makes it hard to insert triggers that remain stealthy. (3) Persistence: Downstream fine-tuning may erase backdoor behaviors by updating model parameters. To address these challenges, we propose GFM-BA, a novel Backdoor Attack model against Graph Foundation Models. Specifically, we first design a label-free trigger association module that links the trigger to a set of prototype embeddings, eliminating the need for knowledge about downstream tasks to perform backdoor injection. Then, we introduce a node-adaptive trigger generator that dynamically produces node-specific triggers, reducing the risk of trigger detection while reliably activating the backdoor. Lastly, we develop a persistent backdoor anchoring module that anchors the backdoor to fine-tuning-insensitive parameters, enhancing the persistence of the backdoor under downstream adaptation. Extensive experiments demonstrate the effectiveness, stealthiness, and persistence of GFM-BA.

Paper Structure

This paper contains 24 sections, 4 theorems, 17 equations, 5 figures, 4 tables, 1 algorithm.

Key Result

Proposition 1

When point density decays monotonically with distance from each class centroid, increasing the separation between centroids raises the probability that farthest point sampling (FPS) covers more classes within a fixed number of steps.
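
To make Proposition 1 concrete, the following is a minimal sketch of greedy farthest point sampling in plain Python, run on toy 2-D clusters. It assumes Euclidean distance and Gaussian clusters; the function names (`farthest_point_sampling`, `make_clusters`, `classes_covered`) and all parameter values are illustrative and are not taken from the paper.

```python
import math
import random


def farthest_point_sampling(points, k, start=0):
    """Greedy FPS: repeatedly pick the point farthest from the chosen set."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    chosen = [start]
    # min_dist[i] = distance from point i to its nearest chosen point so far
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(chosen) < k:
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return chosen


def make_clusters(centroids, n_per_class, spread, seed=0):
    """Toy data (an assumption): one Gaussian blob per class centroid."""
    rng = random.Random(seed)
    points, labels = [], []
    for label, (cx, cy) in enumerate(centroids):
        for _ in range(n_per_class):
            points.append((rng.gauss(cx, spread), rng.gauss(cy, spread)))
            labels.append(label)
    return points, labels


def classes_covered(centroids, k, seed=0):
    """Number of distinct classes among the k points selected by FPS."""
    points, labels = make_clusters(centroids, n_per_class=50, spread=1.0, seed=seed)
    picked = farthest_point_sampling(points, k)
    return len({labels[i] for i in picked})
```

With centroids placed far apart relative to the cluster spread, for example `classes_covered([(0, 0), (100, 0), (0, 100)], k=3)`, every class is reached within three picks. Moving the centroids closer together is a simple way to observe how coverage can drop, which is the trend the proposition describes.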

Figures (5)

  • Figure 1: Key differences between backdoor attacks against traditional GNNs and GFMs.
  • Figure 2: The overall framework of GFM-BA. FPS is first applied to select prototype embeddings as trigger association targets. The node-adaptive trigger generator then produces personalized triggers conditioned on the target embedding and the target node, ensuring both stealthiness and effectiveness. Finally, graph mixup is used to identify fine-tuning-insensitive parameters, allowing the backdoor to be anchored in the stable regions of the model and remain effective after the downstream adaptation.
  • Figure 3: An analysis of the distribution of parameter update magnitudes after downstream fine-tuning.
  • Figure 4: Results of ablation studies on Photo and Computers. GFM-BA (w/o E) is evaluated in the target-controlled scenario to assess effectiveness without the label-free trigger association module. GFM-BA (w/o S) is tested in the target-uncontrolled scenario with graph purification to assess stealthiness. GFM-BA (w/o P) is tested in the target-uncontrolled scenario with a fine-tuned backbone to assess the backdoor persistence.
  • Figure 5: Hyperparameter study on the Photo dataset.
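
The idea behind Figures 2 and 3, that some parameters barely move during downstream fine-tuning, can be illustrated with a small sketch. It assumes two snapshots of the model parameters (before and after a fine-tuning run) stored as plain dictionaries of float lists, and it marks the scalars with the smallest absolute update as "insensitive". The helper names and the `keep_ratio` parameter are hypothetical. This is not the paper's procedure, which, per the Figure 2 caption, relies on graph mixup to identify such parameters.

```python
def update_magnitudes(before, after):
    """Absolute change of each scalar parameter, keyed by parameter name."""
    return {
        name: [abs(a - b) for b, a in zip(before[name], after[name])]
        for name in before
    }


def insensitive_mask(before, after, keep_ratio):
    """Flag the keep_ratio fraction of scalars with the smallest updates.

    Ties at the threshold are all flagged, so slightly more than
    keep_ratio of the scalars may be selected.
    """
    mags = update_magnitudes(before, after)
    flat = sorted(m for values in mags.values() for m in values)
    n_keep = max(1, int(len(flat) * keep_ratio))
    threshold = flat[n_keep - 1]
    return {name: [m <= threshold for m in values] for name, values in mags.items()}


# Toy snapshots (made-up numbers, for illustration only).
before = {"w": [0.0, 1.0, 2.0, 3.0]}
after = {"w": [0.5, 1.01, 2.0, 2.0]}
mask = insensitive_mask(before, after, keep_ratio=0.5)
```

In this toy case the second and third entries of `w` change the least, so they are the ones flagged. A backdoor objective restricted to such a mask would, under this assumption, be less exposed to later fine-tuning updates.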

Theorems & Definitions (7)

  • Proposition 1
  • Proposition 2
  • Proposition A1
  • proof
  • Definition A1
  • Proposition A2
  • proof