MADE: Graph Backdoor Defense with Masked Unlearning

Xiao Lin, Mingjie Li, Yisen Wang

TL;DR

This work tackles backdoor threats in graph neural networks by introducing MADE, a training-time defense that does not require extra clean data. MADE first isolates suspicious samples using homophily- and loss-based detection, then applies a learnable edge-masking mechanism during forward propagation, trained with adversarial and natural losses so that the model unlearns triggers while preserving clean performance. The approach achieves strong backdoor suppression (near-zero ASR) and competitive accuracy on both graph and node classification tasks, over multiple datasets and GNN architectures, and it remains robust under varying attack strengths. Because it relies on graph topology and learnable masks rather than heavy data purification, MADE offers a practical, scalable defense that also extends to the node level.
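The edge-masking idea can be illustrated with a small sketch. This is not MADE's actual implementation: the sigmoid parameterization, the self-loop weight of 1, the mean-style normalization, and the names `masked_propagate` and `mask_logits` are all assumptions made for illustration, and the losses used to learn the masks are omitted. It only shows how a per-edge mask value in (0, 1) can scale messages so that an edge to a suspected trigger node contributes almost nothing.

```python
import math


def sigmoid(x: float) -> float:
    """Map a real-valued logit to a mask value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def masked_propagate(features, edges, mask_logits):
    """One round of mask-weighted neighbour aggregation (illustrative only).

    features:    list of per-node feature vectors (lists of floats)
    edges:       list of directed (src, dst) pairs
    mask_logits: one logit per edge; sigmoid(logit) is the edge weight
    """
    n = len(features)
    dim = len(features[0])
    # Each node starts from its own features (assumed self-loop weight 1).
    out = [list(f) for f in features]
    norm = [1.0] * n
    for (src, dst), logit in zip(edges, mask_logits):
        w = sigmoid(logit)
        for k in range(dim):
            out[dst][k] += w * features[src][k]
        norm[dst] += w
    # Normalise by the total incoming weight, as in a weighted mean.
    return [[v / norm[i] for v in out[i]] for i in range(n)]


# Node 2 plays the role of a trigger node with an outlying feature value.
features = [[1.0], [0.0], [10.0]]
edges = [(1, 0), (2, 0)]

unmasked = masked_propagate(features, edges, [10.0, 10.0])
masked = masked_propagate(features, edges, [10.0, -10.0])
print(unmasked[0][0], masked[0][0])
```

With the second edge's logit driven strongly negative, node 0's aggregated feature stays close to what it would be without the trigger node attached. In a real defense the logits would be trainable parameters rather than hand-set constants.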

Abstract

Graph Neural Networks (GNNs) have garnered significant attention from researchers due to their outstanding performance in handling graph-related tasks, such as social network analysis and protein design. Despite their widespread application, recent research has demonstrated that GNNs are vulnerable to backdoor attacks, implemented by injecting triggers into the training datasets. GNNs trained on the poisoned data predict the target label whenever a trigger pattern is attached to an input. This vulnerability poses significant security risks for applications of GNNs in sensitive domains, such as drug discovery. While there has been extensive research into backdoor defenses for images, strategies to safeguard GNNs against such attacks remain underdeveloped. Furthermore, we point out that conventional backdoor defense methods designed for images do not work well when applied directly to graph data. In this paper, we first analyze the key difference between image backdoor and graph backdoor attacks. Then we tackle the graph defense problem by presenting a novel approach called MADE, which devises an adversarial mask generation mechanism that selectively preserves clean sub-graphs and further leverages masks on edge weights to eliminate the influence of triggers effectively. Extensive experiments across various graph classification tasks demonstrate the effectiveness of MADE in significantly reducing the attack success rate (ASR) while maintaining a high classification accuracy.
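The two metrics named in the abstract can be sketched as follows. The exact evaluation protocol is not given in this summary, so the code assumes a common convention: ACC is measured on clean test samples, and ASR is the fraction of triggered test samples whose true class differs from the target label but which the model still assigns to the target label. The function names and the toy predictions are hypothetical.

```python
def accuracy(preds, labels):
    """Fraction of clean samples predicted correctly (ACC)."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def attack_success_rate(preds_on_triggered, true_labels, target_label):
    """Fraction of triggered, non-target-class samples sent to the target (ASR).

    Samples whose true class already equals the target label are excluded
    (an assumed convention), since predicting the target for them does not
    show that the trigger worked.
    """
    eligible = [
        p for p, y in zip(preds_on_triggered, true_labels) if y != target_label
    ]
    if not eligible:
        return 0.0
    return sum(p == target_label for p in eligible) / len(eligible)


# Toy predictions, invented for illustration only.
acc = accuracy([0, 1, 1], [0, 1, 0])
asr = attack_success_rate([1, 1, 0, 1], [0, 0, 0, 1], target_label=1)
print(f"ACC={acc:.3f} ASR={asr:.3f}")
```

A successful defense, in these terms, drives `asr` toward zero while keeping `acc` close to that of a model trained on clean data.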

Paper Structure

This paper contains 30 sections, 20 equations, 12 figures, 7 tables, 2 algorithms.

Figures (12)

  • Figure 1: GCN
  • Figure 2: GAT
  • Figure 4: Unlearning performance w.r.t. different unlearn rates on PROTEINS. The unlearn rates denote the proportion of backdoor samples in the poisoned datasets for unlearning. The orange curve indicates the accuracy (ACC), while the blue curve represents the attack success rate (ASR).
  • Figure 5: Differences in the response of models attacked by image backdoor and graph backdoor attacks. In panel (a), the left sub-figure displays a poisoned image with a trigger injected in the lower right corner, while the right sub-figure shows a heatmap generated by an attacked model on this poisoned image. Panel (b) shows the gradients from an attacked GNN across different types of nodes.
  • Figure 6: PROTEINS.
  • ...and 7 more figures