Grimm: A Plug-and-Play Perturbation Rectifier for Graph Neural Networks Defending against Poisoning Attacks

Ao Liu, Wenshan Li, Beibei Li, Wengang Ma, Tao Li, Pan Zhou

TL;DR

This work addresses two problems: graph neural networks (GNNs) are vulnerable to poisoning attacks, and existing defenses require replacing the original GNN. It proposes Grimm, a plug-and-play, HIS-inspired artificial immune system that attaches to any GNN layer, monitors feature trajectories (FTs), and detects abnormal edge communications to rectify perturbations in parallel with training. The approach combines a cyclic self-supervised generator, which creates feasible FT detectors, with a negative selection algorithm that prunes them, yielding transferable, low-overhead detectors that work across GNN architectures (GCN, GAT, GraphSAGE). The theory establishes that the FT patterns of attacked and non-attacked nodes are discriminable and bounds FT inner products to support robust detection. Experiments on multiple datasets and attacks report superior accuracy, scalability, and transferability for Grimm, supporting its use for secure, scalable graph learning in practice.
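To make the detector-pruning step concrete, here is a minimal sketch of the generic, textbook negative selection algorithm. It is not Grimm's implementation. It assumes real-valued detectors, Euclidean matching, and a fixed `radius`, and it uses random candidates as a stand-in for the cyclic self-supervised generator. All names (`negative_selection`, `is_anomalous`, `self_samples`) are hypothetical.

```python
import math
import random


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_selection(candidates, self_samples, radius):
    """Keep only candidate detectors that match no 'self' (normal) sample.

    A candidate matches a self sample when it lies within `radius` of it.
    """
    return [
        d for d in candidates
        if all(distance(d, s) > radius for s in self_samples)
    ]


def is_anomalous(sample, detectors, radius):
    """Flag a sample when any surviving detector matches it."""
    return any(distance(sample, d) <= radius for d in detectors)


# Toy data (assumption): normal samples cluster near the origin.
rng = random.Random(0)
self_samples = [[rng.gauss(0.0, 0.1) for _ in range(4)] for _ in range(50)]
# Random candidates stand in for generator output (assumption).
candidates = [[rng.uniform(-2.0, 2.0) for _ in range(4)] for _ in range(500)]

detectors = negative_selection(candidates, self_samples, radius=0.8)
```

By construction, no surviving detector matches a normal sample, so anything a detector does match is treated as abnormal. The radius and the distance measure are illustrative choices only.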

Abstract

Recent studies have revealed the vulnerability of graph neural networks (GNNs) to adversarial poisoning attacks on node classification tasks. Current defensive methods require substituting the original GNNs with defense models, regardless of the original's type. This approach, while targeting adversarial robustness, compromises the enhancements developed in prior research to boost GNNs' practical performance. Here we introduce Grimm, the first plug-and-play defense model. Requiring only a minimal interface for extracting features from any layer of the protected GNN, Grimm can seamlessly rectify perturbations. Specifically, we utilize the feature trajectories (FTs) generated by GNNs, as they evolve through epochs, to reflect the training status of the networks. We then theoretically prove that the FTs of victim nodes will inevitably exhibit discriminable anomalies. Consequently, inspired by the natural parallelism between the biological nervous and immune systems, we construct Grimm, a comprehensive artificial immune system for GNNs. Grimm not only detects abnormal FTs and rectifies adversarial edges during training but also operates efficiently in parallel, thereby mirroring the concurrent functionalities of its biological counterparts. Experiments confirm four advantages of Grimm: 1) Harmlessness, as it does not actively interfere with GNN training; 2) Parallelism, as its monitoring, detection, and rectification functions operate independently of the GNN training process; 3) Generalizability, as it is compatible with mainstream GNNs such as GCN, GAT, and GraphSAGE; and 4) Transferability, as the detectors for abnormal FTs can be efficiently transferred across different systems for one-step rectification.
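The notion of a feature trajectory can be illustrated with a small sketch: one layer's node features are captured at every epoch, and a node's FT is its sequence of feature vectors over epochs. This is only an illustration under stated assumptions, not the paper's code. The `TrajectoryRecorder` class, its methods, and the toy training loop are hypothetical.

```python
class TrajectoryRecorder:
    """Collects one layer's node features at every epoch (hypothetical helper)."""

    def __init__(self):
        # snapshots[t][v] is the feature vector of node v at epoch t.
        self.snapshots = []

    def record(self, layer_features):
        # Copy each row so later in-place updates do not alter the history.
        self.snapshots.append([list(row) for row in layer_features])

    def trajectory(self, node):
        """Return the node's feature vectors ordered by epoch."""
        return [snap[node] for snap in self.snapshots]


# Stand-in for a training loop: features of 3 nodes drift over 4 epochs.
recorder = TrajectoryRecorder()
features = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
for epoch in range(4):
    recorder.record(features)
    for row in features:  # pretend one training step changed the features
        row[0] += 0.5

ft = recorder.trajectory(1)  # feature trajectory of node 1
```

The recorder only reads features and never writes back to the model, in keeping with the minimal-interface idea. In a PyTorch setting, a forward hook registered with `torch.nn.Module.register_forward_hook` is one common way to obtain such layer outputs; whether Grimm does this is not stated here.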

Paper Structure

This paper contains 38 sections, 3 theorems, 63 equations, 13 figures, 4 tables, 1 algorithm.

Key Result

Theorem 1

Consider a GNN undergoing a poisoning attack. Let $\mathcal{V}_{adv}$ and $\mathcal{V}_{non}$ respectively denote the set of nodes whose categories have been compromised and the set of nodes whose categories remain uncompromised. Then, for every layer $\ell$, a classification function exists that distinguishes the two sets by their FTs. That is, FTs of attacked and non-attacked nodes are discriminable.

Figures (13)

  • Figure 1: The general workflow of Grimm.
  • Figure 2: FTs of attacked vs. non-attacked nodes.
  • Figure 3: The general workflow of the trajectory generator.
  • Figure 4: The detection method for adversarial edges.
  • Figure 5: Training time under increased sample ratio.
  • ...and 8 more figures

Theorems & Definitions (12)

  • Theorem 1
  • Proposition 1
  • Proposition 2
  • proof
  • proof
  • proof
  • proof
  • Definition 1
  • proof
  • proof
  • ...and 2 more