E-SAGE: Explainability-based Defense Against Backdoor Attacks on Graph Neural Networks

Dingqiang Yuan; Xiaohua Xu; Lei Yu; Tongchang Han; Rongchang Li; Meng Han

TL;DR

Backdoor attacks via subgraph insertion threaten GNNs in node classification. E-SAGE defends against them by using explainability to identify and prune adversarial edges at prediction time, relying on integrated gradients to score edges and on neighbor sampling for efficiency. It handles attacks that insert multiple subgraphs, also applies to adversarial attacks, and retains clean accuracy (ACC) while reducing attack success rate (ASR) across several datasets and models, with scalable runtime. The work offers a practical, explainability-driven defense for GNNs and motivates further study of how explainability tools interact with model robustness.

Abstract

Graph Neural Networks (GNNs) have recently been widely adopted in multiple domains. Yet, they are notably vulnerable to adversarial and backdoor attacks. In particular, backdoor attacks based on subgraph insertion have been shown to be effective in graph classification tasks while being stealthy, successfully circumventing various existing defense methods. In this paper, we propose E-SAGE, a novel explainability-based approach to defending against GNN backdoor attacks. We find that malicious edges and benign edges receive significantly different importance scores under explainability evaluation. Accordingly, E-SAGE adaptively applies an iterative edge pruning process to the graph based on the edge scores. Through extensive experiments, we demonstrate the effectiveness of E-SAGE against state-of-the-art graph backdoor attacks in different attack settings. In addition, we investigate the effectiveness of E-SAGE against adversarial attacks.
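The idea of scoring edges with integrated gradients and then pruning iteratively can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two-layer sum-aggregation model, the finite-difference gradients (standing in for autograd), the all-zero edge-mask baseline, the fixed `threshold` stopping rule, and every function name are hypothetical. Neighbor sampling is omitted.

```python
import numpy as np

def target_score(mask, edges, x, w1, w2, target):
    """Toy two-layer sum-aggregation GNN; returns the target node's output.
    Each undirected edge is scaled by its entry in `mask` (hypothetical model)."""
    n = x.shape[0]
    a_hat = np.eye(n)  # self-loops
    for m, (u, v) in zip(mask, edges):
        a_hat[u, v] += m
        a_hat[v, u] += m
    h = np.tanh(a_hat @ x @ w1)
    return float((a_hat @ h @ w2)[target])

def grad_wrt_mask(mask, edges, x, w1, w2, target, eps=1e-5):
    """Central finite differences; a real implementation would use autograd."""
    g = np.zeros_like(mask)
    for i in range(len(mask)):
        d = np.zeros_like(mask)
        d[i] = eps
        hi = target_score(mask + d, edges, x, w1, w2, target)
        lo = target_score(mask - d, edges, x, w1, w2, target)
        g[i] = (hi - lo) / (2 * eps)
    return g

def edge_integrated_gradients(edges, x, w1, w2, target, steps=50):
    """Integrated gradients over the edge mask, from baseline 0 (no edges)
    to 1 (all edges), approximated with a midpoint Riemann sum."""
    mask = np.ones(len(edges))
    alphas = (np.arange(steps) + 0.5) / steps
    grads = [grad_wrt_mask(a * mask, edges, x, w1, w2, target) for a in alphas]
    return mask * np.mean(grads, axis=0)  # (input - baseline) * average gradient

def prune(edges, x, w1, w2, target, threshold):
    """Assumed stopping rule: repeatedly drop the top-scoring edge
    while its score exceeds `threshold`, rescoring after each removal."""
    edges, pruned = list(edges), []
    while edges:
        scores = edge_integrated_gradients(edges, x, w1, w2, target)
        top = int(np.argmax(scores))
        if scores[top] <= threshold:
            break
        pruned.append(edges.pop(top))
    return edges, pruned

# Tiny example: node 3 carries unusually large features and is attached to node 0.
x = np.array([[0.1, 0.1], [0.1, 0.1], [0.1, 0.1], [2.0, 2.0]])
w1, w2 = np.eye(2), np.ones(2)
edges = [(0, 1), (1, 2), (0, 3)]
scores = edge_integrated_gradients(edges, x, w1, w2, target=0)
kept, pruned = prune(edges, x, w1, w2, target=0, threshold=1.0)
print(scores, kept, pruned)
```

In this toy setting the edge to the outlier node receives by far the largest attribution and is the only one removed. The scores are recomputed after each removal because removing one edge changes the attributions of the others, which is one plausible reading of "iterative" pruning.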

Paper Structure (23 sections, 4 equations, 6 figures, 3 tables, 1 algorithm)

Introduction
Related Works
Graph Neural Networks
Graph Explainability and Integrated Gradients
Subgraph-based attack and defense methods
Subgraph-based attack
Subgraph-based defense
Threat Models
E-SAGE Defense
Defense overview
Attacker's goal and capabilities
Defender's goal and capabilities
Defense design
Integral gradient based explainer
Iteration
...and 8 more sections

Figures (6)

Figure 1: Backdoor attack (left) and adversarial attack (right) model
Figure 2: Score and Position Distribution of the Most Important Edges
Figure 3: Iteration of inserting 3 subgraphs in TDGIA attack
Figure 4: The effect of $\beta$ on ASR and ACC (Cora)
Figure 5: Evaluation of multi-subgraph insertion attacks
...and 1 more figures
