Provably Robust Explainable Graph Neural Networks against Graph Perturbation Attacks

Jiate Li, Meng Pang, Yun Dong, Jinyuan Jia, Binghui Wang

TL;DR

This work addresses the fragility of explainable graph neural networks (XGNNs) under graph perturbations by introducing XGNNCert, a certifiably robust XGNN. It constructs multiple hybrid subgraphs by hashing edge indices and combining portions of the test graph with edges from the complete graph, then aggregates predictions and explanations via a majority-vote classifier and a majority-vote explainer to obtain deterministic robustness guarantees. The approach yields a bound on the perturbation budget $M_{\lambda}$ under which the predicted label remains unchanged and at least $\lambda$ groundtruth explanation edges are preserved, with empirical results showing competitive explanation/prediction accuracy on clean data and strong robustness against adversarial perturbations across multiple datasets and explainers. The work highlights a practical robustness framework for safety-critical applications where reliable explanations are essential, and points to future improvements in subgraph design and permutation-invariant guarantees.
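The hash-based division and the two voting steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the MD5-based `edge_group` function, the tie-breaking rules, and the top-`k` edge selection are illustrative choices, and the sketch omits the complete-graph edges that the hybrid subgraphs also include. The per-subgraph labels and explanations are passed in directly as toy inputs rather than produced by a real GNN classifier or explainer.

```python
import hashlib
from collections import Counter


def edge_group(u, v, num_groups):
    """Map an undirected edge to a group index with a deterministic hash.

    Hashing the sorted node pair is an assumption made for this sketch.
    """
    a, b = sorted((u, v))
    digest = hashlib.md5(f"{a}-{b}".encode()).hexdigest()
    return int(digest, 16) % num_groups


def split_edges(edges, num_groups):
    """Partition the edges of a graph into num_groups disjoint edge lists."""
    groups = [[] for _ in range(num_groups)]
    for u, v in edges:
        a, b = sorted((u, v))
        groups[edge_group(a, b, num_groups)].append((a, b))
    return groups


def majority_vote_label(labels):
    """Return the most frequent label; ties go to the smaller label."""
    counts = Counter(labels)
    return min(counts, key=lambda y: (-counts[y], y))


def majority_vote_edges(edge_lists, k):
    """Return the k edges that receive the most votes across subgraphs."""
    votes = Counter(e for edges in edge_lists for e in edges)
    ranked = sorted(votes, key=lambda e: (-votes[e], e))
    return ranked[:k]


# Toy usage: five edges hashed into three groups.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
groups = split_edges(edges, num_groups=3)

# Hypothetical per-subgraph outputs standing in for a classifier/explainer.
label = majority_vote_label([1, 0, 1])
top_edges = majority_vote_edges(
    [[(0, 1), (1, 2)], [(0, 1)], [(1, 2), (2, 3)]], k=2
)
```

Because the hash is deterministic, every edge always lands in the same group, which is what lets a bounded number of perturbed edges influence only a bounded number of votes.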

Abstract

Explainable Graph Neural Networks (XGNNs) have gained growing attention for facilitating trust in GNNs, the mainstream method for learning on graph data. Despite this attention, existing XGNNs focus on improving explanation performance, and their robustness under attacks is largely unexplored. We observe that an adversary can slightly perturb the graph structure such that the explanation result of an XGNN changes substantially. Such vulnerability could cause serious issues, particularly in safety/security-critical applications. In this paper, we take the first step toward studying the robustness of XGNNs against graph perturbation attacks, and propose XGNNCert, the first provably robust XGNN. In particular, XGNNCert provably ensures that the explanation result for a graph under the worst-case graph perturbation attack is close to that without the attack, while not affecting the GNN prediction, when the number of perturbed edges is bounded. Evaluation results on multiple graph datasets and GNN explainers show the effectiveness of XGNNCert.

Paper Structure

This paper contains 23 sections, 2 theorems, 10 equations, 12 figures, 9 tables, 1 algorithm.

Key Result

Theorem 2

Let $G=(\mathcal{V},\mathcal{E})$ and $\hat{G}=(\mathcal{V}, \hat{\mathcal{E}})$ be any two graphs satisfying $|\mathcal{E}\setminus \hat{\mathcal{E}}| = M$, and denote the corresponding hybrid subgraphs generated using the above strategy by $\{ G_H^i\}$ and $\{ \hat{G}_H^i\}$, respectively. Then $\{ G_H

Figures (12)

  • Figure 1: (a) GNN for graph classification and GNN explanation. A GNN classifier $f$ first predicts a label $y$ for the graph $G$, and then a GNN explainer $g$ interprets the predicted label $y$ to produce the explanatory edges $\mathcal{E}_k$. (b) Two possible graph perturbation attacks on the GNN explainer $g$: 1) the GNN prediction $\hat{y}$ on the perturbed graph $\hat{G}$ is different from $y$; 2) the GNN prediction on $\hat{G}$ is kept, but the explanatory edges $\hat{\mathcal{E}}_k$ output by $g$ after the attack are largely different from $\mathcal{E}_k$.
  • Figure 2: Overview of the proposed three-step certifiably robust XGNN.
  • Figure 3: Certified perturbation size over all testing graphs vs. $\lambda$ on PGExplainer. The maximum $\lambda$ on the x-axis equals $k$, the number of edges in the groundtruth explanation.
  • Figure 4: Certified perturbation size over all testing graphs vs. $p$ on PGExplainer.
  • Figure 5: Certified perturbation size over all testing graphs vs. $\gamma$ on PGExplainer.
  • ...and 7 more figures

Theorems & Definitions (5)

  • Definition 1: $(M_{\lambda},\lambda)$-Certifiably robust XGNN
  • Theorem 2: Bounded number of different subgraphs
  • proof
  • Theorem 3: Certified Perturbation Size $M_\lambda$ for a given $\lambda$
  • proof
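
The intuition suggested by the title of Theorem 2 ("Bounded number of different subgraphs") can be checked on a toy example. The sketch below is an illustration under assumptions, not the paper's proof or construction: it uses a hypothetical MD5-based `edge_group` hash, counts both added and deleted edges as perturbations, and compares only the test-graph portion of each subgraph (the complete-graph edges in a hybrid subgraph do not depend on the test graph's edges). Since each edge hashes to exactly one group, perturbing $M$ edges can alter at most $M$ groups.

```python
import hashlib


def edge_group(u, v, num_groups):
    """Deterministically hash an undirected edge to one group (assumed scheme)."""
    a, b = sorted((u, v))
    digest = hashlib.md5(f"{a}-{b}".encode()).hexdigest()
    return int(digest, 16) % num_groups


def group_edge_sets(edges, num_groups):
    """Return one edge set per group for the given edge list."""
    groups = [set() for _ in range(num_groups)]
    for u, v in edges:
        a, b = sorted((u, v))
        groups[edge_group(a, b, num_groups)].add((a, b))
    return groups


def count_changed_groups(edges, perturbed_edges, num_groups):
    """Count the groups whose edge set differs between the two graphs."""
    before = group_edge_sets(edges, num_groups)
    after = group_edge_sets(perturbed_edges, num_groups)
    return sum(b != a for b, a in zip(before, after))


# Toy graphs: delete edge (4, 0) and add edges (0, 2) and (1, 3),
# i.e. three perturbed edges in total.
clean = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
perturbed = [(0, 1), (1, 2), (2, 3), (3, 4), (0, 2), (1, 3)]
num_perturbed = 3

changed = count_changed_groups(clean, perturbed, num_groups=10)
```

The number of changed groups can be smaller than the number of perturbed edges when several perturbed edges hash to the same group, but it can never be larger.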