Black-box Gradient Attack on Graph Neural Networks: Deeper Insights in Graph-based Attack and Defense

Haoxi Zhan, Xiaobing Pei

TL;DR

The Black-Box Gradient Attack (BBGA) algorithm is proposed, which is able to achieve stable attack performance without accessing the training sets of the GNNs and is applicable when attacking against various defense methods.

Abstract

Graph Neural Networks (GNNs) have received significant attention due to their state-of-the-art performance on various graph representation learning tasks. However, recent studies reveal that GNNs are vulnerable to adversarial attacks, i.e., an attacker can fool the GNNs by deliberately perturbing the graph structure or node features. While they successfully decrease the performance of GNNs, most existing attack algorithms require access to either the model parameters or the training data, which is not practical in the real world. In this paper, we develop deeper insights into the Mettack algorithm, a representative grey-box attack method, and then propose a gradient-based black-box attack algorithm. First, we show that the Mettack algorithm perturbs the edges unevenly, so the attack is highly dependent on a specific training set. As a result, a simple yet useful strategy to defend against Mettack is to train the GNN with the validation set. Second, to overcome these drawbacks, we propose the Black-Box Gradient Attack (BBGA) algorithm. Extensive experiments demonstrate that our proposed method achieves stable attack performance without accessing the training sets of the GNNs. Further results show that our proposed method is also applicable when attacking various defense methods.

Paper Structure

This paper contains 24 sections, 1 theorem, 13 equations, 8 figures, 3 tables, 1 algorithm.

Key Result

theorem 1

Assume that all paths in the computation graph of the model are activated with the same probability. Given an $L$-layer GCN with Eq. sage as the normalization method, then $\forall i,j \in V$, $\mathbb{E} [\frac{\partial H_j}{\partial X_i}]$ is equivalent to the probability of reaching node $j$ via a k-step ran
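
The theorem statement above is cut off in the source, so the following is only a rough sketch of the quantity it appears to describe. It assumes that the statement refers to a k-step random walk and that the normalization is neighbor averaging with self-loops, i.e. $P = \tilde{D}^{-1}(A + I)$; neither assumption is confirmed by the text. The function name `k_step_walk_probs` and the toy graph are hypothetical.

```python
import numpy as np


def k_step_walk_probs(adj: np.ndarray, k: int) -> np.ndarray:
    """Return P^k for P = D^{-1}(A + I).

    Entry [i, j] is the probability that a k-step random walk
    starting at node i ends at node j. Adding self-loops and using
    row normalization are assumptions made for this illustration.
    """
    a_tilde = adj + np.eye(adj.shape[0])              # add self-loops
    p = a_tilde / a_tilde.sum(axis=1, keepdims=True)  # row-normalize
    return np.linalg.matrix_power(p, k)


# Toy path graph: 0 - 1 - 2
adj = np.array(
    [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]],
    dtype=float,
)
walk = k_step_walk_probs(adj, k=2)
print(walk)
```

Each row of `walk` sums to 1, and on this path graph a 2-step walk from node 0 reaches node 2 with probability 1/6. Under the stated assumptions, these entries are what the expected gradient in the theorem would be compared against.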

Figures (8)

  • Figure 1: The local perturbations rates of mettack perturbed graphs.
  • Figure 2: The testing accuracies of GCN on DICE-attacked graphs.
  • Figure 3: Misclassification rates when attacking against GCN-Jaccard, (a) Cora, (b) Citeseer, (c) Cora-ML.
  • Figure 4: Misclassification rates when attacking against R-GCN, (a) Cora, (b) Citeseer, (c) Cora-ML.
  • Figure 5: Misclassification rates when attacking against Pro-GNN, (a) Cora, (b) Citeseer, (c) Cora-ML.
  • ...and 3 more figures

Theorems & Definitions (1)

  • theorem 1: JK