Debiased Graph Poisoning Attack via Contrastive Surrogate Objective
Kanghoon Yoon, Yeonjun In, Namkyeong Lee, Kibum Kim, Chanyoung Park
TL;DR
The paper addresses the vulnerability of Graph Neural Networks (GNNs) to adversarial graph-structure perturbations and reveals that existing meta-gradient attacks are biased toward training nodes. It introduces Metacon, a meta-gradient attack that uses a contrastive surrogate objective to include unlabeled nodes in surrogate training, thereby expanding the attack surface to edges between two unlabeled nodes (UU edges) and reducing the bias. Two concrete instantiations, Metacon-S (sample contrastive) and Metacon-D (dimension contrastive), are paired with a cross-entropy objective and degrade victim GNNs across datasets and models, including robust defenses. Experiments on benchmark graphs show that Metacon outperforms prior meta-gradient methods, demonstrating the practical impact of bias mitigation for graph poisoning attacks and suggesting new directions for defense research.
Abstract
Graph neural networks (GNNs) are vulnerable to adversarial attacks, which aim to degrade the performance of GNNs through imperceptible changes to the graph. However, we find that the prevalent meta-gradient-based attacks, which utilize the gradient of the loss w.r.t. the adjacency matrix, are biased towards training nodes. That is, their meta-gradient is determined by the training procedure of the surrogate model, which is trained solely on the training nodes. This bias manifests as an uneven perturbation: the attacks tend to connect two nodes when at least one of them is a labeled node, i.e., a training node, and are unlikely to connect two unlabeled nodes. These biased attacks are sub-optimal because they effectively ignore edge flips between two unlabeled nodes, and therefore miss candidate edges between unlabeled nodes that could significantly alter a node's representation. In this paper, we investigate the meta-gradients to uncover the root cause of the uneven perturbations of existing attacks. Based on our analysis, we propose a meta-gradient-based attack method using a contrastive surrogate objective (Metacon), which alleviates the bias in the meta-gradient through a new surrogate loss. We conduct extensive experiments on benchmark datasets to show that Metacon outperforms existing meta-gradient-based attack methods, and that alleviating the bias towards training nodes is effective in attacking the graph structure.
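
The bias described above can be illustrated with a toy sketch. This is not Metacon and not the authors' code: it assumes a one-layer logistic surrogate with mean aggregation, trained only on labeled nodes, and it scores each edge flip by retraining and taking a loss difference instead of computing a true meta-gradient. All names (`propagate`, `train_and_loss`, `flip_scores`) and the six-node graph are hypothetical. In this simplified setting, flipping an edge between two unlabeled nodes cannot change the aggregated features of any labeled node, so its score is zero. In deeper surrogates the signal for such edges need not vanish exactly.

```python
import math

def propagate(adj, x):
    """One-hop mean aggregation with self-loops (row-normalized A + I)."""
    n = len(adj)
    h = []
    for i in range(n):
        nbrs = [j for j in range(n) if adj[i][j] or j == i]
        h.append(sum(x[j] for j in nbrs) / len(nbrs))
    return h

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_and_loss(adj, x, y, labeled, steps=200, lr=0.5):
    """Train a toy surrogate on labeled nodes only; return its final loss."""
    h = propagate(adj, x)
    w, b = 0.0, 0.0
    for _ in range(steps):
        gw = gb = 0.0
        for i in labeled:
            err = sigmoid(w * h[i] + b) - y[i]
            gw += err * h[i]
            gb += err
        w -= lr * gw / len(labeled)
        b -= lr * gb / len(labeled)
    loss = 0.0
    for i in labeled:
        # Clamp to keep the logarithms finite.
        p = min(max(sigmoid(w * h[i] + b), 1e-12), 1.0 - 1e-12)
        loss -= y[i] * math.log(p) + (1 - y[i]) * math.log(1.0 - p)
    return loss / len(labeled)

def flip_scores(adj, x, y, labeled):
    """Change in the labeled-node loss caused by flipping each node pair."""
    base = train_and_loss(adj, x, y, labeled)
    n = len(adj)
    scores = {}
    for i in range(n):
        for j in range(i + 1, n):
            flipped = [row[:] for row in adj]
            flipped[i][j] = flipped[j][i] = 1 - adj[i][j]
            scores[(i, j)] = train_and_loss(flipped, x, y, labeled) - base
    return scores

# Hypothetical toy graph: nodes 0-2 are labeled, nodes 3-5 are unlabeled.
x = [1.0, 0.8, -1.0, 0.9, -0.7, -1.2]
y = [1, 1, 0, 1, 0, 0]  # only the entries of labeled nodes are ever read
labeled = [0, 1, 2]
adj = [[0] * 6 for _ in range(6)]
for u, v in [(0, 1), (0, 3), (1, 3), (2, 4), (4, 5)]:
    adj[u][v] = adj[v][u] = 1

scores = flip_scores(adj, x, y, labeled)
for (i, j), s in sorted(scores.items()):
    kind = "UU" if i not in labeled and j not in labeled else "L*"
    print(f"flip ({i},{j}) [{kind}]: score = {s:+.6f}")
```

An attacker that ranks flips by such a score would never select a UU edge, which is the uneven perturbation pattern the abstract attributes to a surrogate trained solely on labeled nodes. A surrogate loss that also depends on unlabeled nodes, such as the contrastive objective proposed here, is one way to give those edges a nonzero signal.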
