How Feasible is Augmenting Fake Nodes with Learnable Features as a Counter-strategy against Link Stealing Attacks?

Mir Imtiaz Mostafiz, Imtiaz Karim, Elisa Bertino

TL;DR

This work tackles privacy risks in GNN-based predictions by addressing link-stealing attacks through a novel defense, $\mathsf{NARGIS}$. It introduces node augmentation with learnable features, guided by spectral clustering, to perturb the embedding space while preserving task utility, implemented via tri-level optimization involving a target GNN, a surrogate attacker, and the augmentation parameters. Empirical results on Cora, Citeseer, and PubMed show $\mathsf{NARGIS}$ achieving favorable fidelity-privacy trade-offs in many settings, particularly for GraphSAGE, and reveal limitations against certain attack types like LinkTeller and some GNN architectures. The study outlines future directions for theoretical analysis of loss-tuning and integration with node-influence-aware strategies to broaden robustness across diverse graph models and attacks.

Abstract

Graph Neural Networks (GNNs) are widely deployed for graph-based prediction tasks. However, as effective as GNNs are at learning from graph data, they also carry a risk of privacy leakage. For instance, an attacker can run carefully crafted queries against a GNN and, from the responses, infer the existence of an edge between a pair of nodes. This attack, dubbed a "link-stealing" attack, can jeopardize users' privacy by leaking potentially sensitive information. To protect against this attack, we propose an approach called "$(N)$ode $(A)$ugmentation for $(R)$estricting $(G)$raphs from $(I)$nsinuating their $(S)$tructure" ($\mathsf{NARGIS}$) and study its feasibility. $\mathsf{NARGIS}$ reshapes the graph embedding space so that the posterior from the GNN model still provides utility for the prediction task but introduces ambiguity for link-stealing attackers. To this end, $\mathsf{NARGIS}$ applies spectral clustering to the given graph to facilitate augmenting it with new nodes whose features are learned rather than fixed. It uses tri-level optimization to learn the parameters of the GNN model, the surrogate attacker model, and our defense model (i.e., the learnable node features). We extensively evaluate $\mathsf{NARGIS}$ on three benchmark citation datasets under eight knowledge-availability settings for the attackers. We also evaluate model fidelity and defense performance against influence-based link inference attacks. Our studies identify the main strength of $\mathsf{NARGIS}$: its superior fidelity-privacy trade-off in a significant number of cases. We also identify the cases in which the model needs improvement, and propose ways to integrate different schemes to make the model more robust against link-stealing attacks.
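
To make the clustering-then-augmentation idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes unnormalized spectral clustering with a simple k-means step, one fake node per cluster, and fake nodes wired to every member of their cluster; none of these choices are stated in the abstract. The function names `spectral_clusters` and `augment_with_fake_nodes` are hypothetical. The fake-node features are only initialized here (to the cluster mean); in the described approach they would be learnable parameters updated during the tri-level optimization, which this sketch does not attempt.

```python
import numpy as np


def spectral_clusters(adj, k, iters=50):
    """Unnormalized spectral clustering (illustrative variant):
    embed nodes with the k smallest Laplacian eigenvectors,
    then run a plain k-means on that embedding."""
    deg = adj.sum(axis=1)
    lap = np.diag(deg) - adj
    _, vecs = np.linalg.eigh(lap)  # eigenvalues come back in ascending order
    emb = vecs[:, :k]

    # Deterministic farthest-point initialization of the k centers.
    centers = [emb[0]]
    for _ in range(1, k):
        d = np.min([((emb - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(emb[d.argmax()])
    centers = np.array(centers)

    for _ in range(iters):
        dists = ((emb[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = emb[labels == c].mean(axis=0)
    return labels


def augment_with_fake_nodes(adj, feats, labels):
    """Add one fake node per cluster, connected to all cluster members.
    ASSUMPTION: this wiring rule and the mean-feature initialization are
    illustrative choices, not taken from the paper."""
    n = adj.shape[0]
    clusters = np.unique(labels)
    m = len(clusters)
    new_adj = np.zeros((n + m, n + m), dtype=adj.dtype)
    new_adj[:n, :n] = adj
    fake_feats = np.zeros((m, feats.shape[1]), dtype=float)
    for i, c in enumerate(clusters):
        members = np.flatnonzero(labels == c)
        new_adj[n + i, members] = 1
        new_adj[members, n + i] = 1
        # Starting value only; a training loop would treat this as learnable.
        fake_feats[i] = feats[members].mean(axis=0)
    return new_adj, np.vstack([feats, fake_feats]), fake_feats


# Toy graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = np.zeros((6, 6))
for u, v in edges:
    adj[u, v] = adj[v, u] = 1.0
feats = np.eye(6)  # one-hot placeholder features

labels = spectral_clusters(adj, k=2)
aug_adj, aug_feats, fake_feats = augment_with_fake_nodes(adj, feats, labels)
```

On this toy graph the two triangles end up in separate clusters, and the augmented graph gains two extra nodes whose feature rows are the only quantities a defense of this kind would need to train.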

Paper Structure

This paper contains 49 sections, 1 theorem, 14 equations, 2 figures, 4 tables, 3 algorithms.

Key Result

Proposition 1

Let the Spectral Clustering Algorithm (von Luxburg, 2007) be applied on an unweighted graph $G=(V,E)$ in such a way that (i) every cluster is approximately equally sized in terms of the number of nodes, (ii) each node in a cluster is within the $L$-neighborhood of other nodes in the same cluster for a

Figures (2)

  • Figure 1: Illustration of Posterior Perturbation in Posterior Simplex Space for Protection against Link Stealing Attacks
  • Figure 2: System Overview of $\mathsf{NARGIS}$

Theorems & Definitions (1)

  • Proposition 1