On provable privacy vulnerabilities of graph representations

Ruofan Wu, Guanhua Fang, Qiying Pan, Mingyang Zhang, Tengfei Liu, Weiqiang Wang

TL;DR

This paper studies the theoretical underpinnings of similarity-based edge reconstruction attacks (SERA). It provides a non-asymptotic analysis of their reconstruction capacity and presents empirical evidence that such attacks can perfectly reconstruct sparse graphs as graph size increases.

Abstract

Graph representation learning (GRL) is critical for extracting insights from complex network structures, but it also raises security concerns due to potential privacy vulnerabilities in these representations. This paper investigates the structural vulnerabilities in graph neural models where sensitive topological information can be inferred through edge reconstruction attacks. Our research primarily addresses the theoretical underpinnings of similarity-based edge reconstruction attacks (SERA), furnishing a non-asymptotic analysis of their reconstruction capacities. Moreover, we present empirical corroboration indicating that such attacks can perfectly reconstruct sparse graphs as graph size increases. Conversely, we establish that sparsity is a critical factor for SERA's effectiveness, as demonstrated through analysis and experiments on (dense) stochastic block models. Finally, we explore the resilience of private graph representations produced via the noisy aggregation (NAG) mechanism against SERA. Through theoretical analysis and empirical assessments, we confirm that NAG mitigates SERA. We also empirically identify cases in which SERA succeeds, and cases in which it fails, as an instrument for elucidating the trade-off between privacy and utility.

Paper Structure

This paper contains 50 sections, 12 theorems, 56 equations, 16 figures, and 3 tables.

Key Result

Theorem 4.1

Let $C_1, C_2$ be universal constants. Assume the following: Then there exists a threshold $\tau = \Theta\left(\frac{1}{(C_2\log n)^{2L}}\right)$ such that, with probability at least $1 - \frac{2}{n^2}$, the following holds for SERA with the similarity measure chosen as either cos or corr: Consequently, on the above set of events we have $\textsf{AUROC}_{\widehat{A}} \ge 1 - \frac{(C_2\l
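To make the attack in Theorem 4.1 concrete, the following is a minimal sketch of a similarity-based edge reconstruction attack on a toy graph. It is not the authors' implementation. The victim model (one round of linear mean aggregation over random Gaussian features), the graph parameters, the threshold value `tau = 0.15`, and all function names are assumptions made for illustration; the theorem's own assumptions and constants are not reproduced here. The attacker only sees the representations `H`, scores every node pair by cosine similarity, and predicts an edge when the score reaches the threshold.

```python
import numpy as np


def cosine_similarity_matrix(H: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of H."""
    norms = np.linalg.norm(H, axis=1, keepdims=True)
    Hn = H / np.clip(norms, 1e-12, None)  # guard against zero rows
    return Hn @ Hn.T


def auroc(scores: np.ndarray, labels: np.ndarray) -> float:
    """Rank-based AUROC (Mann-Whitney U statistic); ties get average ranks."""
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    cum = np.cumsum(counts)
    avg_rank = cum - (counts - 1) / 2.0  # 1-based average rank per distinct score
    ranks = avg_rank[inverse]
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    u = ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2.0
    return float(u / (n_pos * n_neg))


def toy_sera_demo(n: int = 200, avg_degree: float = 3.0, d: int = 512, seed: int = 0):
    """Run the sketch on a sparse Erdős–Rényi graph (all settings are illustrative)."""
    rng = np.random.default_rng(seed)

    # Sparse Erdős–Rényi graph: symmetric adjacency matrix, no self-loops.
    upper = np.triu(rng.random((n, n)) < avg_degree / n, k=1)
    A = (upper | upper.T).astype(float)

    # Assumed victim model: one round of linear mean aggregation (with self-loops)
    # over random Gaussian node features.
    X = rng.standard_normal((n, d))
    A_self = A + np.eye(n)
    H = (A_self / A_self.sum(axis=1, keepdims=True)) @ X

    # Attack: score every unordered pair (u, v) with u < v by cosine similarity.
    iu = np.triu_indices(n, k=1)
    scores = cosine_similarity_matrix(H)[iu]
    labels = A[iu].astype(int)

    # Hypothetical threshold chosen by hand, not the tau from the theorem.
    tau = 0.15
    predicted_edges = scores >= tau
    return auroc(scores, labels), predicted_edges, labels


score, predicted_edges, labels = toy_sera_demo()
print(f"AUROC of the toy attack: {score:.3f}")
```

In this sketch, AUROC is computed from the raw similarity scores and so does not depend on `tau`; the threshold only matters when a hard edge prediction is needed.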

Figures (16)

  • Figure 1: Attack efficacy of SERA on sparse Erdős–Rényi graphs and dense SBM graphs, measured by AUROC and averaged over $5$ random trials for each configuration.
  • Figure 2: Privacy and utility assessments on the Cora dataset, with GCN and GAT as the underlying models of NAG. The first row shows the attack performance of SERA, measured by AUROC, under both constrained and unconstrained training schemes. The second row shows the corresponding model performance.
  • Figure 3: Illustration of a typical vertically federated graph representation learning scenario; the figure is adapted from wu2023privacy.
  • Figure 4: Attack efficacy of SERA on sparse Erdős–Rényi graphs, with each grid cell's value indicating SERA's performance measured by either AUROC (first row) or ERR (second row).
  • Figure 5: Attack efficacy of SERA on sparse Erdős–Rényi graphs, with each grid cell's value indicating SERA's performance measured by either AUROC (first row) or ERR (second row).
  • ...and 11 more figures

Theorems & Definitions (20)

  • Theorem 4.1
  • Remark 4.2: Practicality
  • Theorem 5.1
  • Remark 5.2
  • Theorem 6.1
  • Remark 6.2: Alternative defenses
  • Remark 6.3: Impact of depth $L$
  • Lemma C.1
  • Lemma C.2
  • Proof of Lemma \ref{lem:size:2}
  • ...and 10 more