Spectral Graph Clustering under Differential Privacy: Balancing Privacy, Accuracy, and Efficiency

Mohamed Seif, Antti Koskela, H. Vincent Poor, Andrea J. Goldsmith

TL;DR

This work addresses spectral clustering under edge differential privacy by introducing three DP mechanisms that preserve spectral structure: matrix shuffling (randomized response (RR) on the adjacency matrix plus permutation-based amplification), a projection-based Gaussian mechanism (a lower-dimensional, memory-efficient embedding with DP guarantees), and a noisy power method (an iterative DP embedding). Each method comes with explicit misclassification bounds tied to eigengaps and embedding margins, showing how privacy budgets, graph structure, and projection choices influence accuracy. The matrix shuffling approach yields the strongest utility at a given privacy level, the projection method offers memory and computation savings, and the noisy power method balances privacy and efficiency in dense graphs with favorable eigengap scaling. Experimental results on SBM, Facebook Circles, and Cora validate the theoretical trade-offs and demonstrate practical privacy-utility-efficiency considerations for large graphs.

Abstract

We study the problem of spectral graph clustering under edge differential privacy (DP). Specifically, we develop three mechanisms: (i) graph perturbation via randomized edge flipping combined with adjacency matrix shuffling, which enforces edge privacy while preserving key spectral properties of the graph; importantly, shuffling considerably amplifies the guarantees: whereas flipping edges with a fixed probability alone provides only a constant epsilon edge DP guarantee as the number of nodes grows, the shuffled mechanism achieves (epsilon, delta) edge DP with parameters that tend to zero as the number of nodes increases; (ii) private graph projection with additive Gaussian noise in a lower-dimensional space to reduce dimensionality and computational complexity; and (iii) a noisy power iteration method that distributes Gaussian noise across iterations to ensure edge DP while maintaining convergence. Our analysis provides rigorous privacy guarantees and a precise characterization of the misclassification error rate. Experiments on synthetic and real-world networks validate our theoretical analysis and illustrate the practical privacy-utility trade-offs.
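
As a rough illustration of mechanism (i), the sketch below flips each potential edge independently and then relabels the nodes with a uniformly random permutation. It assumes the standard randomized-response calibration, flip probability $1/(1+e^{\varepsilon_0})$; the function names, the toy graph, and this calibration are illustrative assumptions and are not taken from the paper. The value $\varepsilon_0 = 2.2$ matches the one quoted in the Figure 2 caption.

```python
import numpy as np

def randomized_edge_flip(adj, eps0, rng):
    """Flip each potential edge independently (randomized response).

    Assumption: flip probability p = 1 / (1 + exp(eps0)), the textbook
    randomized-response calibration for a single bit.
    """
    n = adj.shape[0]
    p = 1.0 / (1.0 + np.exp(eps0))
    iu = np.triu_indices(n, k=1)            # strictly upper triangle
    flips = rng.random(iu[0].size) < p      # True where the bit is flipped
    upper = adj[iu].astype(int) ^ flips.astype(int)
    out = np.zeros((n, n), dtype=int)
    out[iu] = upper
    return out + out.T                      # symmetric, zero diagonal

def shuffle_adjacency(adj, rng):
    """Apply one random permutation to rows and columns (P A P^T)."""
    perm = rng.permutation(adj.shape[0])
    return adj[np.ix_(perm, perm)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy graph (hypothetical): two triangles joined by one edge.
    adj = np.zeros((6, 6), dtype=int)
    for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
        adj[i, j] = adj[j, i] = 1
    noisy = randomized_edge_flip(adj, eps0=2.2, rng=rng)
    shuffled = shuffle_adjacency(noisy, rng)
    # A permutation similarity leaves the spectrum unchanged.
    print(np.allclose(np.linalg.eigvalsh(noisy.astype(float)),
                      np.linalg.eigvalsh(shuffled.astype(float))))
```

Because the shuffle is a similarity transformation, the eigenvalues of the perturbed matrix are unchanged; only the node labelling is hidden.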

Paper Structure

This paper contains 42 sections, 25 theorems, 131 equations, 4 figures, 2 tables, 2 algorithms.

Key Result

Lemma 2.1

For a given $\varepsilon \geq 0$, the tight $\delta(\varepsilon)$ is given by $\delta(\varepsilon) = \max_{G \sim G'} H_{e^{\varepsilon}}(\hat{\bm{\sigma}}(G')||\hat{\bm{\sigma}}(G))$, where $G \sim G'$ denotes that $G$ and $G'$ differ in one edge.
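
To make the expression concrete, the sketch below reads $H_{\alpha}$ as the hockey-stick divergence (the usual meaning of this notation in tight DP accounting), which for discrete distributions is $H_{\alpha}(P||Q) = \sum_x \max(P(x) - \alpha Q(x), 0)$. It evaluates $\delta(\varepsilon)$ for a single randomized-response bit, not for the clustering output $\hat{\bm{\sigma}}$ in the lemma; the function names and the choice of example are assumptions made for illustration only.

```python
import numpy as np

def hockey_stick(p, q, alpha):
    """Hockey-stick divergence H_alpha(P || Q) for discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(np.maximum(p - alpha * q, 0.0)))

def rr_delta(eps, eps0):
    """Tight delta(eps) for one randomized-response bit that is eps0-DP.

    Assumption: flip probability 1 / (1 + exp(eps0)); the two neighbouring
    inputs are "edge present" and "edge absent".
    """
    flip = 1.0 / (1.0 + np.exp(eps0))
    dist_edge = [flip, 1.0 - flip]       # P(output=0), P(output=1) if edge present
    dist_no_edge = [1.0 - flip, flip]    # same, if edge absent
    alpha = np.exp(eps)
    # Maximise over both orderings of the neighbouring pair.
    return max(hockey_stick(dist_edge, dist_no_edge, alpha),
               hockey_stick(dist_no_edge, dist_edge, alpha))

if __name__ == "__main__":
    for eps in (0.0, 1.0, 2.2):
        print(eps, rr_delta(eps, eps0=2.2))
```

For this toy mechanism $\delta(\varepsilon)$ decreases in $\varepsilon$ and reaches zero at $\varepsilon = \varepsilon_0$, which is the sense in which the unshuffled mechanism is $\varepsilon_0$-DP.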

Figures (4)

  • Figure 1: Synopsis of results for the three different mechanisms: error rate vs. $\varepsilon$.
  • Figure 2: Left: Eigenvalues of the perturbed adjacency matrix and of the same matrix after a random permutation similarity transformation. Right: the $(\varepsilon,\delta)$-DP guarantees of the perturbed and shuffled result, when the perturbed-only result is $\varepsilon_0$-DP with $\varepsilon_0=2.2$.
  • Figure 3: Ablation on the number of iterations $N$ in the noisy power method across all four datasets. Each panel reports error rate as a function of $N$, averaged over 50 runs.
  • Figure 4: Ablation on the projection dimension $m$ for the matrix projection method. Results are reported for the two datasets where the mechanism is effective.
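
The two ablation parameters in Figures 3 and 4, the iteration count $N$ and the projection dimension $m$, can be pictured with the following minimal sketch. It is a generic noisy orthogonal iteration and a generic Gaussian random projection with additive noise, not the paper's algorithms: the function names, the projection scaling, and the noise scale `sigma` are hypothetical, and `sigma` is left as a free parameter because its calibration to $(\varepsilon, \delta)$ is not given in this summary. The code therefore carries no privacy guarantee.

```python
import numpy as np

def noisy_power_method(adj, k, num_iters, sigma, rng):
    """Orthogonal iteration with Gaussian noise added at every step.

    num_iters plays the role of N. sigma is an uncalibrated, hypothetical
    noise scale (assumption); no DP guarantee is claimed here.
    """
    n = adj.shape[0]
    x, _ = np.linalg.qr(rng.standard_normal((n, k)))   # random orthonormal start
    for _ in range(num_iters):
        y = adj @ x + sigma * rng.standard_normal((n, k))
        x, _ = np.linalg.qr(y)                         # re-orthonormalise
    return x                                           # n x k embedding

def noisy_projection(adj, m, sigma, rng):
    """Project the adjacency matrix to m columns, then add Gaussian noise.

    Assumption: a dense Gaussian projection scaled by 1/sqrt(m).
    """
    n = adj.shape[0]
    proj = rng.standard_normal((n, m)) / np.sqrt(m)
    return adj @ proj + sigma * rng.standard_normal((n, m))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy graph (hypothetical): two disjoint cliques of 10 nodes each.
    adj = np.zeros((20, 20))
    adj[:10, :10] = 1.0
    adj[10:, 10:] = 1.0
    np.fill_diagonal(adj, 0.0)
    emb = noisy_power_method(adj, k=2, num_iters=10, sigma=0.1, rng=rng)
    low = noisy_projection(adj, m=5, sigma=0.1, rng=rng)
    print(emb.shape, low.shape)
```

In this picture, a larger $N$ gives the iteration more time to converge but injects noise more times, and a smaller $m$ saves memory at the cost of a coarser embedding.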

Theorems & Definitions (34)

  • Definition 1: $(\beta, \eta)$-Accurate Recovery
  • Definition 2: $(\varepsilon, \delta)$-edge DP
  • Lemma 2.1
  • Definition 3: Graph Perturbation Mechanism
  • Theorem 3.1
  • Corollary 3.1
  • Lemma 3.1: Matrix Representation for $\widetilde{\mathbf{A}}$
  • Lemma 3.2: Bounding the Spectral Norm of $\mathbf{Z}$
  • Lemma 3.3: Davis--Kahan
  • Lemma 3.4: Procrustes alignment
  • ...and 24 more