Spectral Graph Clustering under Differential Privacy: Balancing Privacy, Accuracy, and Efficiency
Mohamed Seif, Antti Koskela, H. Vincent Poor, Andrea J. Goldsmith
TL;DR
This work addresses spectral clustering under edge differential privacy by introducing three DP mechanisms that preserve spectral structure: matrix shuffling (randomized response (RR) on the adjacency matrix plus permutation-based amplification), a projection-based Gaussian mechanism (a lower-dimensional, memory-efficient embedding with DP guarantees), and a noisy power method (an iterative DP embedding). Each method comes with explicit misclassification bounds tied to eigengaps and embedding margins, showing how privacy budgets, graph structure, and projection choices influence accuracy. The matrix shuffling approach yields the strongest utility at a given privacy level, the projection method offers memory and computation savings, and the noisy power method balances privacy and efficiency in dense graphs with favorable eigengap scaling. Experimental results on SBM, Facebook Circles, and Cora validate the theoretical trade-offs and demonstrate practical privacy-utility-efficiency considerations for large graphs.
Abstract
We study the problem of spectral graph clustering under edge differential privacy (DP). Specifically, we develop three mechanisms. (i) Graph perturbation via randomized edge flipping combined with adjacency matrix shuffling, which enforces edge privacy while preserving key spectral properties of the graph. Shuffling considerably amplifies the privacy guarantee: whereas flipping edges with a fixed probability alone provides only a constant epsilon edge DP guarantee as the number of nodes grows, the shuffled mechanism achieves (epsilon, delta) edge DP with parameters that tend to zero as the number of nodes increases. (ii) Private graph projection with additive Gaussian noise in a lower-dimensional space, which reduces dimensionality and computational complexity. (iii) A noisy power iteration method that distributes Gaussian noise across iterations to ensure edge DP while maintaining convergence. Our analysis provides rigorous privacy guarantees and a precise characterization of the misclassification error rate. Experiments on synthetic and real-world networks validate our theoretical analysis and illustrate the practical privacy-utility trade-offs.
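As a rough illustration of mechanism (i), the sketch below flips each potential edge independently and then relabels the nodes with a uniformly random permutation. It is a minimal sketch under stated assumptions, not the paper's implementation: the flip probability 1/(1 + e^epsilon) is the standard symmetric randomized-response choice, the shuffle is assumed to be a node relabeling, and the function names (`sample_two_block_sbm`, `randomized_response`, `shuffle_nodes`, `spectral_embedding`) are hypothetical. The amplified (epsilon, delta) accounting from the analysis is not reproduced here.

```python
import numpy as np


def sample_two_block_sbm(n, p_in, p_out, rng):
    """Toy two-community stochastic block model (n assumed even)."""
    labels = np.repeat([0, 1], n // 2)
    probs = np.where(labels[:, None] == labels[None, :], p_in, p_out)
    iu = np.triu_indices(n, k=1)
    adj = np.zeros((n, n), dtype=int)
    adj[iu] = rng.random(iu[0].size) < probs[iu]
    return adj + adj.T, labels


def randomized_response(adj, epsilon, rng):
    """Flip each upper-triangular entry independently, then symmetrize.

    Assumption: flip probability 1 / (1 + exp(epsilon)), the standard
    symmetric randomized-response setting for epsilon edge DP.
    """
    n = adj.shape[0]
    p_flip = 1.0 / (1.0 + np.exp(epsilon))
    iu = np.triu_indices(n, k=1)
    flips = rng.random(iu[0].size) < p_flip
    noisy = np.zeros((n, n), dtype=int)
    noisy[iu] = adj[iu].astype(int) ^ flips.astype(int)
    return noisy + noisy.T


def shuffle_nodes(adj, rng):
    """Relabel nodes with a uniformly random permutation (assumed shuffle)."""
    perm = rng.permutation(adj.shape[0])
    return adj[np.ix_(perm, perm)], perm


def spectral_embedding(adj, k):
    """Eigenvectors of the k largest-magnitude eigenvalues of adj."""
    vals, vecs = np.linalg.eigh(adj.astype(float))
    top = np.argsort(np.abs(vals))[::-1][:k]
    return vecs[:, top]


rng = np.random.default_rng(0)
adj, labels = sample_two_block_sbm(200, p_in=0.5, p_out=0.05, rng=rng)
private_adj = randomized_response(adj, epsilon=2.0, rng=rng)
shuffled_adj, perm = shuffle_nodes(private_adj, rng)
embedding = spectral_embedding(shuffled_adj, k=2)  # rows follow shuffled labels
```

Relabeling nodes is a permutation similarity transform, so the shuffled matrix has the same eigenvalues as the perturbed one, which is one way to see why shuffling can leave the spectral structure intact. Clustering the rows of `embedding` (for example with k-means) would recover communities only up to the relabeling stored in `perm`.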
