Local Differential Privacy-Preserving Spectral Clustering for General Graphs

Sayan Mukherjee; Vorapong Suppakitpaisarn

TL;DR

This work studies how robust spectral clustering is under local differential privacy on general graphs, with privacy provided by edge flipping. Its main theorem shows that, for flip probability $p = O(\log n / n)$, the spectral clustering output changes by only $O(\eta(G)\cdot n)$ in the $d_{\text{size}}$ metric, where $\eta(G)$ is the graph's spectral robustness. Key contributions are the spectral-robustness parameter and a tight analysis showing that the best privacy budget obtainable in the worst case is $\Theta(\log n)$, complemented by an instability result for larger flip probabilities. Empirically, the authors validate the theory on real social networks, showing that clustering stays stable under modest edge perturbations and where stability breaks down as the flipping probability grows.

Abstract

Spectral clustering is a widely used algorithm to find clusters in networks. Several researchers have studied the stability of spectral clustering under local differential privacy with the additional assumption that the underlying networks are generated from the stochastic block model (SBM). However, we argue that this assumption is too restrictive since social networks do not originate from the SBM. Thus, we delve into an analysis for general graphs in this work. Our primary focus is the edge flipping method -- a common technique for protecting local differential privacy. We show that, when the edges of an $n$-vertex graph satisfying some reasonable well-clustering assumptions are flipped with a probability of $O(\log n/n)$, the clustering outcomes are largely consistent. Empirical tests further corroborate these theoretical findings. Conversely, although clustering outcomes have been stable for non-sparse and well-clustered graphs produced from the SBM, we show that in general, spectral clustering may yield highly erratic results on certain well-clustered graphs when the flipping probability is $\omega(\log n/n)$. This indicates that the best privacy budget obtainable for general graphs is $\Theta(\log n)$.
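The link between the flipping probability and the privacy budget can be made explicit with a short calculation. This is a sketch under an assumption not spelled out in this excerpt: each potential edge is reported through symmetric randomized response, flipped with probability $p$, under the standard calibration $p = 1/(1+e^{\varepsilon})$. The constant $c > 0$ is introduced here only for illustration.

$$
p = \frac{1}{1+e^{\varepsilon}}
\;\Longleftrightarrow\;
\varepsilon = \ln\frac{1-p}{p},
\qquad
p = \frac{c\log n}{n}
\;\Longrightarrow\;
\varepsilon = \ln\!\left(\frac{n}{c\log n} - 1\right) = \ln n - \ln\log n - \ln c + o(1) = \Theta(\log n).
$$

Under this calibration, a flipping probability of order $\log n/n$ corresponds to a privacy budget of order $\log n$, and any asymptotically larger flipping probability corresponds to a smaller budget.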

Paper Structure (21 sections, 14 theorems, 34 equations, 5 figures)

Introduction
Our Contribution
Preliminaries
Notation
Edge-subsets.
Cuts.
Spectral Graph Theory.
Edge Differential Privacy under Randomized Response
Spectral Clustering
Concentration Inequalities
Assumptions
Main Theorem
Proof Structure.
The term $d_{\text{size}}(S, S^\ast)$.
The term $d_{\text{size}}(S^\ast_F, S_F)$.
...and 6 more sections

Key Result

Theorem 1.1

Let $G'$ be obtained from $G$ via the edge flipping mechanism with probability $p=O(\log n/n)$. Then, under some reasonable assumptions, the number of vertices misclassified by the spectral clustering algorithm by running it on $G'$ instead of $G$ is $O(\eta(G)\cdot n)$ with probability $1-o(1)$, wh
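The experiment behind this statement can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the sign-based split of the second eigenvector is a simplification swapped in for whatever spectral clustering variant the paper analyzes, and `partition_distance` is an assumed stand-in for $d_{\text{size}}$ (the number of vertices on which two bipartitions disagree, up to swapping the two sides), since the metric is not defined in this excerpt.

```python
import numpy as np


def flip_edges(adj, p, rng):
    """Flip every unordered vertex pair independently with probability p."""
    n = adj.shape[0]
    upper = np.triu(rng.random((n, n)) < p, k=1)  # one coin per pair i < j
    flips = upper | upper.T                       # symmetric, empty diagonal
    return np.where(flips, 1 - adj, adj)


def spectral_bipartition(adj):
    """Split vertices by the sign of the second eigenvector of the
    normalized Laplacian (a simplified stand-in for spectral clustering)."""
    deg = adj.sum(axis=1).astype(float)
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5      # avoid dividing by zero
    lap = np.eye(len(deg)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)                 # eigenvalues in ascending order
    return vecs[:, 1] >= 0


def partition_distance(a, b):
    """Assumed proxy for d_size: disagreements up to swapping the sides."""
    return int(min(np.sum(a != b), np.sum(a == b)))


# Toy input (an assumption for illustration): two 20-cliques joined by one edge.
n, half = 40, 20
adj = np.zeros((n, n), dtype=int)
adj[:half, :half] = 1
adj[half:, half:] = 1
np.fill_diagonal(adj, 0)
adj[half - 1, half] = adj[half, half - 1] = 1
truth = np.arange(n) < half

rng = np.random.default_rng(0)
noisy = flip_edges(adj, 0.01, rng)
print(partition_distance(spectral_bipartition(adj), spectral_bipartition(noisy)))
```

Repeating the last three lines over many seeds and several values of `p` gives a rough picture of how the disagreement grows with the flipping probability on a given graph.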

Figures (5)

Figure 1.1: A part of the Facebook network detailed in leskovec2012learning before and after flipping edges with a probability of $0.005$. The neighborhood (colored light pink) of node $86$ changes a lot after the flipping.
Figure 4.1: The graph $G$ (left) and the graph $G\triangle F$ where $F \cong \mathcal{G}(n,p)$ (right). Dashed lines represent probabilistic edges between the parts $A$, $B$ and $C$.
Figure 5.1: (a) The social network Facebook1684 obtained from SNAP after pruning. Each node was assigned a color based on the spectral clustering outcomes. (b) We generated 100 graphs from the first graph Facebook0 (Figure \ref{fig:JMLR-figure1}), and plotted the worst discrepancy $d_{\text{size}}$ between the outputs of the spectral clustering of the original and perturbed graphs for these 100 random runs.
Figure 5.2: The robustness results of the social networks upon the introduction of a flipping probability that exceeds the value specified in Assumption \ref{assumptions-preliminary} for (a) the Facebook0 network and (b) the Facebook1684 network.
Figure 5.3: The average $d_{\text{size}}$ results of the social networks upon the introduction of a flipping probability that exceeds the value specified in Assumption \ref{assumptions-preliminary} for (a) the Facebook0 network and (b) the Facebook1684 network.

Theorems & Definitions (26)

Theorem 1.1: Informal version of Theorem \ref{thm:mainthm-robustness}
Remark 1.2
Remark 1.3
Definition 2.1: Spectral robustness
Definition 2.2: $\varepsilon$-edge differential privacy; nissim2007smooth
Theorem 2.3: wang2016using
Lemma 2.4
proof
Lemma 2.5: Cheeger's Inequality; Cheeger1971Laplacian, alon1986eigenvalues
Lemma 2.6: Improved Cheeger Inequality; kwok2013improvedcheeger
...and 16 more
