PriME: Privacy-aware Membership profile Estimation in networks

Abhinav Chakraborty; Sayak Chatterjee; Sagnik Nandy

TL;DR

This work tackles private estimation of membership profiles in networks generated by the Degree-Corrected Mixed Membership Stochastic Block Model (DCMM) under $\varepsilon$-edge Local Differential Privacy (edge-LDP). It introduces PriME, a privacy-preserving spectral method that first privatizes the observed graph via a symmetric edge-flip mechanism and then recovers the membership matrix $\Pi$ using post-PCA SCORE normalization and a Sketched Vertex Hunting step. The authors establish a minimax lower bound on the estimation risk under the privacy constraint and prove that PriME attains a matching upper bound up to logarithmic factors, which shows that it is minimax optimal under edge-LDP. Experiments on synthetic data and on real networks (Facebook Ego Networks and Political Blogs) illustrate the privacy-utility trade-off and the method’s practical viability, and the paper discusses how degree heterogeneity and signal strength affect performance. The work advances privacy-aware network analysis by providing the first minimax-rate results for private estimation in mixed membership stochastic block models, together with a concrete algorithm that achieves them.

Abstract

This paper presents a novel approach to estimating community membership probabilities for network vertices generated by the Degree Corrected Mixed Membership Stochastic Block Model while preserving individual edge privacy. Operating within the $\varepsilon$-edge local differential privacy framework, we introduce an optimal private algorithm based on a symmetric edge flip mechanism and spectral clustering for accurate estimation of vertex community memberships. We conduct a comprehensive analysis of the estimation risk and establish the optimality of our procedure by providing matching lower bounds to the minimax risk under privacy constraints. To validate our approach, we demonstrate its performance through numerical simulations and its practical application to real-world data. This work represents a significant step forward in balancing accurate community membership estimation with stringent privacy preservation in network data analysis.
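The symmetric edge-flip step mentioned in the abstract can be illustrated with a minimal randomized-response sketch. The flip probability $1/(1+e^{\varepsilon})$ is the standard randomized-response choice and is an assumption here, since this summary does not state the paper's exact constant; the function names `symmetric_edge_flip` and `debias` are hypothetical, and the debiasing step is a generic correction rather than the paper's estimator.

```python
import numpy as np


def symmetric_edge_flip(adj, epsilon, rng=None):
    """Privatize a symmetric 0/1 adjacency matrix by randomized response.

    Each unordered pair {i, j} is flipped independently with probability
    1 / (1 + exp(epsilon)) (assumed constant, not taken from the paper).
    """
    rng = np.random.default_rng() if rng is None else rng
    adj = np.asarray(adj, dtype=np.int64)
    n = adj.shape[0]
    flip_prob = 1.0 / (1.0 + np.exp(epsilon))

    # Work on the strict upper triangle so every pair is flipped once,
    # then mirror it to keep the released matrix symmetric.
    iu = np.triu_indices(n, k=1)
    flips = rng.random(iu[0].size) < flip_prob
    upper = np.where(flips, 1 - adj[iu], adj[iu])

    private = np.zeros((n, n), dtype=np.int64)
    private[iu] = upper
    return private + private.T


def debias(private_adj, epsilon):
    """Rescale the flipped matrix so its off-diagonal mean matches the original.

    Uses E[flipped] = q + (1 - 2q) * original with q the flip probability.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive to debias")
    q = 1.0 / (1.0 + np.exp(epsilon))
    out = (np.asarray(private_adj, dtype=float) - q) / (1.0 - 2.0 * q)
    np.fill_diagonal(out, 0.0)
    return out
```

A spectral method would then be run on the privatized (or debiased) matrix; smaller $\varepsilon$ means more flips and therefore a noisier input to that step.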

Paper Structure (30 sections, 11 theorems, 162 equations, 5 figures, 1 algorithm)

Introduction
Related Literature and our contributions
Paper Organization
Notations and Assumptions
Balancing Privacy and Statistical Utility
Local Differential Privacy and Problem Statement
Minimax Lower Bound to the Estimation Risk under Privacy Constraint
PriME: Optimal Mechanism for privatizing the data and Membership Profile Estimation
Edge Flip Mechanism to enforce Local Differential Privacy
Estimation of the Community Membership Profiles
Analysis of Estimation risk of PriME
Proof of Theorem \ref{thm:upper_bound_informal}
Proof of Theorem \ref{thm:upper_bound_informal}
Experiments
Real Data Examples
...and 15 more sections

Key Result

Theorem 2.1

Given constants $\rho\in(0,1)$ and $a_0\in(0,1),$ consider $\theta \in \mathcal{G}_n(\rho,a_0)$ such that $\mathrm{err}_n\to0$. Furthermore, consider an $\varepsilon \in [0,\varepsilon_0]$ for some $\varepsilon_0>0$ and $\theta_i\in[0,\,C]$ for all $1\le i\le n$ for some constant $C>0$. Then there e where $F_n$ is the empirical distribution function of $\{\theta_1/\widetilde{\theta},\,\theta_2/\wi

Figures (5)

Figure 1: Sample Complexity as a function of estimation error $\alpha$ and privacy budget $\varepsilon$
Figure 2: Plot of average loss $\mathcal{L}(\widehat{\Pi},\Pi)$ over 100 independent replications as functions of $\widetilde{\theta}$ and $\varepsilon$.
Figure 3: Facebook ego network with different communities. Left panel: Communities estimated by MIXED-SCORE from ke2022optimal. Right panel: Communities estimated by PriME with $\varepsilon = 8$.
Figure 4: The political blog network and political affiliations of the blogs. Left panel: Communities estimated by MIXED-SCORE from ke2022optimal. Right panel: Communities estimated by PriME with $\varepsilon = 1.5$. Out of $1222$ political blogs, 33 blogs had different group memberships in private and non-private cases.
Figure 5: Plot of $\mathcal{L}(\widehat{\Pi}_\infty,\,\widehat{\Pi}_\varepsilon)$ as a function of $\varepsilon$ in political blog data.

Theorems & Definitions (26)

Remark 1.1
Definition 2.1: $\varepsilon$-edge LDP
Definition 2.2
Remark 2.1
Remark 2.2
Theorem 2.1
Remark 2.3
Definition 3.1: Symmetric Edge-Flip Mechanism
Theorem 3.1
Remark 3.1
...and 16 more
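
As a worked illustration of how an edge-flip mechanism (Definition 3.1) relates to $\varepsilon$-edge LDP (Definition 2.1), suppose each edge indicator $A_{ij}$ is flipped independently with probability $q_\varepsilon = 1/(1+e^{\varepsilon})$. This constant is the usual randomized-response choice and is an assumption here; the notation $\widetilde{A}_{ij}$ for the released edge and $q_\varepsilon$ for the flip probability is likewise illustrative and may differ from the paper's. For any output $y\in\{0,1\}$ and any two inputs $a,a'\in\{0,1\}$,

$$\frac{\mathbb{P}(\widetilde{A}_{ij}=y \mid A_{ij}=a)}{\mathbb{P}(\widetilde{A}_{ij}=y \mid A_{ij}=a')} \;\le\; \frac{1-q_\varepsilon}{q_\varepsilon} \;=\; e^{\varepsilon},$$

so the released edge satisfies the $\varepsilon$-LDP likelihood-ratio bound. Under the same assumption,

$$\mathbb{E}\big[\widetilde{A}_{ij} \mid A_{ij}\big] \;=\; q_\varepsilon + (1-2q_\varepsilon)\,A_{ij}, \qquad 1-2q_\varepsilon \;=\; \frac{e^{\varepsilon}-1}{e^{\varepsilon}+1},$$

which shows that the signal carried by the privatized graph shrinks by the factor $(e^{\varepsilon}-1)/(e^{\varepsilon}+1)$ as $\varepsilon$ decreases.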
