Correlated Privacy Mechanisms for Differentially Private Distributed Mean Estimation

Sajani Vithana, Viveck R. Cadambe, Flavio P. Calmon, Haewon Jeong

TL;DR

The paper proposes CorDP-DME, a novel DP-DME mechanism based on the correlated Gaussian mechanism that spans the gap between DME with LDP and distributed DP, and proves that it offers a favorable balance between utility and resilience to dropout and collusion.

Abstract

Differentially private distributed mean estimation (DP-DME) is a fundamental building block in privacy-preserving federated learning, where a central server estimates the mean of $d$-dimensional vectors held by $n$ users while ensuring $(\epsilon,\delta)$-DP. Local differential privacy (LDP) and distributed DP with secure aggregation (SA) are the most common notions of DP used in DP-DME settings with an untrusted server. LDP provides strong resilience to dropouts, colluding users, and adversarial attacks, but suffers from poor utility. In contrast, SA-based DP-DME achieves an $O(n)$ utility gain over LDP in DME, but requires increased communication and computation overheads and complex multi-round protocols to handle dropouts and attacks. In this work, we present a generalized framework for DP-DME that captures LDP and SA-based mechanisms as extreme cases. Our framework provides a foundation for developing and analyzing a variety of DP-DME protocols that leverage correlated privacy mechanisms across users. To this end, we propose CorDP-DME, a novel DP-DME mechanism based on the correlated Gaussian mechanism that spans the gap between DME with LDP and distributed DP. We prove that CorDP-DME offers a favorable balance between utility and resilience to dropout and collusion. We provide an information-theoretic analysis of CorDP-DME and derive theoretical guarantees for utility under any given privacy parameters and dropout/colluding user thresholds. Our results demonstrate that (anti-)correlated Gaussian DP mechanisms can significantly improve utility in mean estimation tasks compared to LDP -- even in adversarial settings -- while maintaining better resilience to dropouts and attacks compared to distributed DP.
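
As a rough illustration of why correlation across users' noise matters for utility, the sketch below draws Gaussian noise with a common pairwise correlation $\rho$ and compares the per-coordinate variance of the averaged noise against the closed form $\sigma^2(1+(n-1)\rho)/n$. The function names `correlated_noise` and `mean_noise_variance` are hypothetical, the noise is generated centrally for convenience, and $\sigma^2$ is not calibrated to any $(\epsilon,\delta)$ target, so this is a variance identity for correlated noise and not the CorDP-DME protocol itself.

```python
import numpy as np


def correlated_noise(n, d, sigma2, rho, rng):
    """Draw n noise vectors in R^d (hypothetical helper, not from the paper).

    Each coordinate has variance sigma2 for every user and correlation rho
    between any two users. Requires -1/(n-1) <= rho <= 1 so that the
    covariance matrix is positive semidefinite.
    """
    cov = sigma2 * ((1.0 - rho) * np.eye(n) + rho * np.ones((n, n)))
    # Symmetric eigendecomposition also covers the singular case rho = -1/(n-1).
    eigvals, eigvecs = np.linalg.eigh(cov)
    root = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    # Each column of the result is one coordinate, with covariance `cov` across users.
    return root @ rng.standard_normal((n, d))


def mean_noise_variance(n, sigma2, rho):
    """Per-coordinate variance of the averaged noise (1/n) * sum_i Z_i."""
    return sigma2 * (1.0 + (n - 1) * rho) / n


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, sigma2 = 10, 20000, 2.0
    for rho in (0.0, -0.05, -0.1):
        z = correlated_noise(n, d, sigma2, rho, rng)
        empirical = z.mean(axis=0).var()
        print(rho, empirical, mean_noise_variance(n, sigma2, rho))
```

With $\rho=0$ (independent noise) the averaged noise has variance $\sigma^2/n$, and as $\rho$ approaches $-1/(n-1)$ the noise cancels in the average at fixed $\sigma^2$. In an actual mechanism the privacy constraint couples $\sigma^2$ and $\rho$, which this sketch does not model.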

Paper Structure (35 sections, 11 theorems, 158 equations, 6 figures, 2 tables)

Key Result

Proposition 1

For any fixed $\mathcal{U}\subseteq\mathcal{U}_{all}$ satisfying $|\mathcal{U}|\geq t$, and for any $\sigma^2$, $\rho$, the optimum decoder is: The corresponding MSE is given by,

Figures (6)

  • Figure 1: MSE of LDP, CDP and CorDP-DME with different numbers of responding users for a setting with $n=100$, $\epsilon=2$, $\delta=10^{-5}$. CorDP-DME coincides with CDP when all users respond. All three mechanisms coincide when only one user responds. CorDP-DME always outperforms LDP. In general, CorDP-DME spans the gap between DME with LDP and CDP.
  • Figure 2: System model: Each user sends a perturbed vector $M(\mathbf{x}_i)$ and the central server decodes the mean through linear decoding, using the uploads of the responding users. We assume that there can be up to $c$ colluding users and $n-t$ dropouts. The server learns all the random variables observed by the colluding users.
  • Figure 3: Geometric interpretation of the privacy-utility trade-off in DP-DME for different correlation coefficients among the users' noise distributions: Noise vectors $\mathbf{Z}_1$ and $\mathbf{Z}_2$ are represented as vectors in $\mathcal{H}$ with magnitude $\sigma \sqrt{d}$ and angle $\theta=\cos^{-1}\rho$ between them. The privacy constraint on $\mathbf{x}_i$ enforces that the orthogonal component of $\mathbf{Z}_i$ relative to $\mathbf{Z}_j$ (for $i \neq j$) has a magnitude bounded below by a constant $\gamma_{\epsilon, \delta}$. The MSE is proportional to the magnitude of $\mathbf{Z}_1+\mathbf{Z}_2$.
  • Figure 4: Variation of the MSE with changing noise parameters $\sigma^2$ and $\rho$: Increasing $\|\mathbf{Z}_i\|_{\mathcal{H}}=\sigma\sqrt{d}$ and $\theta=\cos^{-1}\rho$, while keeping the orthogonal distance between $\mathbf{Z}_1$ and $\mathbf{Z}_2$ at $\gamma_{\epsilon,\delta}$ to preserve privacy, decreases $\|\mathbf{Z}_1+\mathbf{Z}_2\|_{\mathcal{H}}$ (the MSE).
  • Figure 5: Overview of the offline phase of CorDP-DME in a three-user example: Each user $i$ sends secure information to other users $j$ (denoted by $\mathcal{I}_{i,j}$) to determine the common random seeds required for both users $i$ and $j$ to generate the same shared random variable $\mathbf{S}_{i,j}$.
  • ...and 1 more figure
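
The geometry described in the captions of Figures 3 and 4 can be worked out explicitly for two users. The following is a sketch that uses only the caption's quantities ($\|\mathbf{Z}_i\|_{\mathcal{H}}=\sigma\sqrt{d}$, angle $\theta=\cos^{-1}\rho$, lower bound $\gamma_{\epsilon,\delta}$). It assumes that the "orthogonal component" is the usual projection residual and that the MSE scales with the squared norm of the sum; it is not the paper's formal statement.

$$
\left\|\mathbf{Z}_i-\mathrm{proj}_{\mathbf{Z}_j}\mathbf{Z}_i\right\|_{\mathcal{H}}
=\sigma\sqrt{d}\,\sin\theta
=\sigma\sqrt{d}\,\sqrt{1-\rho^2}\;\geq\;\gamma_{\epsilon,\delta},
\qquad
\|\mathbf{Z}_1+\mathbf{Z}_2\|_{\mathcal{H}}^2
=2\sigma^2 d\,(1+\rho).
$$

If the privacy constraint is met with equality, then $\sigma^2 d=\gamma_{\epsilon,\delta}^2/(1-\rho^2)$ for $\rho\in(-1,1)$, and substituting gives

$$
\|\mathbf{Z}_1+\mathbf{Z}_2\|_{\mathcal{H}}^2
=\frac{2\gamma_{\epsilon,\delta}^2\,(1+\rho)}{1-\rho^2}
=\frac{2\gamma_{\epsilon,\delta}^2}{1-\rho}.
$$

Under these assumptions the squared norm of the sum equals $2\gamma_{\epsilon,\delta}^2$ at $\rho=0$ and decreases toward $\gamma_{\epsilon,\delta}^2$ as $\rho\to-1$, while the per-user noise magnitude $\sigma\sqrt{d}=\gamma_{\epsilon,\delta}/\sqrt{1-\rho^2}$ grows. This matches the trend stated in the Figure 4 caption.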

Theorems & Definitions (25)

  • Definition 1: Generalized $(\epsilon,\delta)$-DP for DME
  • Definition 2: $(n,t,c,\epsilon,\delta)$-DP-DME scheme
  • Definition 3: MSE of an $(n,t,c,\epsilon,\delta)$-DP-DME scheme
  • Proposition 1: Optimum decoder
  • Theorem 1: Optimum noise distribution
  • Proposition 2: Bounds on $\sigma^2_{\epsilon,\delta}$
  • Corollary 1: Without collusion, without dropouts
  • Corollary 2: Without collusion, with dropouts
  • Corollary 3: With collusion, without dropouts
  • Theorem 2: Unbiased mean estimate
  • ...and 15 more