Correlated Privacy Mechanisms for Differentially Private Distributed Mean Estimation

Sajani Vithana; Viveck R. Cadambe; Flavio P. Calmon; Haewon Jeong

TL;DR

The paper proposes CorDP-DME, a novel DP-DME mechanism based on the correlated Gaussian mechanism that spans the gap between DME with LDP and DME with distributed DP, and proves that CorDP-DME offers a favorable balance between utility and resilience to dropout and collusion.

Abstract

Differentially private distributed mean estimation (DP-DME) is a fundamental building block in privacy-preserving federated learning, where a central server estimates the mean of $d$-dimensional vectors held by $n$ users while ensuring $(\epsilon,\delta)$-DP. Local differential privacy (LDP) and distributed DP with secure aggregation (SA) are the most common notions of DP used in DP-DME settings with an untrusted server. LDP provides strong resilience to dropouts, colluding users, and adversarial attacks, but suffers from poor utility. In contrast, SA-based DP-DME achieves an $O(n)$ utility gain over LDP in DME, but requires increased communication and computation overheads and complex multi-round protocols to handle dropouts and attacks. In this work, we present a generalized framework for DP-DME that captures LDP and SA-based mechanisms as extreme cases. Our framework provides a foundation for developing and analyzing a variety of DP-DME protocols that leverage correlated privacy mechanisms across users. To this end, we propose CorDP-DME, a novel DP-DME mechanism based on the correlated Gaussian mechanism that spans the gap between DME with LDP and distributed DP. We prove that CorDP-DME offers a favorable balance between utility and resilience to dropout and collusion. We provide an information-theoretic analysis of CorDP-DME, and derive theoretical guarantees for utility under any given privacy parameters and dropout/colluding user thresholds. Our results demonstrate that (anti-)correlated Gaussian DP mechanisms can significantly improve utility in mean estimation tasks compared to LDP -- even in adversarial settings -- while maintaining better resilience to dropouts and attacks compared to distributed DP.
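To make the role of anti-correlation concrete, the following is a minimal Python sketch of equicorrelated Gaussian noise across users. It is not the CorDP-DME noise-generation protocol from the paper: the function names (`sum_noise_variance`, `sample_correlated_noise`), the construction from a centered term plus a shared term, and the parameter values are illustrative assumptions, and no calibration of $\sigma^2$ to $(\epsilon,\delta)$ is attempted. It only shows that the per-coordinate variance of the summed noise is $n\sigma^2(1+(n-1)\rho)$, which shrinks as $\rho$ becomes more negative and vanishes at $\rho=-1/(n-1)$.

```python
import math
import random


def sum_noise_variance(n, sigma2, rho):
    """Per-coordinate variance of Z_1 + ... + Z_n when every Z_i has
    variance sigma2 and every pair (Z_i, Z_j) has correlation rho."""
    return n * sigma2 * (1.0 + (n - 1) * rho)


def sample_correlated_noise(n, d, sigma2, rho, rng):
    """Illustrative sampler (an assumption, not the paper's protocol).

    Returns n lists of length d. Each entry is N(0, sigma2); the same
    coordinate of two different users has correlation rho.
    """
    if n < 2 or not (-1.0 / (n - 1) - 1e-12 <= rho <= 1.0):
        raise ValueError("need n >= 2 and -1/(n-1) <= rho <= 1")
    # Z_i = a * (G_i - mean(G)) + b * H, with G_i and H i.i.d. N(0, 1).
    # Var(Z_i) = a^2 (1 - 1/n) + b^2 = sigma2
    # Cov(Z_i, Z_j) = -a^2 / n + b^2 = rho * sigma2
    a = math.sqrt(sigma2 * (1.0 - rho))
    b = math.sqrt(max(sigma2 * (1.0 + (n - 1) * rho) / n, 0.0))
    noise = [[0.0] * d for _ in range(n)]
    for k in range(d):
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]
        g_bar = sum(g) / n
        h = rng.gauss(0.0, 1.0)  # component shared by all users
        for i in range(n):
            noise[i][k] = a * (g[i] - g_bar) + b * h
    return noise


if __name__ == "__main__":
    n, sigma2 = 10, 1.0
    for rho in (0.0, -0.05, -1.0 / (n - 1)):
        print(rho, sum_noise_variance(n, sigma2, rho))
```

With $\rho=0$ the sketch reduces to independent per-user noise, while $\rho=-1/(n-1)$ makes the noise cancel in the sum when all $n$ users respond. How much anti-correlation can be used under dropouts and collusion is the question the paper's analysis addresses; the sketch does not model it.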

Paper Structure (35 sections, 11 theorems, 158 equations, 6 figures, 2 tables)

Introduction
Our Contributions
Related Work
DME with LDP
Correlated noise in DP-DME
Secure Aggregation (SA)
Problem Formulation
Correlated Gaussian Mechanism for DP-DME: A Geometric Interpretation
Proposed Approach: CorDP-DME Protocol
Main Results
Discussion on CorDP-DME
Example
Noise generation in CorDP-DME
Experiments
Conclusions and Limitations
...and 20 more sections

Key Result

Proposition 1

For any fixed $\mathcal{U}\subseteq\mathcal{U}_{all}$ satisfying $|\mathcal{U}|\geq t$, and for any $\sigma^2$, $\rho$, the optimum decoder is: The corresponding MSE is given by,

Figures (6)

Figure 1: MSE of LDP, CDP and CorDP-DME with different numbers of responding users for a setting with $n=100$, $\epsilon=2$, $\delta=10^{-5}$. CorDP-DME coincides with CDP when all users respond. All three mechanisms coincide when only one user responds. CorDP-DME always outperforms LDP. In general, CorDP-DME spans the gap between DME with LDP and CDP.
Figure 2: System model: Each user sends a perturbed vector $M(\mathbf{x}_i)$ and the central server decodes the mean through linear decoding, using the uploads of the responding users. We assume that there can be up to $c$ colluding users and $n-t$ dropouts. The server learns all the random variables observed by the colluding users.
Figure 3: Geometric interpretation of the privacy-utility trade-off in DP-DME for different correlation coefficients among the users' noise distributions: Noise vectors $\mathbf{Z}_1$ and $\mathbf{Z}_2$ are represented as vectors in $\mathcal{H}$ with magnitude $\sigma \sqrt{d}$ and angle $\theta=\cos^{-1}\rho$ between them. The privacy constraint on $\mathbf{x}_i$ enforces that the orthogonal component of $\mathbf{Z}_i$ relative to $\mathbf{Z}_j$ (for $i \neq j$) has a magnitude bounded below by a constant $\gamma_{\epsilon, \delta}$. The MSE is proportional to the magnitude of $\mathbf{Z}_1+\mathbf{Z}_2$.
Figure 4: Variation of the MSE with changing noise parameters $\sigma^2$ and $\rho$: Increasing $\|\mathbf{Z}_i\|_{\mathcal{H}}=\sigma\sqrt{d}$ and $\theta=\cos^{-1}\rho$ while maintaining the orthogonal distance between $\mathbf{Z}_1$ and $\mathbf{Z}_2$ at $\gamma_{\epsilon,\delta}$ for privacy, decreases $\|\mathbf{Z}_1+\mathbf{Z}_2\|_{\mathcal{H}}$ (the MSE).
Figure 5: Overview of the offline phase of CorDP-DME in a three-user example: Each user $i$ sends secure information to other users $j$ (denoted by $\mathcal{I}_{i,j}$) to determine the common random seeds required for both users $i$ and $j$ to generate the same shared random variable $\mathbf{S}_{i,j}$.
...and 1 more figures
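The two-user geometry described in the captions of Figures 3 and 4 can be worked out explicitly. The derivation below is a sketch that uses only the quantities named in those captions ($\|\mathbf{Z}_i\|_{\mathcal{H}}=\sigma\sqrt{d}$, $\theta=\cos^{-1}\rho$, and the lower bound $\gamma_{\epsilon,\delta}$ on the orthogonal component). Taking the privacy constraint with equality is an assumption made for illustration, not a statement of the paper's theorem.

```latex
\begin{align*}
\|\mathbf{Z}_1+\mathbf{Z}_2\|_{\mathcal{H}}^2
  &= \|\mathbf{Z}_1\|_{\mathcal{H}}^2 + \|\mathbf{Z}_2\|_{\mathcal{H}}^2
     + 2\,\|\mathbf{Z}_1\|_{\mathcal{H}}\|\mathbf{Z}_2\|_{\mathcal{H}}\cos\theta
   = 2\sigma^2 d\,(1+\rho), \\
\text{orthogonal component of } \mathbf{Z}_i \text{ relative to } \mathbf{Z}_j:\quad
  & \sigma\sqrt{d}\,\sin\theta = \sigma\sqrt{d}\,\sqrt{1-\rho^2}
    \;\geq\; \gamma_{\epsilon,\delta}, \\
\text{with equality:}\quad
  & \sigma^2 d = \frac{\gamma_{\epsilon,\delta}^2}{1-\rho^2}
  \;\;\Longrightarrow\;\;
  \|\mathbf{Z}_1+\mathbf{Z}_2\|_{\mathcal{H}}^2
   = \frac{2\gamma_{\epsilon,\delta}^2\,(1+\rho)}{1-\rho^2}
   = \frac{2\gamma_{\epsilon,\delta}^2}{1-\rho}.
\end{align*}
```

Under this reading, moving $\rho$ from $0$ toward $-1$ lowers $\|\mathbf{Z}_1+\mathbf{Z}_2\|_{\mathcal{H}}^2$ from $2\gamma_{\epsilon,\delta}^2$ toward $\gamma_{\epsilon,\delta}^2$, while the per-user variance $\sigma^2=\gamma_{\epsilon,\delta}^2/(d(1-\rho^2))$ grows. This matches the trend stated in the Figure 4 caption: larger $\sigma\sqrt{d}$ and $\theta$ at a fixed orthogonal distance give a smaller sum.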

Theorems & Definitions (25)

Definition 1: Generalized $(\epsilon,\delta)$-DP for DME
Definition 2: $(n,t,c,\epsilon,\delta)$-DP-DME scheme
Definition 3: MSE of an $(n,t,c,\epsilon,\delta)$-DP-DME scheme
Proposition 1: Optimum decoder
Theorem 1: Optimum noise distribution
Proposition 2: Bounds on $\sigma^2_{\epsilon,\delta}$
Corollary 1: Without collusion, without dropouts
Corollary 2: Without collusion, with dropouts
Corollary 3: With collusion, without dropouts
Theorem 2: Unbiased mean estimate
...and 15 more
