Differentially Private Federated Learning without Noise Addition: When is it Possible?

Jiang Zhang, Konstantinos Psounis, Salman Avestimehr

TL;DR

This work formally identifies a necessary condition under which secure aggregation (SA) can provide differential privacy (DP) without additional noise. It also proves that when the randomness inside the aggregated model update is Gaussian with a non-singular covariance matrix, SA provides DP guarantees, with the privacy level $\epsilon$ bounded by the reciprocal of the minimum eigenvalue of the covariance matrix.

Abstract

Federated Learning (FL) with Secure Aggregation (SA) has gained significant attention as a privacy-preserving framework for training machine learning models while preventing the server from learning information about users' data from their individual encrypted model updates. Recent research has extended the privacy guarantees of FL with SA by bounding the information leakage through the aggregate model over multiple training rounds, leveraging the "noise" from other users' updates. However, the privacy metric used in that work (mutual information) measures average-case privacy leakage and provides no privacy guarantees for worst-case scenarios. To address this, we study the conditions under which FL with SA can provide worst-case differential privacy (DP) guarantees. Specifically, we formally identify a necessary condition under which SA can provide DP without additional noise. We then prove that when the randomness inside the aggregated model update is Gaussian with a non-singular covariance matrix, SA can provide DP guarantees with the privacy level $\epsilon$ bounded by the reciprocal of the minimum eigenvalue of the covariance matrix. However, we further demonstrate that in practice these conditions are unlikely to hold, and hence additional noise in the model updates is still required for SA in FL to achieve DP. Lastly, we discuss the potential of leveraging the inherent randomness inside the aggregated model update to reduce the amount of additional noise required for a DP guarantee.
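
To see why the minimum eigenvalue of the covariance matrix controls the privacy level, the following worked bound may help. It is a simplified sketch and not the paper's theorem: it assumes the aggregated update is $x + Z$ with $Z \sim \mathcal{N}(m, \Sigma)$ independent of the differing user's update, and that neighboring inputs change only that update from $x$ to $x'$. The symbols $m$, $x$, $x'$ and $\Sigma$ are introduced here for illustration.

```latex
% Renyi divergence of order \alpha > 1 between two Gaussians that share
% the same positive-definite covariance \Sigma (illustrative setting).
\begin{align*}
D_\alpha\big(\mathcal{N}(m + x, \Sigma)\,\|\,\mathcal{N}(m + x', \Sigma)\big)
  &= \frac{\alpha}{2}\,(x - x')^\top \Sigma^{-1} (x - x') \\
  &\le \frac{\alpha\,\|x - x'\|_2^2}{2\,\lambda_{\min}(\Sigma)} .
\end{align*}
```

The inequality uses $\lambda_{\max}(\Sigma^{-1}) = 1/\lambda_{\min}(\Sigma)$, which holds for positive-definite $\Sigma$. Under these assumptions the divergence scales with $1/\lambda_{\min}(\Sigma)$ and the bound becomes vacuous as $\lambda_{\min}(\Sigma) \to 0$, which is consistent with the requirement that the covariance matrix be non-singular.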

Paper Structure

This paper contains 22 sections, 6 theorems, 21 equations, 5 figures, 1 table.

Key Result

Lemma 1

A mechanism that satisfies $(\alpha, \epsilon)$-RDP with $\alpha > 1$ also satisfies $(\epsilon + \frac{\log(1/\delta)}{\alpha-1}, \delta)$-DP for any $0 < \delta < 1$.
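
The conversion in Lemma 1 is simple enough to evaluate directly. The sketch below is a minimal illustration, not code from the paper: the function names `rdp_to_dp` and `best_dp_epsilon`, and the example RDP curve, are hypothetical. Since Lemma 1 holds for every order $\alpha$ at which an RDP bound is available, one can take the smallest resulting $\epsilon$ over a set of orders.

```python
import math


def rdp_to_dp(alpha, rdp_epsilon, delta):
    """Apply Lemma 1: (alpha, rdp_epsilon)-RDP implies (eps, delta)-DP.

    Returns eps = rdp_epsilon + log(1/delta) / (alpha - 1).
    """
    if alpha <= 1:
        raise ValueError("alpha must be greater than 1")
    if not 0 < delta < 1:
        raise ValueError("delta must lie in (0, 1)")
    return rdp_epsilon + math.log(1.0 / delta) / (alpha - 1.0)


def best_dp_epsilon(rdp_curve, delta):
    """Smallest DP epsilon over a list of (alpha, rdp_epsilon) pairs."""
    return min(rdp_to_dp(a, e, delta) for a, e in rdp_curve)


# Hypothetical RDP curve of the form eps(alpha) = alpha * rho,
# evaluated at integer orders 2..64. The value of rho is made up.
rho = 0.05
curve = [(float(a), a * rho) for a in range(2, 65)]
eps = best_dp_epsilon(curve, delta=1e-4)
print(f"(eps, delta)-DP with eps = {eps:.3f}, delta = 1e-4")
```

Small orders pay a large $\log(1/\delta)/(\alpha-1)$ penalty while large orders pay a large RDP term, so the minimum typically sits at an intermediate order.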

Figures (5)

  • Figure 1: Federated learning with SA and DP guarantees.
  • Figure 2: System model for FL with SA. The input of this system is the users' local datasets ($\{D_i\}_{i=1}^{N}$), and the output is the aggregated model update ($\sum_{i=1}^{N}x_i^{(t)}$), which is a random vector because each user samples data batches when computing local gradients. The server tries to infer user $i$'s local dataset ($D_i$) by observing $\sum_{i=1}^{N}x_i^{(t)}$.
  • Figure 3: Heatmap of the absolute values of sampled updates from users $1$, $2$ and $3$ in the counterexample. $x_4$ and $x_4'$ can be distinguished even after adding the aggregated noise from $\sum_{i=1}^3 x_i$.
  • Figure 4: Comparison of WF noise and isotropic noise.
  • Figure 5: Comparison of different DP mechanisms on the MNIST dataset, with 50 users participating in FL. The number of training epochs is 100, the mini-batch size $B$ is 32, the clipping value $C$ is 10, and $\delta=10^{-4}$. We report the cumulative privacy loss across all training epochs using the composition theorem in [kairouz2015composition].
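
The failure mode described in the Figure 3 caption can be reproduced with a toy construction. The sketch below is not the paper's counterexample. It assumes, purely for illustration, that users 1 to 3 only ever produce updates whose last coordinate is zero, so the "noise" they contribute has a singular covariance. All names and values are hypothetical.

```python
import random


def aggregate(updates):
    """Coordinate-wise sum of a list of equal-length update vectors."""
    return [sum(coords) for coords in zip(*updates)]


def sample_others(rng):
    """Updates of users 1-3: random in the first two coordinates,
    always exactly zero in the last one (assumed for this toy example)."""
    return [[rng.gauss(0, 1), rng.gauss(0, 1), 0.0] for _ in range(3)]


# Two candidate updates for user 4 that differ only in the last coordinate.
X4 = [0.0, 0.0, 1.0]
X4_ALT = [0.0, 0.0, -1.0]


def server_guess(agg):
    """The server reads the coordinate that the other users never touch."""
    return "x4" if agg[2] > 0 else "x4_alt"


def run_trials(n_trials, seed=0):
    """Fraction of trials in which the server identifies user 4's update."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_trials):
        truth = rng.choice(["x4", "x4_alt"])
        secret = X4 if truth == "x4" else X4_ALT
        agg = aggregate(sample_others(rng) + [secret])
        correct += server_guess(agg) == truth
    return correct / n_trials


accuracy = run_trials(1000)
print(f"server accuracy: {accuracy}")
```

Because the other users add no randomness along the last coordinate, the aggregate reveals user 4's choice in every trial, so no finite $\epsilon$ can hold in this toy setting.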

Theorems & Definitions (9)

  • Definition 1: DP [dwork2014algorithmic]
  • Definition 2: Rényi Divergence [gil2013renyi]
  • Definition 3: $(\alpha, \epsilon)$-RDP [mironov2017renyi]
  • Lemma 1: From RDP to $(\epsilon,\delta)$-DP [mironov2017renyi]
  • Theorem 1: A necessary condition for DP guarantee
  • Lemma 2: Bounded maximal singular value
  • Theorem 2
  • Theorem 3
  • Theorem 4: DP guarantees of Gaussian sampling noise + WF-NA algorithm