Noise-Aware Algorithm for Heterogeneous Differentially Private Federated Learning

Saber Malekmohammadi; Yaoliang Yu; Yang Cao

TL;DR

Robust-HDP efficiently estimates the true noise level in clients' model updates and considerably reduces the noise level in the aggregated model updates. This improves utility and convergence speed while remaining robust to clients that may maliciously send falsified privacy parameters to the server.

Abstract

High utility and rigorous data privacy are among the main goals of a federated learning (FL) system, which learns a model from data distributed among clients. Differential privacy has been used in FL (DPFL) to achieve the latter. Clients' privacy requirements are often heterogeneous, yet existing DPFL works either assume uniform privacy requirements across clients or are not applicable when the server is not fully trusted (our setting). Furthermore, clients often differ in batch size and/or dataset size, which, as we show, causes extra variation in the DP noise level across clients' model updates. With these sources of heterogeneity, straightforward aggregation strategies, e.g., assigning clients aggregation weights proportional to their privacy parameters, lead to lower utility. We propose Robust-HDP, which efficiently estimates the true noise level in clients' model updates and considerably reduces the noise level in the aggregated model updates. Robust-HDP improves utility and convergence speed while being robust to clients that may maliciously send falsified privacy parameters to the server. Extensive experimental results on multiple datasets and our theoretical analysis confirm the effectiveness of Robust-HDP. Our code is publicly available.
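To illustrate why weights proportional to privacy parameters can be suboptimal, the following is a minimal sketch, not the Robust-HDP algorithm itself. It assumes each client's update carries independent zero-mean noise with a known variance, so the aggregate noise variance under normalized weights $w_i$ is $\sum_i w_i^2 \sigma_i^2$, which is minimized by weights proportional to $1/\sigma_i^2$. All numbers and function names are hypothetical; in the paper's setting the noise levels are not known to the server and have to be estimated.

```python
# Hypothetical illustration: compare aggregation weightings by the noise
# variance they leave in the aggregated update. Assumes independent,
# zero-mean noise with KNOWN per-client variances (an assumption made
# only for this sketch).

def normalize(weights):
    """Rescale non-negative weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]

def aggregate_noise_variance(weights, noise_vars):
    """Variance of sum_i w_i * noise_i for independent noise terms."""
    w = normalize(weights)
    return sum(wi ** 2 * vi for wi, vi in zip(w, noise_vars))

def inverse_variance_weights(noise_vars):
    """Weights proportional to 1 / sigma_i^2 (minimum-variance choice)."""
    return normalize([1.0 / v for v in noise_vars])

# Hypothetical privacy parameters and noise variances for four clients.
# The variances are deliberately not monotone in epsilon, mimicking the
# extra variation caused by differing batch or dataset sizes.
epsilons = [0.5, 1.0, 2.0, 4.0]
noise_vars = [4.0, 0.5, 3.0, 0.25]

var_uniform = aggregate_noise_variance([1.0] * len(epsilons), noise_vars)
var_eps = aggregate_noise_variance(epsilons, noise_vars)
var_inv = aggregate_noise_variance(inverse_variance_weights(noise_vars), noise_vars)

print("uniform weights:          ", var_uniform)
print("epsilon-proportional:     ", var_eps)
print("inverse-variance weights: ", var_inv)
```

In this toy setting the inverse-variance weights give the smallest aggregate noise variance, equal to $1 / \sum_i (1/\sigma_i^2)$. Weights based only on $\epsilon_i$ ignore the part of the noise variation that comes from other sources.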

Paper Structure (54 sections, 7 theorems, 72 equations, 13 figures, 22 tables, 3 algorithms)

Introduction
Related work
Differential privacy.
Heterogeneous DPFL.
The Robust-HDP algorithm for heterogeneous DPFL
Noise level in clients' DP batch gradients
1. Effective clipping threshold for all samples:
2. Ineffective clipping threshold for all samples:
Noise level in clients' DP model updates
Optimum aggregation strategy
Description of Robust-HDP algorithm
Reliability of Robust-HDP
Scalability of Robust-HDP with the number of model parameters Lg
Privacy analysis of Robust-HDP
The optimization side of Robust-HDP
...and 39 more sections

Key Result

Theorem 3.1

For each client $i$, there exist constants $c_1$ and $c_2$ such that given its number of steps $E \cdot E_i$, for any $\epsilon < c_1 q_i^2 E \cdot E_i$, the output model of Robust-HDP satisfies $(\epsilon_i, \delta_i)$-DP with respect to $\mathcal{D}_i$ for any $\delta_i>0$ if $z_i > c_2 \frac

Figures (13)

Figure 1: Security model in heterogeneous DPFL, where client $i$ has local train data $\mathcal{D}_i$ and privacy parameters $(\epsilon_i, \delta_i)$, and does not trust any external parties.
Figure 2: Left: 3D plot of the noise variance $\sigma_i^2$ of a client $i$ (\ref{eq:sigma_i^2} with $K_i=1, N_i=2400, \eta_l = 0.01, c=3, p=28939$) as a function of $b_i$ and the privacy budget $\epsilon_i$. Right: the noise variances $\{\sigma_i^2\}_{i=1}^n$ in a DPFL system with $n=20$ clients, where $\{(\epsilon_i, b_i)\}_{i=1}^n$ are randomly selected for each client. It clearly shows an approximately sparse pattern (14 of the clients have much smaller noise variance than the other 6). Each bar in the right figure corresponds to a point in the left figure.
Figure 3: Comparison of average test accuracy between the studied algorithms. See Tables \ref{table:mnist} to \ref{table:cifar100} in the appendix for detailed results.
Figure 4: Convergence speed comparison on MNIST and Dist6. Minimum $\epsilon$ algorithm diverged in 1 out of 3 trials.
Figure 5: Performance comparison on MNIST. Left: effect of clients' desired privacy on utility (detailed results in \ref{table:ablation_privacy_preference}). Middle: effect of the number of existing clients (privacy parameters of clients are sampled from Dist6) on utility (detailed results in \ref{table:ablation_num_clients}). Right: robustness of Robust-HDP when a random client (client 12, with a moderate $\epsilon$ value of $0.95$) sends a falsified version of its $\epsilon$ to the server for aggregation (privacy parameters of other clients are sampled from Dist5). WeiAvg and PFA are much more vulnerable to this falsification.
...and 8 more figures

Theorems & Definitions (13)

Definition 2.1: ($\epsilon,\delta$)-DP (Dwork et al., 2006)
Theorem 3.1
Theorem 3.2: Robust-HDP
Lemma 4.3: Relaxed triangle inequality
proof
Lemma 4.4
proof
Theorem 5.1
proof
Theorem 5.1: Robust-HDP
...and 3 more