DP-BREM: Differentially-Private and Byzantine-Robust Federated Learning with Client Momentum

Xiaolan Gu, Ming Li, Li Xiong

TL;DR

This paper targets simultaneous differential privacy (DP) and Byzantine robustness for cross-silo FL, based on the idea of learning from history. It proposes DP-BREM, which assumes a trusted server, and DP-BREM+, which achieves the same DP and robustness properties without one.

Abstract

Federated Learning (FL) allows multiple participating clients to train machine learning models collaboratively while keeping their datasets local and only exchanging the gradient or model updates with a coordinating server. Existing FL protocols are vulnerable to attacks that aim to compromise data privacy and/or model robustness. Recently proposed defenses focused on ensuring either privacy or robustness, but not both. In this paper, we focus on simultaneously achieving differential privacy (DP) and Byzantine robustness for cross-silo FL, based on the idea of learning from history. The robustness is achieved via client momentum, which averages the updates of each client over time, thus reducing the variance of the honest clients and exposing the small malicious perturbations of Byzantine clients that are undetectable in a single round but accumulate over time. In our initial solution DP-BREM, DP is achieved by adding noise to the aggregated momentum, and we account for the privacy cost from the momentum, which is different from the conventional DP-SGD that accounts for the privacy cost from the gradient. Since DP-BREM assumes a trusted server (who can obtain clients' local models or updates), we further develop the final solution called DP-BREM+, which achieves the same DP and robustness properties as DP-BREM without a trusted server by utilizing secure aggregation techniques, where DP noise is securely and jointly generated by the clients. Both theoretical analysis and experimental results demonstrate that our proposed protocols achieve better privacy-utility tradeoff and stronger Byzantine robustness than several baseline methods, under different DP budgets and attack settings.
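
The two mechanisms named in the abstract, client momentum and noise added to the aggregated momentum, can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names, the exponential-moving-average form of the momentum, the norm clipping around a reference point, and the free parameters `beta`, `clip_bound`, and `noise_std` are all assumptions made for illustration. In particular, `noise_std` is left uncalibrated here; a real DP guarantee would require setting it from a sensitivity analysis such as the one the paper provides.

```python
import numpy as np

def clip_norm(v, bound):
    """Scale v down so that its L2 norm is at most `bound`."""
    norm = np.linalg.norm(v)
    return v if norm <= bound else v * (bound / norm)

def client_momentum(prev_momentum, grad, beta):
    """Exponential moving average of one client's updates over rounds.

    Averaging over time reduces the variance of an honest client's signal,
    while a small but persistent malicious bias accumulates instead of
    averaging out. (Illustrative form, assumed for this sketch.)
    """
    return beta * prev_momentum + (1.0 - beta) * grad

def noisy_clipped_aggregate(momenta, center, clip_bound, noise_std, rng):
    """Aggregate client momenta with clipping, then add Gaussian noise.

    Each client's deviation from `center` (for example, the previous
    aggregate) is clipped to `clip_bound`, which bounds any single client's
    influence. Noise is added to the aggregated momentum, not to the
    individual gradients.
    """
    clipped = [clip_norm(m - center, clip_bound) for m in momenta]
    noise = rng.normal(0.0, noise_std, size=center.shape)
    return center + (np.sum(clipped, axis=0) + noise) / len(momenta)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim, n_clients, beta = 4, 5, 0.9
    momenta = [np.zeros(dim) for _ in range(n_clients)]
    aggregate = np.zeros(dim)
    for _ in range(20):
        # Hypothetical honest gradients: a shared direction plus noise.
        grads = [np.ones(dim) + rng.normal(0.0, 0.5, dim) for _ in range(n_clients)]
        momenta = [client_momentum(m, g, beta) for m, g in zip(momenta, grads)]
        aggregate = noisy_clipped_aggregate(momenta, aggregate, 1.0, 0.1, rng)
    print(aggregate)
```

Clipping around a reference point rather than around zero is one possible design choice: an outlying momentum can shift the aggregate by at most `clip_bound / n` per round, regardless of its magnitude.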

Paper Structure

This paper contains 35 sections, 20 theorems, 54 equations, 6 figures, 6 tables, 1 algorithm.

Key Result

Lemma 1

With some parameter tuning, the convergence rate of the Byzantine-robust algorithm LFH is asymptotically (ignoring constants and higher-order terms) of the order ... where $\ell(\cdot)$ is the loss function, $T$ is the total number of training iterations, $|\mathcal{B}|$ is the number of Byzantine clients, $n$ is the total number of clients, and $\rho$ is a parameter that quantifies the variance of hon...

Figures (6)

  • Figure 1: Illustration of our DP-BREM algorithm.
  • Figure 2: Illustration of DP-BREM+ (see the appendix on the detailed steps of secure aggregation for steps 1-7).
  • Figure 3: With fixed privacy budget $\epsilon$, varying the percentage of Byzantine clients $\delta_B$ for three datasets.
  • Figure 4: With fixed percentage of Byzantine clients $\delta_B$, varying privacy budget $\epsilon$ for three datasets.
  • Figure 5: MNIST: Varying record-level clipping bound $R$ for DP-BREM under different settings.
  • ...and 1 more figure

Theorems & Definitions (36)

  • Definition 1: $(\epsilon,\delta)$-DP [dwork2014algorithmic, dwork2006calibrating]
  • Lemma 1: Convergence Rate of LFH [karimireddy2021learning]
  • Example 1: Sensitivity Computation: Average vs. Median
  • Lemma 2: DP Sensitivity
  • proof
  • Theorem 1: Privacy Analysis
  • proof
  • Theorem 2: Aggregation Error
  • proof
  • Theorem 3: Convergence Rate of DP-BREM
  • ...and 26 more