DP-BREM: Differentially-Private and Byzantine-Robust Federated Learning with Client Momentum

Xiaolan Gu; Ming Li; Li Xiong

TL;DR

This paper aims to achieve differential privacy (DP) and Byzantine robustness simultaneously in cross-silo federated learning (FL), building on the idea of learning from history. Its initial solution, DP-BREM, assumes a trusted server; the final solution, DP-BREM+, achieves the same DP and robustness properties without a trusted server.

Abstract

Federated Learning (FL) allows multiple participating clients to train machine learning models collaboratively while keeping their datasets local and only exchanging the gradient or model updates with a coordinating server. Existing FL protocols are vulnerable to attacks that aim to compromise data privacy and/or model robustness. Recently proposed defenses focused on ensuring either privacy or robustness, but not both. In this paper, we focus on simultaneously achieving differential privacy (DP) and Byzantine robustness for cross-silo FL, based on the idea of learning from history. The robustness is achieved via client momentum, which averages the updates of each client over time, thus reducing the variance of the honest clients and exposing the small malicious perturbations of Byzantine clients that are undetectable in a single round but accumulate over time. In our initial solution DP-BREM, DP is achieved by adding noise to the aggregated momentum, and we account for the privacy cost from the momentum, which is different from the conventional DP-SGD that accounts for the privacy cost from the gradient. Since DP-BREM assumes a trusted server (who can obtain clients' local models or updates), we further develop the final solution called DP-BREM+, which achieves the same DP and robustness properties as DP-BREM without a trusted server by utilizing secure aggregation techniques, where DP noise is securely and jointly generated by the clients. Both theoretical analysis and experimental results demonstrate that our proposed protocols achieve better privacy-utility tradeoff and stronger Byzantine robustness than several baseline methods, under different DP budgets and attack settings.
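
The momentum mechanism described in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's algorithm: it assumes an exponential moving average as the form of client momentum, a plain mean as the aggregator, and Gaussian noise with an arbitrary scale `sigma` that is not calibrated to any sensitivity or privacy budget. All names (`client_momentum`, `simulate`, `beta`, `bias`, `sigma`) are hypothetical. It only shows two points from the abstract: averaging over time shrinks the spread among honest clients, so a small persistent Byzantine bias stands out more, and noise is added to the aggregated momentum rather than to individual gradients.

```python
import numpy as np


def client_momentum(prev_m, grad, beta):
    """Exponential moving average of a client's updates (assumed form of momentum)."""
    return beta * prev_m + (1.0 - beta) * grad


def simulate(n_honest=20, dim=500, rounds=100, beta=0.9, bias=0.3, sigma=0.05, seed=0):
    """Toy comparison of raw gradients vs. client momentum (illustrative only)."""
    rng = np.random.default_rng(seed)
    true_grad = np.ones(dim)
    m_honest = np.zeros((n_honest, dim))
    m_byz = np.zeros(dim)

    for _ in range(rounds):
        # Honest clients: true gradient plus unit-variance noise.
        grads = true_grad + rng.normal(0.0, 1.0, size=(n_honest, dim))
        # Byzantine client: same noise level plus a small persistent bias.
        byz_grad = true_grad + bias + rng.normal(0.0, 1.0, size=dim)
        m_honest = client_momentum(m_honest, grads, beta)
        m_byz = client_momentum(m_byz, byz_grad, beta)

    def spread(honest):
        # Per-coordinate standard deviation across honest clients, averaged.
        return float(honest.std(axis=0).mean())

    def score(outlier, honest):
        # Per-coordinate RMS deviation from the honest mean, in units of spread.
        dev = np.linalg.norm(outlier - honest.mean(axis=0)) / np.sqrt(dim)
        return float(dev / spread(honest))

    # Noise is added once, to the aggregated momentum of all clients.
    # sigma is arbitrary here; nothing is clipped, so this gives no DP guarantee.
    aggregate = np.vstack([m_honest, m_byz[None, :]]).mean(axis=0)
    noisy_aggregate = aggregate + rng.normal(0.0, sigma, size=dim)

    return {
        "raw_spread": spread(grads),            # last-round raw gradients
        "momentum_spread": spread(m_honest),
        "raw_score": score(byz_grad, grads),
        "momentum_score": score(m_byz, m_honest),
        "noisy_aggregate": noisy_aggregate,
    }


if __name__ == "__main__":
    out = simulate()
    for key in ("raw_spread", "momentum_spread", "raw_score", "momentum_score"):
        print(key, round(out[key], 3))
```

In this toy setting the honest spread under momentum is a fraction of the raw spread, while the Byzantine bias is preserved, so its deviation measured in units of spread grows. A real protocol would also need a robust aggregator and a clipping step that bounds sensitivity before any noise scale can be tied to a privacy budget.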

Paper Structure (35 sections, 20 theorems, 54 equations, 6 figures, 6 tables, 1 algorithm)

Introduction
Preliminaries
  Differential Privacy (DP)
  Federated Learning (FL) with DP
  Byzantine Attacks and Defenses
Problem Statement and Motivation
  Problem Statement
  Challenges and Baseline
DP-BREM
  Algorithm Design
  Privacy Analysis
  Convergence Analysis
DP-BREM+ with Secure Aggregation
  Challenges
  Design of DP-BREM+
...and 20 more sections

Key Result

Lemma 1

With some parameter tuning, the convergence rate of the Byzantine-robust algorithm LFH is, asymptotically (ignoring constants and higher-order terms), of the order [expression missing in the source], where $\ell(\cdot)$ is the loss function, $T$ is the total number of training iterations, $|\mathcal{B}|$ is the number of Byzantine clients, $n$ is the total number of clients, and $\rho$ is a parameter that quantifies the variance of hon…

Figures (6)

Figure 1: Illustration of our DP-BREM algorithm.
Figure 2: Illustration of DP-BREM+ (see the appendix on the detailed steps of secure aggregation for steps 1-7).
Figure 3: With fixed privacy budget $\epsilon$, varying the percentage of Byzantine clients $\delta_B$ for three datasets.
Figure 4: With fixed percentage of Byzantine clients $\delta_B$, varying privacy budget $\epsilon$ for three datasets.
Figure 5: MNIST: Varying record-level clipping bound $R$ for DP-BREM under different settings.
...and 1 more figure

Theorems & Definitions (36)

Definition 1: $(\epsilon,\delta)$-DP [dwork2014algorithmic, dwork2006calibrating]
Lemma 1: Convergence Rate of LFH [karimireddy2021learning]
Example 1: Sensitivity Computation: Average vs. Median
Lemma 2: DP Sensitivity
proof
Theorem 1: Privacy Analysis
proof
Theorem 2: Aggregation Error
proof
Theorem 3: Convergence Rate of DP-BREM
...and 26 more
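
For reference, the standard statement of $(\epsilon,\delta)$-DP named in Definition 1 is given below. The listing above preserves only the title, so the notation ($\mathcal{M}$, $D$, $D'$, $S$) and the generic notion of neighboring datasets are assumptions here and may differ from the paper's exact formulation.

```latex
% Standard (epsilon, delta)-differential privacy; notation is assumed, not taken from the paper.
A randomized mechanism $\mathcal{M}$ satisfies $(\epsilon,\delta)$-DP if, for any two
neighboring datasets $D$ and $D'$ and any set of outputs $S \subseteq \mathrm{Range}(\mathcal{M})$,
\[
  \Pr[\mathcal{M}(D) \in S] \;\le\; e^{\epsilon}\, \Pr[\mathcal{M}(D') \in S] + \delta .
\]
```

A smaller $\epsilon$ and $\delta$ mean the output distribution changes less when one dataset is swapped for its neighbor, which is the sense in which the privacy budget $\epsilon$ is varied in Figures 3 and 4.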
