Unraveling the Connections between Privacy and Certified Robustness in Federated Learning Against Poisoning Attacks

Chulin Xie; Yunhui Long; Pin-Yu Chen; Qinbin Li; Arash Nourian; Sanmi Koyejo; Bo Li

TL;DR

The paper examines how differential privacy (DP) in federated learning (FL) interacts with certified robustness against poisoning attacks, and introduces two robustness criteria: certified prediction and certified attack inefficacy. It develops formal guarantees for both user-level and instance-level differentially private FL (DPFL), showing that DP yields certifiable resilience against a bounded number of adversarial users or instances, with the strength of the certificate depending on the privacy parameters and data characteristics. The authors analyze the privacy of DPFL, provide improved privacy guarantees for FedSGD and FedAvg, and extend certification to instance-level DPFL, supported by experiments on MNIST, CIFAR, and Sent140 under multiple poisoning attacks. Empirically, stronger privacy protection strengthens certified attack inefficacy but does not necessarily strengthen certified prediction, so the best certified prediction requires balancing privacy against utility loss. The work offers a principled framework for measuring and improving private and robust FL deployments, with practical guidance on selecting DP mechanisms and privacy accounting approaches to reach a desired certification level.

Abstract

Federated learning (FL) provides an efficient paradigm to jointly train a global model leveraging data from distributed users. As local training data comes from different users who may not be trustworthy, several studies have shown that FL is vulnerable to poisoning attacks. Meanwhile, to protect the privacy of local users, FL is usually trained in a differentially private way (DPFL). Thus, in this paper, we ask: What are the underlying connections between differential privacy and certified robustness in FL against poisoning attacks? Can we leverage the innate privacy property of DPFL to provide certified robustness for FL? Can we further improve the privacy of FL to improve such robustness certification? We first investigate both user-level and instance-level privacy of FL and provide formal privacy analysis to achieve improved instance-level privacy. We then provide two robustness certification criteria: certified prediction and certified attack inefficacy for DPFL on both user and instance levels. Theoretically, we provide the certified robustness of DPFL based on both criteria given a bounded number of adversarial users or instances. Empirically, we conduct extensive experiments to verify our theories under a range of poisoning attacks on different datasets. We find that increasing the level of privacy protection in DPFL results in stronger certified attack inefficacy; however, it does not necessarily lead to a stronger certified prediction. Thus, achieving the optimal certified prediction requires a proper balance between privacy and utility loss.
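User-level DPFL of the kind the abstract refers to is commonly realized by clipping each user's model update and adding Gaussian noise before averaging. The sketch below illustrates only that general clip-and-noise pattern. It is not the paper's algorithm, and the names `clip_update`, `dp_aggregate`, `clip_norm`, and `noise_multiplier` are hypothetical choices made for illustration.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale an update so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    if norm <= clip_norm or norm == 0.0:
        return list(update)
    scale = clip_norm / norm
    return [v * scale for v in update]


def dp_aggregate(user_updates, clip_norm, noise_multiplier, rng):
    """Sum clipped user updates, add Gaussian noise, then average.

    Assumption: the noise standard deviation on the summed update is
    noise_multiplier * clip_norm, a common convention and not a value
    taken from the paper.
    """
    dim = len(user_updates[0])
    clipped = [clip_update(u, clip_norm) for u in user_updates]
    total = [sum(u[i] for u in clipped) for i in range(dim)]
    sigma = noise_multiplier * clip_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [v / len(user_updates) for v in noisy]


rng = random.Random(0)
# The second update is an outlier, such as a poisoned user might send.
updates = [[0.5, -0.2], [30.0, 40.0], [0.1, 0.1]]
print(dp_aggregate(updates, clip_norm=1.0, noise_multiplier=1.0, rng=rng))
```

Clipping bounds how far any single user can move the aggregate, which is the same sensitivity bound that the Gaussian noise is calibrated to. This gives an informal picture of why a privacy mechanism can also limit the influence of a bounded number of poisoned users.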

Paper Structure (68 sections, 20 theorems, 27 equations, 15 figures, 12 tables)

Introduction
Related work
Differentially Private Federated Learning
Certified Robustness against Evasion Attacks
Certified Robustness against Poisoning Attacks
Preliminaries
Differential Privacy.
Federated Learning.
User-level DP and Certified Robustness
User-level DP and Background
Certified Robustness of User-level DPFL
Threat Model.
Certified Prediction
Certified Attack Inefficacy
Instance-level DP and Certified Robustness
...and 53 more sections

Key Result

Theorem 1

Suppose a randomized mechanism $\mathcal{M}$ satisfies user-level $(\epsilon, \delta)$-DP. For two user sets $B$ and $B^\prime$ that differ by one user, let $D$ and $D'$ be the corresponding training datasets. For a test input $x$, suppose $\mathbb{A}, \mathbb{B} \in [C]$ satisfy $\mathbb{A}=\arg \m
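The theorem statement above is cut off in this excerpt, so the following is not a restatement of it. It is a minimal sketch of the kind of check a DP-based prediction certificate can reduce to, derived only from the standard DP inequality for expectations of $[0,1]$-valued functions. With `f_a` and `f_b` denoting the expected confidences of the top and runner-up classes (hypothetical names), the inequalities $f_a' \ge e^{-\epsilon}(f_a - \delta)$ and $f_b' \le e^{\epsilon} f_b + \delta$ imply that the top class is preserved whenever $f_a > e^{2\epsilon} f_b + (1 + e^{\epsilon})\delta$. Extending to $k$ changed users uses the standard group-privacy conversion, which is also an assumption here.

```python
import math


def certified_one_user(f_a, f_b, eps, delta):
    """Sufficient condition for the top class to survive one changed user.

    Derived from the (eps, delta)-DP inequality; not copied from the paper.
    """
    return f_a > math.exp(2 * eps) * f_b + (1 + math.exp(eps)) * delta


def group_dp(eps, delta, k):
    """Standard group-privacy conversion for inputs differing in k users."""
    delta_k = delta * sum(math.exp(i * eps) for i in range(k))
    return k * eps, delta_k


def max_certified_k(f_a, f_b, eps, delta, k_limit=1000):
    """Largest k (up to k_limit) for which the sufficient condition holds."""
    k = 0
    while k < k_limit:
        eps_k, delta_k = group_dp(eps, delta, k + 1)
        if not certified_one_user(f_a, f_b, eps_k, delta_k):
            break
        k += 1
    return k


# Illustrative numbers only, not results from the paper.
print(max_certified_k(f_a=0.9, f_b=0.05, eps=0.5, delta=1e-5))
```

The right-hand side of the condition grows with $k$, so the loop can stop at the first failure. A larger $\epsilon$ shrinks the certified $k$ for fixed confidences, while a smaller $\epsilon$ typically costs utility and narrows the gap between `f_a` and `f_b`. This is the balance between privacy and utility loss that the abstract describes for certified prediction.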

Figures (15)

Figure 1: Certified accuracy of UserDP-FedAvg under different privacy budgets $\epsilon$.
Figure 2: Certified accuracy of UserDP-FedAvg under different user-level DPFL algorithms with the same $\epsilon$.
Figure 3: Certified accuracy of UserDP-FedAvg under varying levels of data heterogeneity. We use Dirichlet distribution $\operatorname{Dir}(\alpha)$ to create FL heterogeneous data distributions, where smaller $\alpha$ indicates greater heterogeneity.
Figure 4: Certified attack inefficacy of UserDP-FedAvg given different $k$, under various attacks with different $\alpha$ or $\gamma$.
Figure 5: Certified accuracy (a) and certified attack inefficacy of InsDP-FedAvg on MNIST under different attacks given different $k$ (b-c) and different $\epsilon$ (d-e).
...and 10 more figures

Theorems & Definitions (30)

Definition 1: $(\epsilon,\delta)$-DP (Dwork et al., 2006)
Definition 2: Group DP
Definition 3: User-level $(\epsilon,\delta)$-DP
Theorem 1: Certified Prediction under One Adversarial User
Theorem 2: Upper Bound of $k$ for Certified Prediction
Example 1
Example 2
Theorem 3: Attack Inefficacy with $k$ Attackers
Corollary 1: Lower Bound of $k$ Given $\tau$, extended from Ma et al. (2019)
Definition 4: Instance-level $(\epsilon,\delta)$-DP
...and 20 more
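
For reference, the first two definitions in this list have well-known standard forms, written out below. The paper's exact wording is not included in this excerpt, so these should be read as the textbook statements and not as quotations. User-level DP uses the same inequality with adjacency defined over user sets that differ by one user.

```latex
% (\epsilon,\delta)-DP: for all adjacent datasets D, D' and all measurable S
\Pr[\mathcal{M}(D) \in S] \le e^{\epsilon} \, \Pr[\mathcal{M}(D') \in S] + \delta

% Group DP: if D and D' differ in at most k entries, the same mechanism satisfies
\Pr[\mathcal{M}(D) \in S] \le e^{k\epsilon} \, \Pr[\mathcal{M}(D') \in S]
    + \frac{e^{k\epsilon} - 1}{e^{\epsilon} - 1} \, \delta
```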
