Unraveling the Connections between Privacy and Certified Robustness in Federated Learning Against Poisoning Attacks

Chulin Xie, Yunhui Long, Pin-Yu Chen, Qinbin Li, Arash Nourian, Sanmi Koyejo, Bo Li

TL;DR

The paper examines how differential privacy (DP) in federated learning (FL) relates to certified robustness against poisoning attacks, and introduces two robustness criteria: certified prediction and certified attack inefficacy. It develops formal guarantees for both user-level and instance-level DPFL, showing that DP yields certifiable resilience against a bounded number of adversarial users or instances, with the strength of the certificate depending on the privacy parameters and data characteristics. The authors analyze the privacy of DPFL, provide improved privacy guarantees for FedSGD and FedAvg, and extend the certification to instance-level DPFL, supported by experiments on MNIST, CIFAR, and Sent140 under multiple poisoning attacks. Empirically, stronger privacy yields stronger certified attack inefficacy but does not necessarily yield stronger certified prediction, so the best certified prediction requires balancing privacy against utility loss. The work offers a framework for measuring and improving private and robust FL deployments, with practical guidance on selecting DP mechanisms and accounting approaches to reach a desired certification level.

Abstract

Federated learning (FL) provides an efficient paradigm to jointly train a global model leveraging data from distributed users. As local training data comes from different users who may not be trustworthy, several studies have shown that FL is vulnerable to poisoning attacks. Meanwhile, to protect the privacy of local users, FL is usually trained in a differentially private way (DPFL). Thus, in this paper, we ask: What are the underlying connections between differential privacy and certified robustness in FL against poisoning attacks? Can we leverage the innate privacy property of DPFL to provide certified robustness for FL? Can we further improve the privacy of FL to improve such robustness certification? We first investigate both user-level and instance-level privacy of FL and provide formal privacy analysis to achieve improved instance-level privacy. We then provide two robustness certification criteria: certified prediction and certified attack inefficacy for DPFL on both user and instance levels. Theoretically, we provide the certified robustness of DPFL based on both criteria given a bounded number of adversarial users or instances. Empirically, we conduct extensive experiments to verify our theories under a range of poisoning attacks on different datasets. We find that increasing the level of privacy protection in DPFL results in stronger certified attack inefficacy; however, it does not necessarily lead to a stronger certified prediction. Thus, achieving the optimal certified prediction requires a proper balance between privacy and utility loss.

Paper Structure

This paper contains 68 sections, 20 theorems, 27 equations, 15 figures, and 12 tables.

Key Result

Theorem 1

Suppose a randomized mechanism $\mathcal{M}$ satisfies user-level $(\epsilon, \delta)$-DP. For two user sets $B$ and $B^\prime$ that differ by one user, let $D$ and $D'$ be the corresponding training datasets. For a test input $x$, suppose $\mathbb{A}, \mathbb{B} \in [C]$ satisfy $\mathbb{A}=\arg \m
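
To make the certified-prediction criterion concrete, the sketch below applies the standard group-privacy bound to expected class confidences. It is a derivation under stated assumptions and may differ from the paper's exact statement: `p_a` and `p_b` stand for the expected confidences of the top class $\mathbb{A}$ and the runner-up $\mathbb{B}$ under the clean model, and the function names are hypothetical. With $k$ adversarial users, $(\epsilon, \delta)$-DP implies $(k\epsilon, \delta_k)$-DP with $\delta_k = \delta\,(e^{k\epsilon}-1)/(e^{\epsilon}-1)$, so the prediction provably stays $\mathbb{A}$ whenever $e^{-k\epsilon}(p_a - \delta_k) > e^{k\epsilon} p_b + \delta_k$. Rearranging gives $k < \frac{1}{2\epsilon}\log\frac{p_a(e^{\epsilon}-1)+\delta}{p_b(e^{\epsilon}-1)+\delta}$.

```python
import math


def certified_k_upper_bound(p_a: float, p_b: float, eps: float, delta: float) -> float:
    """Real-valued bound on k obtained by rearranging the group-DP inequality.

    p_a, p_b: assumed expected confidences of the top and runner-up classes.
    eps, delta: user-level DP parameters of the training mechanism.
    Any integer k strictly below the returned value is certified.
    """
    if not 0.0 <= p_b <= p_a <= 1.0:
        raise ValueError("expected 0 <= p_b <= p_a <= 1")
    if eps <= 0.0 or delta < 0.0:
        raise ValueError("expected eps > 0 and delta >= 0")
    c = delta / math.expm1(eps)  # delta / (e^eps - 1)
    return math.log((p_a + c) / (p_b + c)) / (2.0 * eps)


def is_certified(p_a: float, p_b: float, eps: float, delta: float, k: int) -> bool:
    """Check the group-DP inequality directly for a given number of adversaries k."""
    delta_k = delta * math.expm1(k * eps) / math.expm1(eps)
    lower_a = math.exp(-k * eps) * (p_a - delta_k)  # worst-case confidence of A
    upper_b = math.exp(k * eps) * p_b + delta_k     # best-case confidence of B
    return lower_a > upper_b


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the paper.
    print(certified_k_upper_bound(0.9, 0.05, eps=0.5, delta=1e-5))
    print(is_certified(0.9, 0.05, eps=0.5, delta=1e-5, k=2))
```

With these illustrative inputs the bound is about 2.89, so two adversarial users are tolerated and three are not. A smaller $\epsilon$ raises the bound for fixed confidences, but in practice more noise also narrows the gap between `p_a` and `p_b`, which is the privacy-utility tension the abstract describes.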

Figures (15)

  • Figure 1: Certified accuracy of UserDP-FedAvg under different privacy budgets $\epsilon$.
  • Figure 2: Certified accuracy of UserDP-FedAvg under different user-level DPFL algorithms with the same $\epsilon$.
  • Figure 3: Certified accuracy of UserDP-FedAvg under varying levels of data heterogeneity. We use Dirichlet distribution $\operatorname{Dir}(\alpha)$ to create FL heterogeneous data distributions, where smaller $\alpha$ indicates greater heterogeneity.
  • Figure 4: Certified attack inefficacy of UserDP-FedAvg given different $k$, under various attacks with different $\alpha$ or $\gamma$.
  • Figure 5: Certified accuracy (a) and certified attack inefficacy of InsDP-FedAvg on MNIST under different attacks given different $k$ (b-c) and different $\epsilon$ (d-e).
  • ...and 10 more figures
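
The caption of Figure 3 refers to Dirichlet-based heterogeneous splits. One common way to build such splits, which is not necessarily the authors' exact procedure, is to draw per-class user proportions from $\operatorname{Dir}(\alpha)$ and assign each class's samples accordingly. The function name `dirichlet_partition` and the synthetic labels below are assumptions for illustration.

```python
import numpy as np


def dirichlet_partition(labels, num_users, alpha, seed=0):
    """Split sample indices across users with label skew controlled by alpha.

    For each class, user proportions are drawn from Dir(alpha, ..., alpha);
    smaller alpha concentrates a class on fewer users (more heterogeneity).
    Returns a list of index lists, one per user.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    user_indices = [[] for _ in range(num_users)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        proportions = rng.dirichlet(alpha * np.ones(num_users))
        # Convert cumulative proportions into split points within this class.
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for user, part in enumerate(np.split(idx, cuts)):
            user_indices[user].extend(part.tolist())
    return user_indices


if __name__ == "__main__":
    # Synthetic labels: 10 classes with 100 samples each.
    toy_labels = np.repeat(np.arange(10), 100)
    parts = dirichlet_partition(toy_labels, num_users=5, alpha=0.5)
    print([len(p) for p in parts])
```

Every sample is assigned to exactly one user, so the split is a partition of the dataset; only the class mix per user changes with $\alpha$.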

Theorems & Definitions (30)

  • Definition 1: $(\epsilon,\delta)$-DP (Dwork et al., 2006)
  • Definition 2: Group DP
  • Definition 3: User-level $(\epsilon,\delta)$-DP
  • Theorem 1: Certified Prediction under One Adversarial User
  • Theorem 2: Upper Bound of $k$ for Certified Prediction
  • Example 1
  • Example 2
  • Theorem 3: Attack Inefficacy with $k$ Attackers
  • Corollary 1: Lower Bound of $k$ Given $\tau$, extended from Ma et al. (2019)
  • Definition 4: Instance-level $(\epsilon,\delta)$-DP
  • ...and 20 more
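
For reference, Definitions 1 and 2 are usually stated as follows. These are the standard textbook forms, written in the notation of Theorem 1; the paper's own wording may differ. A randomized mechanism $\mathcal{M}$ satisfies $(\epsilon,\delta)$-DP if, for every pair of adjacent datasets $D$, $D'$ and every set of outputs $S$,

$$\Pr[\mathcal{M}(D) \in S] \le e^{\epsilon}\,\Pr[\mathcal{M}(D') \in S] + \delta .$$

Group DP extends this to datasets that differ in $k$ records (or $k$ users, in the user-level setting):

$$\Pr[\mathcal{M}(D) \in S] \le e^{k\epsilon}\,\Pr[\mathcal{M}(D') \in S] + \frac{1 - e^{k\epsilon}}{1 - e^{\epsilon}}\,\delta .$$

The second bound follows by applying the first one $k$ times along a chain of adjacent datasets, which is why the additive term is a geometric sum in $e^{\epsilon}$.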