On the Tradeoff between Privacy Preservation and Byzantine-Robustness in Decentralized Learning

Haoxiang Ye; Heng Zhu; Qing Ling

TL;DR

This paper tackles the joint problem of privacy preservation and Byzantine-robustness in decentralized learning by proposing a generic SGD framework that injects Gaussian noise for differential privacy and employs robust aggregation rules to mitigate Byzantine attacks. It formalizes the impact of privacy noise on learning through a contraction-based analysis using a virtual mixing matrix $W$ and a contraction constant $\rho$, revealing a fundamental tradeoff: higher noise improves privacy but worsens learning accuracy under Byzantine adversaries. The authors unify and quantify the mixing abilities of state-of-the-art robust rules (Trimmed Mean, SCC, IOS), showing that rules with smaller $\rho$ and near-doubly stochastic $W$ yield more favorable privacy-robustness tradeoffs, with IOS often outperforming others in practice. Theoretical results are complemented by extensive experiments on MNIST and CIFAR10, including attacks and large networks, confirming the key insights and providing design guidelines for robust, privacy-preserving decentralized learning systems.

Abstract

This paper jointly considers privacy preservation and Byzantine-robustness in decentralized learning. In a decentralized network, honest-but-curious agents faithfully follow the prescribed algorithm but attempt to infer their neighbors' private data from messages received during the learning process, while dishonest-and-Byzantine agents disobey the prescribed algorithm and deliberately disseminate wrong messages to their neighbors so as to bias the learning process. For this novel setting, we investigate a generic privacy-preserving and Byzantine-robust decentralized stochastic gradient descent (SGD) framework, in which Gaussian noise is injected to preserve privacy and robust aggregation rules are adopted to counteract Byzantine attacks. We analyze its learning error and privacy guarantee, discovering an essential tradeoff between privacy preservation and Byzantine-robustness in decentralized learning: the learning error caused by defending against Byzantine attacks is exacerbated by the Gaussian noise added to preserve privacy. For a class of state-of-the-art robust aggregation rules, we give a unified analysis of their "mixing abilities". Building upon this analysis, we reveal how the "mixing abilities" affect the tradeoff between privacy preservation and Byzantine-robustness. The theoretical results provide guidelines for achieving a favorable tradeoff with proper design of robust aggregation rules. Numerical experiments corroborate our theoretical findings.
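
The framework described in the abstract can be illustrated with a minimal sketch, assuming each agent takes a local stochastic gradient step, perturbs the result with Gaussian noise before sharing it, and then combines its own and its neighbors' messages with a coordinate-wise trimmed mean. The function names, the ordering of the steps, and the choice of trimmed mean as the robust rule are assumptions made for illustration; they are not taken from the paper's algorithm listing.

```python
import numpy as np


def trimmed_mean(messages, num_trim):
    """Coordinate-wise trimmed mean over a list of equally shaped vectors.

    For every coordinate, the num_trim largest and num_trim smallest values
    are discarded and the remaining values are averaged.
    """
    if len(messages) <= 2 * num_trim:
        raise ValueError("need more than 2 * num_trim messages")
    stacked = np.sort(np.stack(messages), axis=0)  # sort each coordinate independently
    kept = stacked[num_trim:len(messages) - num_trim]
    return kept.mean(axis=0)


def local_update(x, stochastic_grad, alpha, sigma, rng):
    """Gradient step followed by Gaussian noise injection (assumed ordering)."""
    noise = rng.normal(0.0, sigma, size=x.shape)
    return x - alpha * stochastic_grad + noise


def aggregate(own_msg, neighbor_msgs, num_trim):
    """Robustly combine an agent's own message with those of its neighbors."""
    return trimmed_mean([own_msg] + list(neighbor_msgs), num_trim)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.zeros(2)
    grad = np.array([1.0, -1.0])  # hypothetical stochastic gradient
    own = local_update(x, grad, alpha=0.1, sigma=0.01, rng=rng)
    neighbors = [np.array([-0.1, 0.1]), np.array([-0.12, 0.09]),
                 np.array([50.0, -50.0])]  # last message mimics a Byzantine agent
    print(aggregate(own, neighbors, num_trim=1))
```

Raising `sigma` hides more of each agent's message but also widens the spread of honest messages that the trimmed mean has to work with, which is the tension between privacy and robustness that the abstract refers to.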

Paper Structure (31 sections, 9 theorems, 87 equations, 9 figures, 4 tables, 1 algorithm)

Introduction
Our Contributions
Related Works
Paper Organization
Problem Statement
Privacy-preserving and Byzantine-robust Decentralized Learning
Learning Error and Privacy Guarantee
Disagreement Measure
Learning Error
Privacy Guarantee
Tradeoff between Privacy Preservation and Byzantine-Robustness
Contraction Constants and Virtual Mixing Matrices of Robust Aggregation Rules
Robust Aggregation Rules
Contraction Constants and Virtual Mixing Matrices
Contraction Constants and Virtual Mixing Matrices on Fully Connected Topology: A Case Study
...and 16 more sections

Key Result

Theorem 1

(Disagreement measure). Suppose that the robust aggregation rules $\{\mathcal{A}_n\}_{n\in\mathcal{R}}$ in the paper's algorithm satisfy the local robustness-of-aggregation inequality, and that $\rho$ satisfies [...]. Set the step size $\alpha^k = \frac{8}{\mu (k+k_0)}$, where $k_0$ is sufficiently large, and set $\sigma^k = C \alpha^k$. Under the paper's assumptions (from network connection through gradients), for the algorithm [...]
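
The step-size and noise schedules named in the theorem, $\alpha^k = \frac{8}{\mu (k+k_0)}$ and $\sigma^k = C \alpha^k$, can be sketched as follows. The function names and the numeric values of `mu`, `k0`, and `C` are hypothetical and chosen only to show how the noise level decays together with the step size.

```python
def step_size(k, mu, k0):
    """Diminishing step size alpha^k = 8 / (mu * (k + k0))."""
    return 8.0 / (mu * (k + k0))


def noise_level(k, mu, k0, C):
    """Noise level sigma^k = C * alpha^k, proportional to the step size."""
    return C * step_size(k, mu, k0)


if __name__ == "__main__":
    mu, k0, C = 2.0, 4, 0.5  # illustrative values, not taken from the paper
    for k in (0, 4, 12):
        print(k, step_size(k, mu, k0), noise_level(k, mu, k0, C))
```

Because $\sigma^k$ is tied to $\alpha^k$, the injected noise shrinks at the same $1/k$ rate as the step size.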

Figures (9)

Figure 1: Different robust aggregation rules in non-i.i.d. setting on MNIST.
Figure 2: Different robust aggregation rules in i.i.d. setting on MNIST.
Figure 3: Different robust aggregation rules under different noise levels in non-i.i.d. setting on MNIST.
Figure 4: Different robust aggregation rules under different noise levels in i.i.d. setting on MNIST.
Figure 5: Different robust aggregation rules in non-i.i.d. setting on MNIST for the network of 100 agents.
...and 4 more figures

Theorems & Definitions (23)

Remark 1
Definition 1
Remark 2
Remark 3
Theorem 1
Theorem 2
Corollary 1
Remark 4
Definition 2: $(\varepsilon,\delta)$-differential privacy [Dwork2006]
Definition 3
...and 13 more
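
For reference, the standard form of $(\varepsilon,\delta)$-differential privacy named in Definition 2 is given below. The symbols $\mathcal{M}$, $\mathcal{D}$, $\mathcal{D}'$, and $\mathcal{S}$ are notation introduced here, and the paper's own statement may use a different notion of adjacency. A randomized mechanism $\mathcal{M}$ is $(\varepsilon,\delta)$-differentially private if, for any two adjacent datasets $\mathcal{D}$ and $\mathcal{D}'$ and any measurable set of outputs $\mathcal{S}$,

$$
\Pr[\mathcal{M}(\mathcal{D}) \in \mathcal{S}] \le e^{\varepsilon} \, \Pr[\mathcal{M}(\mathcal{D}') \in \mathcal{S}] + \delta .
$$

Smaller $\varepsilon$ and $\delta$ correspond to a stronger privacy guarantee.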
