Mean Aggregator is More Robust than Robust Aggregators under Label Poisoning Attacks on Distributed Heterogeneous Data

Jie Peng; Weiyu Li; Stefan Vlaski; Qing Ling

TL;DR

This work studies the robustness of distributed learning to label poisoning attacks and shows that, when the data across workers are sufficiently heterogeneous, the simple mean aggregator attains an order-optimal learning error and can outperform state-of-the-art robust aggregators. The authors analyze a distributed stochastic momentum framework and derive error bounds for both the mean aggregator and $\rho$-robust aggregators: the latter incur a neighborhood error proportional to $\rho^2\xi^2$, while the mean aggregator’s error scales with $\delta^2A^2$. They also establish a lower bound for identity-invariant methods, which indicates that the mean aggregator is order-optimal under label poisoning when heterogeneity is high. Experiments on convex and nonconvex problems with MNIST and CIFAR-10 data corroborate the theory and suggest a practical guideline: when heterogeneity is high and the attack strength is moderate, the mean aggregator maintains performance and may reduce the need for complex robust-aggregation schemes.

Abstract

Robustness to malicious attacks is of paramount importance for distributed learning. Existing works usually consider the classical Byzantine attack model, which assumes that some workers can send arbitrarily malicious messages to the server and disturb the aggregation steps of the distributed learning process. To defend against such worst-case Byzantine attacks, various robust aggregators have been proposed. They are proven to be effective and much superior to the often-used mean aggregator. In this paper, however, we demonstrate that the robust aggregators are too conservative for a class of weak but practical malicious attacks, known as label poisoning attacks, where the sample labels of some workers are poisoned. Surprisingly, we are able to show that the mean aggregator is more robust than the state-of-the-art robust aggregators in theory, given that the distributed data are sufficiently heterogeneous. In fact, the learning error of the mean aggregator is proven to be order-optimal in this case. Experimental results corroborate our theoretical findings, showing the superiority of the mean aggregator under label poisoning attacks.
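To make the contrast between the two kinds of aggregator concrete, here is a minimal sketch that compares the mean aggregator with a coordinate-wise trimmed mean, used here only as one familiar example of a robust aggregator. The function names, the `trim` parameter, and the toy messages are assumptions made for illustration and are not taken from the paper.

```python
def mean_aggregate(messages):
    """Average the workers' messages coordinate by coordinate."""
    n = len(messages)
    return [sum(column) / n for column in zip(*messages)]


def trimmed_mean_aggregate(messages, trim):
    """Coordinate-wise trimmed mean (illustrative robust aggregator).

    For each coordinate, the `trim` smallest and `trim` largest values
    are discarded and the remaining values are averaged.
    """
    n = len(messages)
    if not 0 <= 2 * trim < n:
        raise ValueError("need 0 <= 2 * trim < number of workers")
    aggregated = []
    for column in zip(*messages):
        kept = sorted(column)[trim:n - trim]
        aggregated.append(sum(kept) / len(kept))
    return aggregated


# Hypothetical messages from four honest workers with heterogeneous data:
# only the last worker holds data that excites the second coordinate.
messages = [
    [1.0, 0.0],
    [2.0, 0.0],
    [3.0, 0.0],
    [4.0, 10.0],
]

avg = mean_aggregate(messages)                 # [2.5, 2.5]
trimmed = trimmed_mean_aggregate(messages, 1)  # [2.5, 0.0]
print(avg, trimmed)
```

In this toy case no worker is malicious, yet the trimmed mean discards the last worker's legitimate contribution to the second coordinate, while the mean keeps it. This is only an intuition for why robust aggregators can be conservative on heterogeneous data; it does not reproduce the paper's analysis.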

Paper Structure (27 sections, 14 theorems, 155 equations, 12 figures, 3 tables, 1 algorithm)

Introduction
Related Works
Problem Formulation
Convergence Analysis
Justification of the Assumption on the Bounded Effect of Poisoned Local Gradients
Main Results
Numerical Experiments
Experimental Settings
Convex Case
Nonconvex Case
Impacts of Heterogeneity and Attack Strengths
Conclusions
Analysis of Distributed Softmax Regression
Bounded Gradients of Local Costs
Proof of the Lemma on the Bound of Softmax Regression
...and 12 more sections

Key Result

Lemma 2

Consider the distributed softmax regression problem where the local costs of the workers take the forms of the local cost function of softmax regression and its poisoned counterpart. Therein, the poisoned workers are under label poisoning attacks, with arbitrary fractions ...
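The lemma statement is cut off here, so the following is not a restatement of it. It is a sketch of a related elementary fact: for a single sample in softmax regression, changing only the label changes the cross-entropy gradient by a bounded amount, namely $\sqrt{2}\,\|x\|$ in Frobenius norm whenever the label actually changes. The flipping rule `flip_label` (mapping label $y$ to $K-1-y$), the weights `W`, and the sample `x` are hypothetical choices made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax of a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_grad(W, x, y):
    """Cross-entropy gradient w.r.t. W for one sample: (p - onehot(y)) x^T."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    p = softmax(logits)
    return [[(p[k] - (1.0 if k == y else 0.0)) * xi for xi in x]
            for k in range(len(W))]


def frobenius_distance(A, B):
    """Frobenius norm of A - B for two equally shaped nested lists."""
    return math.sqrt(sum((a - b) ** 2
                         for row_a, row_b in zip(A, B)
                         for a, b in zip(row_a, row_b)))


def flip_label(y, num_classes):
    """Hypothetical static flipping rule: y -> num_classes - 1 - y."""
    return num_classes - 1 - y


# Hypothetical model (4 classes, 2 features) and one sample with ||x|| = 5.
W = [[0.1, -0.2], [0.0, 0.3], [-0.4, 0.5], [0.2, 0.2]]
x = [3.0, 4.0]
y = 0

g_clean = softmax_grad(W, x, y)
g_poisoned = softmax_grad(W, x, flip_label(y, len(W)))

# The predicted probabilities cancel, leaving (onehot(y') - onehot(y)) x^T,
# whose Frobenius norm is sqrt(2) * ||x|| regardless of W.
gap = frobenius_distance(g_clean, g_poisoned)
print(gap)
```

Because the predicted probabilities do not depend on the label, the gap depends only on the feature vector and not on the model, which is one informal way to see why label poisoning is a weaker perturbation than an arbitrary Byzantine message.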

Figures (12)

Figure 1: Accuracies of softmax regression on the MNIST dataset under static label flipping attacks.
Figure 2: Accuracies of softmax regression on the MNIST dataset under dynamic label flipping attacks.
Figure 3: Heterogeneity of regular local gradients (the smallest $\xi$ satisfying the bounded-heterogeneity assumption) and disturbance of poisoned local gradients (the smallest $A$ satisfying the assumption on the bounded effect of poisoned local gradients) in softmax regression on the MNIST dataset, under static label flipping and dynamic label flipping attacks.
Figure 4: Accuracies of two-layer perceptrons on the MNIST dataset and convolutional neural networks on the CIFAR10 dataset under static label flipping attacks.
Figure 5: Accuracies of two-layer perceptrons on the MNIST dataset and convolutional neural networks on the CIFAR10 dataset under dynamic label flipping attacks.
...and 7 more figures

Theorems & Definitions (18)

Definition 1: Label poisoning attacks
Lemma 2
Lemma 3
Definition 4: $\rho$-robust aggregator
Lemma 5
Remark 6
Theorem 7
Theorem 8
Theorem 9
Remark 10
...and 8 more
