Near-Optimal Resilient Aggregation Rules for Distributed Learning Using 1-Center and 1-Mean Clustering with Outliers

Yuhao Yi, Ronghui You, Hong Liu, Changxin Liu, Yuan Wang, Jiancheng Lv

TL;DR

The paper addresses Byzantine faults in distributed learning by casting robust aggregation as $1$-center and $1$-mean clustering with outliers. It introduces $2$-approximation aggregators CenterwO and MeanwO to achieve near-optimal resilience under multiple robustness criteria, and reveals that no single rule dominates under two attack types (sneak and siege). To resolve this, it proposes a two-phase framework, 2PRASHB, that generates two candidate models and lets honest clients vote to select the winner, balancing security and performance. The approach yields provable resilience guarantees and substantial empirical gains on both homogeneous and heterogeneous data distributions across image-classification tasks. Overall, the work provides a principled, scalable defense for resilient distributed learning with practical impact on large-scale systems.

Abstract

Byzantine machine learning has garnered considerable attention in light of the unpredictable faults that can occur in large-scale distributed learning systems. The key to resilience against Byzantine machines in distributed learning is a resilient aggregation mechanism. Although many resilient aggregation rules have been proposed, they are designed in an ad hoc manner, which makes it harder to compare, analyze, and improve the rules across performance criteria. This paper studies near-optimal aggregation rules using clustering in the presence of outliers. Our outlier-robust clustering approach utilizes geometric properties of the update vectors provided by workers. Our analysis shows that constant approximations to the 1-center and 1-mean clustering problems with outliers provide near-optimal resilient aggregators for metric-based criteria, which have been proven to be crucial in the homogeneous and heterogeneous cases, respectively. In addition, we discuss two contradicting types of attacks under which no single aggregation rule is guaranteed to improve upon the naive average. Based on the discussion, we propose a two-phase resilient aggregation framework. We run experiments for image classification using a non-convex loss function. The proposed algorithms outperform previously known aggregation rules by a large margin with both homogeneous and heterogeneous data distributions among non-faulty workers. Code and appendix are available at https://github.com/jerry907/AAAI24-RASHB.
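To make the idea of aggregation by clustering with outliers concrete, here is a minimal Python sketch. It is not the paper's CenterwO or MeanwO algorithm and carries no approximation guarantee: it is a simple alternating heuristic for the 1-mean-with-outliers objective, assuming the server knows an upper bound `f` on the number of Byzantine workers. The function name `trimmed_mean_aggregate` and the parameter `iters` are hypothetical, chosen only for illustration.

```python
import numpy as np

def trimmed_mean_aggregate(updates, f, iters=10):
    """Heuristic for 1-mean clustering with outliers (illustrative only).

    updates: array-like of shape (n, d), one update vector per worker.
    f: assumed upper bound on the number of Byzantine workers.
    """
    x = np.asarray(updates, dtype=float)
    n = x.shape[0]
    if not 0 <= f < n / 2:
        raise ValueError("need 0 <= f < n/2")
    keep = n - f
    # Start from the coordinate-wise median, a robust initial center.
    center = np.median(x, axis=0)
    for _ in range(iters):
        # Treat the n - f vectors closest to the current center as inliers.
        dist = np.linalg.norm(x - center, axis=1)
        inliers = np.argsort(dist)[:keep]
        new_center = x[inliers].mean(axis=0)
        if np.allclose(new_center, center):
            break
        center = new_center
    return center

if __name__ == "__main__":
    # Eight honest updates near (1, 1) and two far-away Byzantine updates.
    honest = np.array([[1 + 0.1 * i, 1 - 0.1 * i] for i in range(8)])
    byzantine = np.array([[100.0, 100.0], [100.0, 100.0]])
    all_updates = np.vstack([honest, byzantine])
    print("naive average:  ", all_updates.mean(axis=0))
    print("trimmed average:", trimmed_mean_aggregate(all_updates, f=2))
```

In this toy setting the far-away vectors are discarded and the result coincides with the honest average. A heuristic like this only covers easily separable outliers; the two attack types discussed in the abstract are exactly the cases where a single fixed rule is not guaranteed to help.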

Paper Structure

This paper contains 30 sections, 3 theorems, 26 equations, 4 figures, 5 tables, and 4 algorithms.

Key Result

Theorem 1

Suppose the bounded-variance and bounded-heterogeneity assumptions hold, and recall that $\mathcal{L}_{\mathcal{H}}(\cdot)$ is $L$-smooth. Consider the skeleton algorithm and define $\text{Res}_T= T^{-1}\sum_{t=1}^T\mathbb{E}\left[ \left\lVert\nabla \mathcal{L}_{\mathcal{H}}(\theta_{t-1})\right\

Figures (4)

  • Figure 1: Two types of attacks produce a dilemma for any robust aggregation rule. Blue circles are update vectors produced by honest clients. Black squares show update vectors provided by Byzantine clients. Aggregated values of the naive averaging rule and the 1-center/mean with outliers rules (inner and outer averaging) are shown in the two panels.
  • Figure 2: Performance comparison on the heterogeneous datasets at an adversarial rate of $0.2$, with the x- and y-axes representing testing accuracy and step number, respectively.
  • Figure 3: Performance comparison on the homogeneous datasets at an adversarial rate of $0.2$.
  • Figure 4: Performance comparison on the heterogeneous datasets at an adversarial rate of $0.2$.

Theorems & Definitions (16)

  • Definition 1: $(f,\varepsilon)$-Byzantine resilience
  • Definition 2: $(f,\lambda)$-resilient averaging
  • Definition 3: $(\delta_{\max}, \zeta)$-ARAgg
  • Definition 4: $(f,\kappa)$-robustness
  • Definition 5: $(f,\xi)$-robust averaging
  • Theorem 1
  • Definition 6: 1-center clustering with outliers, or minimum enclosing ball with outliers
  • Definition 7: 1-mean clustering with outliers
  • Definition 8
  • Lemma 1
  • ...and 6 more
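
For orientation, the two clustering problems named in Definitions 6 and 7 are usually stated as follows. This is the standard textbook formulation, written under assumed notation (input vectors $x_1,\dots,x_n \in \mathbb{R}^d$ and an outlier budget $f$); the paper's own definitions may differ in normalization or symbols.

```latex
% 1-center clustering with outliers (minimum enclosing ball with outliers):
% choose a center c and n - f inliers S so that the farthest inlier is as close as possible.
\min_{c \in \mathbb{R}^d,\; S \subseteq \{1,\dots,n\},\; |S| = n - f}\;
  \max_{i \in S} \lVert x_i - c \rVert

% 1-mean clustering with outliers:
% choose a center c and n - f inliers S minimizing the sum of squared distances.
\min_{c \in \mathbb{R}^d,\; S \subseteq \{1,\dots,n\},\; |S| = n - f}\;
  \sum_{i \in S} \lVert x_i - c \rVert^2
```

For a fixed inlier set $S$, the optimal center of the second problem is the average of the inliers, which is why the 1-mean variant can be read as averaging over a well-chosen subset of workers.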