A Huber Loss Minimization Approach to Byzantine Robust Federated Learning
Puning Zhao, Fei Yu, Zhiguo Wan
TL;DR
This paper introduces a gradient-aggregation method for Byzantine-robust federated learning based on a multi-dimensional Huber loss. By solving a weighted Huber loss minimization at each round, the server robustly aggregates client gradients without requiring exact knowledge of the attack fraction ε, and adapts to unbalanced and heterogeneous data. The authors provide theoretical guarantees under i.i.d. and non-i.i.d. settings, showing near-minimax ε-dependence and robust convergence for strongly convex, convex, and non-convex objectives. A Weiszfeld-inspired algorithm makes the aggregator practical to deploy, with linear-time per-iteration cost, and numerical experiments on synthetic and MNIST data demonstrate strong robustness against diverse Byzantine attacks. Overall, the approach offers a principled, ε-agnostic, scalable solution for robust gradient aggregation in federated learning with realistic data heterogeneity.
Abstract
Federated learning systems are susceptible to adversarial attacks. To combat this, we introduce a novel aggregator based on Huber loss minimization, and provide a comprehensive theoretical analysis. Under the independent and identically distributed (i.i.d.) assumption, our approach has several advantages compared to existing methods. Firstly, it has optimal dependence on $ε$, which stands for the ratio of attacked clients. Secondly, our approach does not need precise knowledge of $ε$. Thirdly, it allows different clients to have unequal data sizes. We then broaden our analysis to include non-i.i.d. data, such that clients have slightly different distributions.
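
To make the idea of aggregation by Huber loss minimization concrete, the following is a minimal sketch and not the authors' exact algorithm. It assumes a single threshold (the hypothetical parameter `threshold`) shared by all clients, optional per-client weights (for example, proportional to local sample sizes), and a standard iteratively reweighted averaging scheme in the spirit of Weiszfeld's method. The function name `huber_aggregate` and all parameter names are illustrative assumptions.

```python
import numpy as np

def huber_aggregate(grads, weights=None, threshold=1.0, n_iter=100, tol=1e-8):
    """Approximately minimize sum_i w_i * phi(s - g_i) over s, where
    phi(u) = 0.5*||u||^2 if ||u|| <= threshold, else threshold*||u|| - 0.5*threshold^2.

    Illustrative sketch only: the threshold choice and weighting are assumptions.
    """
    grads = np.asarray(grads, dtype=float)          # shape (num_clients, dim)
    m = grads.shape[0]
    if weights is None:
        w = np.full(m, 1.0 / m)                     # equal weights by default
    else:
        w = np.asarray(weights, dtype=float)
        w = w / w.sum()                             # normalize to sum to 1

    s = np.median(grads, axis=0)                    # robust starting point
    for _ in range(n_iter):
        dist = np.linalg.norm(grads - s, axis=1)    # distance of each gradient to s
        # Gradients within the threshold keep full weight; distant ones are
        # down-weighted by threshold / distance (the linear part of the loss).
        scale = np.minimum(1.0, threshold / np.maximum(dist, 1e-12))
        coef = w * scale
        s_new = coef @ grads / coef.sum()           # reweighted average
        if np.linalg.norm(s_new - s) < tol:
            s = s_new
            break
        s = s_new
    return s

# Usage: 8 honest gradients near (1, 1) and 2 Byzantine gradients at (100, 100).
rng = np.random.default_rng(0)
honest = 1.0 + 0.1 * rng.standard_normal((8, 2))
byzantine = np.full((2, 2), 100.0)
all_grads = np.vstack([honest, byzantine])
robust = huber_aggregate(all_grads, threshold=1.0)
naive = all_grads.mean(axis=0)
```

Each iteration touches every client gradient once, so the per-iteration cost is linear in the number of clients times the dimension. In this sketch the Byzantine gradients still exert a bounded pull on the result, whereas the plain mean is dragged arbitrarily far by them.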
