Attacking Byzantine Robust Aggregation in High Dimensions
Sarthak Choudhary, Aashish Kolluri, Prateek Saxena
TL;DR
The paper addresses the challenge of computing a robust, mean-like statistic in high-dimensional settings under an $\epsilon$-fraction of Byzantine corruptions. It analyzes strong, dimension-independent aggregation bounds and introduces HiDRA, an untargeted poisoning attack that circumvents practical defenses by exploiting a fundamental computational bottleneck in maximum-variance-direction calculations. The authors prove near-optimal bias bounds of $\Omega(\sqrt{\epsilon d})$ per chunk and demonstrate via extensive experiments that HiDRA can cause drastic drops in model accuracy across standard benchmarks, even when aggregators provide strong theoretical guarantees. The work highlights a critical gap between information-theoretic analyses and practical realizations, suggesting that new defense strategies must address high-dimensional computational challenges to restore robustness in real-world ML training.
Abstract
Training modern neural networks or models typically requires averaging over a sample of high-dimensional vectors. Poisoning attacks can skew or bias the average vectors used to train the model, forcing the model to learn specific patterns or avoid learning anything useful. Byzantine robust aggregation is a principled algorithmic defense against such biasing. Robust aggregators can bound the maximum bias in computing centrality statistics, such as the mean, even when some fraction of inputs are arbitrarily corrupted. Designing such aggregators is challenging when dealing with high dimensions. However, the first polynomial-time algorithms with strong theoretical bounds on the bias have recently been proposed. Their bounds are independent of the number of dimensions, promising a conceptual limit on the power of poisoning attacks in their ongoing arms race against defenses. In this paper, we show a new attack called HiDRA on practical realizations of strong defenses, which subverts their claim of dimension-independent bias. HiDRA highlights a novel computational bottleneck that has not been a concern of prior information-theoretic analysis. Our experimental evaluation shows that our attacks almost completely destroy the model performance, whereas existing attacks with the same goal fail to have much effect. Our findings leave the arms race between poisoning attacks and provable defenses wide open.
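To make the notion of dimension-dependent bias concrete, the following is a minimal toy sketch, not HiDRA and not any aggregator from the paper. It assumes a plain (non-robust) mean, honest vectors drawn from a standard Gaussian, and a hypothetical attacker who shifts every coordinate of an $\epsilon$-fraction of the vectors by one standard deviation. The function names and parameters (`mean_bias_under_shift`, `shift`, `seed`) are illustrative assumptions. Each coordinate moves only slightly, yet the L2 norm of the bias grows as $\epsilon \cdot \text{shift} \cdot \sqrt{d}$.

```python
import math
import random


def mean_vector(vectors):
    """Coordinate-wise arithmetic mean of a list of equal-length vectors."""
    n = len(vectors)
    d = len(vectors[0])
    return [sum(v[j] for v in vectors) / n for j in range(d)]


def l2_norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def mean_bias_under_shift(d, n=200, eps=0.1, shift=1.0, seed=0):
    """L2 distance between the clean mean and the mean after poisoning.

    Toy assumption: the attacker controls round(eps * n) of the n vectors
    and adds `shift` to every coordinate of each vector it controls.
    """
    rng = random.Random(seed)
    n_bad = round(eps * n)
    honest = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    clean_mean = mean_vector(honest)

    # Corrupt the first n_bad vectors; leave the rest untouched.
    poisoned = [[x + shift for x in v] for v in honest[:n_bad]] + honest[n_bad:]
    poisoned_mean = mean_vector(poisoned)

    diff = [p - c for p, c in zip(poisoned_mean, clean_mean)]
    return l2_norm(diff)


if __name__ == "__main__":
    for d in (4, 16, 64):
        print(d, round(mean_bias_under_shift(d), 3))
```

Under these assumptions, quadrupling the dimension doubles the bias, which is the kind of growth that dimension-independent bounds are meant to rule out. A robust aggregator would replace `mean_vector` in this sketch.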
