Robust and Communication-Efficient Federated Learning from Non-IID Data

Felix Sattler; Simon Wiedemann; Klaus-Robert Müller; Wojciech Samek

TL;DR

This work tackles the high communication cost of Federated Learning under non-IID data by introducing Sparse Ternary Compression (STC), a framework that compresses both upstream and downstream updates using sparsification, ternarization to $\{-\mu, 0, \mu\}$, residual accumulation, and Golomb encoding. It further extends STC with server-side downstream compression, a weight-update caching mechanism for partial participation, and redundancy elimination via binarization, achieving strong bidirectional compression with minimal accuracy loss. Empirical results across four learning tasks show that STC consistently outperforms Federated Averaging and signSGD in non-IID, small-batch, and low-participation regimes, while still delivering substantial communication savings in IID settings. The method enables robust, bandwidth-efficient Federated Learning suitable for large-scale IoT deployments, where high-frequency, low-volume communication is preferable to infrequent, high-volume transfers. The results hinge on the encoding and residual mechanisms, which keep updates accurate despite aggressive sparsification.
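
The sparsification, ternarization, and residual accumulation steps named above can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names `stc` and `compress_with_residual`, the `sparsity` parameter, and the choice of $\mu$ as the mean magnitude of the retained entries are assumptions made for this example, and Golomb encoding of the non-zero positions is left out.

```python
import numpy as np


def stc(update, sparsity=0.01):
    """Sketch of sparse ternary compression for one update tensor.

    Assumption: keep the k largest-magnitude entries, where
    k = max(1, round(sparsity * n)), and replace each kept entry by
    sign * mu, with mu the mean magnitude of the kept entries.
    """
    arr = np.asarray(update, dtype=np.float64)
    flat = arr.ravel()
    k = max(1, int(round(sparsity * flat.size)))
    # Indices of the k largest-magnitude entries (their order is irrelevant).
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    mu = np.abs(flat[idx]).mean()
    out = np.zeros_like(flat)
    out[idx] = mu * np.sign(flat[idx])  # values are now in {-mu, 0, +mu}
    return out.reshape(arr.shape)


def compress_with_residual(update, residual, sparsity=0.01):
    """Add the carried-over residual, compress, and return the new residual."""
    accumulated = np.asarray(residual, dtype=np.float64) + update
    compressed = stc(accumulated, sparsity)
    # Whatever was not transmitted is kept locally for the next round.
    return compressed, accumulated - compressed


if __name__ == "__main__":
    u = np.array([0.1, -2.0, 0.05, 3.0, -0.2])
    sent, res = compress_with_residual(u, np.zeros_like(u), sparsity=0.4)
    print(sent)  # two non-zero entries, both with magnitude mu = 2.5
    print(res)   # the part of the update that was not transmitted
```

In this sketch the residual is what lets small entries that are dropped in one round accumulate until they are large enough to be transmitted in a later one.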

Abstract

Federated Learning allows multiple parties to jointly train a deep learning model on their combined data, without any of the participants having to reveal their local data to a centralized server. This form of privacy-preserving collaborative learning, however, comes at the cost of a significant communication overhead during training. To address this problem, several compression methods have been proposed in the distributed training literature that can reduce the amount of required communication by up to three orders of magnitude. These existing methods, however, are only of limited utility in the Federated Learning setting, as they either compress only the upstream communication from the clients to the server (leaving the downstream communication uncompressed) or perform well only under idealized conditions such as an iid distribution of the client data, which typically cannot be found in Federated Learning. In this work, we propose Sparse Ternary Compression (STC), a new compression framework that is specifically designed to meet the requirements of the Federated Learning environment. Our experiments on four different learning tasks demonstrate that STC distinctively outperforms Federated Averaging in common Federated Learning scenarios where clients a) hold non-iid data, b) use small batch sizes during training, or c) are large in number while the participation rate in every communication round is low. We furthermore show that even if the clients hold iid data and use medium-sized batches for training, STC still behaves Pareto-superior to Federated Averaging in the sense that it achieves fixed target accuracies on our benchmarks within both fewer training iterations and a smaller communication budget.
