BiCompFL: Stochastic Federated Learning with Bi-Directional Compression
Maximilian Egger, Rawad Bitar, Antonia Wachter-Zeh, Nir Weinberger, Deniz Gündüz
TL;DR
BiCompFL tackles the downlink bottleneck in stochastic federated learning by introducing bi-directional compression via minimal random coding with side information. It presents two algorithms, BiCompFL-GR and BiCompFL-PR, that exploit global or private shared randomness to send samples from local posteriors and reconstruct the global model with reduced communication. The authors provide contraction-based convergence guarantees and a theoretical framework for KL-divergence costs, including a refined analysis for Bernoulli distributions. Empirically, BiCompFL achieves order-of-magnitude reductions in communication costs across IID and non-IID data on MNIST, Fashion-MNIST, and CIFAR-10 while maintaining state-of-the-art accuracies, and demonstrates downlink partitioning to further cut costs in favorable regimes.
Abstract
We address the prominent communication bottleneck in federated learning (FL). We specifically consider stochastic FL, in which models or compressed model updates are specified by distributions rather than deterministic parameters. Stochastic FL offers a principled approach to compression, and has been shown to reduce the communication load under perfect downlink transmission from the federator to the clients. However, in practice, both the uplink and downlink communications are constrained. We show that bi-directional compression for stochastic FL has inherent challenges, which we address by introducing BiCompFL. BiCompFL is shown experimentally to reduce the communication cost by an order of magnitude compared to multiple benchmarks, while maintaining state-of-the-art accuracies. Theoretically, we study the communication cost of BiCompFL through a new analysis of an importance-sampling-based technique, which exposes the interplay between uplink and downlink communication costs.
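To make the importance-sampling idea concrete, the following is a minimal sketch of generic minimal random coding for a vector of independent Bernoulli parameters. It is an illustration under stated assumptions, not the BiCompFL-GR or BiCompFL-PR algorithm: it omits side information, partitioning, and aggregation, and the names candidates, mrc_encode, mrc_decode, seed, and n_candidates are hypothetical. Sender and receiver are assumed to share a prior and a random seed, so that only an index needs to be transmitted.

```python
import math
import numpy as np

def candidates(seed, prior, n_candidates):
    # Both parties regenerate the same candidate pool from the shared seed.
    rng = np.random.default_rng(seed)
    return (rng.random((n_candidates, prior.size)) < prior).astype(np.int8)

def mrc_encode(posterior, prior, seed, n_candidates, rng):
    # Sender: weight each prior sample by q(x)/p(x), then sample one index.
    # Assumes all probabilities lie strictly between 0 and 1.
    x = candidates(seed, prior, n_candidates)
    log_q = x * np.log(posterior) + (1 - x) * np.log1p(-posterior)
    log_p = x * np.log(prior) + (1 - x) * np.log1p(-prior)
    log_w = (log_q - log_p).sum(axis=1)
    w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
    return int(rng.choice(n_candidates, p=w / w.sum()))

def mrc_decode(index, prior, seed, n_candidates):
    # Receiver: rebuild the pool and pick the transmitted candidate.
    return candidates(seed, prior, n_candidates)[index]

def index_cost_bits(n_candidates):
    # Bits needed to transmit one candidate index.
    return math.log2(n_candidates)
```

In this sketch the transmitted message is a single index costing log2(n_candidates) bits, regardless of the dimension of the sampled vector. How closely the decoded sample follows the posterior depends on how large the candidate pool is relative to the KL divergence between posterior and prior, which is the kind of cost the paper's analysis studies.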
