Expanding the Reach of Federated Learning by Reducing Client Resource Requirements
Sebastian Caldas, Jakub Konečný, H. Brendan McMahan, Ameet Talwalkar
TL;DR
This paper tackles the communication bottleneck in Federated Learning (FL) on heterogeneous edge networks with two strategies: lossy compression of the global model that clients download from the server (combining basis transforms, subsampling, probabilistic quantization, and Kashin's representation), and Federated Dropout, in which each client trains a smaller sub-model of the global model and sends back an update for that sub-model only. Combined with existing client-to-server compression, these strategies yield large end-to-end reductions in communication and local computation while maintaining accuracy, as demonstrated on MNIST, CIFAR-10, and EMNIST. The results show savings of up to $14\times$ in server-to-client communication, $28\times$ in client-to-server communication, and $1.7\times$ in local computation, with a manageable impact on convergence speed, thereby enabling higher-capacity models and broader user participation. The work also discusses practical considerations and outlines future research directions, including fairness and personalized sub-model aggregation in FL.
Abstract
Communication on heterogeneous edge networks is a fundamental bottleneck in Federated Learning (FL), restricting both model capacity and user participation. To address this issue, we introduce two novel strategies to reduce communication costs: (1) the use of lossy compression on the global model sent server-to-client; and (2) Federated Dropout, which allows users to efficiently train locally on smaller subsets of the global model and also provides a reduction in both client-to-server communication and local computation. We empirically show that these strategies, combined with existing compression approaches for client-to-server communication, collectively provide up to a $14\times$ reduction in server-to-client communication, a $1.7\times$ reduction in local computation, and a $28\times$ reduction in upload communication, all without degrading the quality of the final model. We thus comprehensively reduce FL's impact on client device resources, allowing higher capacity models to be trained, and a more diverse set of users to be reached.
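To make the idea of training on "smaller subsets of the global model" concrete, the sketch below illustrates one way a sub-model could be carved out of a single dense layer and its update mapped back into the full layer. It is a minimal illustration under stated assumptions, not the authors' implementation: it assumes one fully-connected layer stored as a list of rows, assumes sub-models are formed by keeping a fixed number of units, and uses hypothetical helper names (`select_kept_units`, `extract_submodel`, `merge_update`).

```python
import random


def select_kept_units(num_units, keep_fraction, rng):
    """Pick a fixed-size, sorted subset of unit indices to keep (assumed scheme)."""
    num_kept = max(1, round(num_units * keep_fraction))
    return sorted(rng.sample(range(num_units), num_kept))


def extract_submodel(weights, kept_in, kept_out):
    """Slice a dense layer (rows = inputs, columns = outputs) down to the kept units."""
    return [[weights[i][j] for j in kept_out] for i in kept_in]


def merge_update(global_weights, sub_update, kept_in, kept_out):
    """Add a client's sub-model update into a copy of the full-size layer.

    Entries that were dropped for this client are left unchanged.
    """
    merged = [row[:] for row in global_weights]
    for a, i in enumerate(kept_in):
        for b, j in enumerate(kept_out):
            merged[i][j] += sub_update[a][b]
    return merged


# Usage: a 4 x 8 layer where the client keeps 75% of the 8 output units.
rng = random.Random(0)
full_layer = [[0.0] * 8 for _ in range(4)]
kept_in = list(range(4))                      # keep every input unit
kept_out = select_kept_units(8, 0.75, rng)    # keep 6 of the 8 output units
sub_layer = extract_submodel(full_layer, kept_in, kept_out)
print(len(sub_layer), len(sub_layer[0]))      # 4 6
```

In this toy setting the client downloads, trains, and uploads 24 weights instead of 32, which is the sense in which a sub-model reduces communication in both directions as well as local computation. How kept units are chosen per round and how overlapping updates from many clients are averaged are left out of the sketch.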
