Think Locally, Act Globally: Federated Learning with Local and Global Representations
Paul Pu Liang, Terrance Liu, Liu Ziyin, Nicholas B. Allen, Randy P. Auerbach, David Brent, Ruslan Salakhutdinov, Louis-Philippe Morency
TL;DR
LG-FedAvg introduces a local-global federated learning framework that trains compact local representations on edge devices and a smaller global model operating on these representations, reducing communication while preserving accuracy. The authors provide a theory-backed bias-variance analysis showing the ensemble can mitigate both data variance and device variance, and they validate the approach across image, multimodal, and mobile sensing tasks, including fairness-aware variants. Empirical results demonstrate improved communication efficiency, robustness to non-iid data, and personalization capability, with applications to mood prediction and online data shifts. The work offers a versatile, scalable framework for private, heterogeneous, and fair federated learning with broad potential impact on real-world deployment.
Abstract
Federated learning is a method of training models on private data distributed over multiple devices. To keep device data private, the global model is trained by communicating only parameters and updates, which poses scalability challenges for large models. To this end, we propose a new federated learning algorithm that jointly learns compact local representations on each device and a global model across all devices. As a result, the global model can be smaller since it operates only on local representations, reducing the number of communicated parameters. Theoretically, we provide a generalization analysis which shows that a combination of local and global models reduces both the variance in the data and the variance across device distributions. Empirically, we demonstrate that local models enable communication-efficient training while retaining performance. We also evaluate on the task of personalized mood prediction from real-world mobile data where privacy is key. Finally, local models handle heterogeneous data from new devices, and learn fair representations that obfuscate protected attributes such as race, age, and gender.
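The communication saving described above can be illustrated with a minimal sketch. This is not the authors' implementation: the `Device` class, the function names, the toy parameter values, and the sample-weighted averaging rule (in the style of federated averaging) are all assumptions made for illustration, and local training is omitted entirely. The sketch only shows that the server averages and broadcasts the global parameters, while each device's local representation parameters never leave the device.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Device:
    # Hypothetical container for one device's state (illustrative only).
    local_params: List[float]   # private representation parameters, never sent
    global_params: List[float]  # this device's copy of the shared global model
    num_samples: int            # used to weight the server-side average


def aggregate_global(devices: List[Device]) -> List[float]:
    """Sample-weighted average of the global parameters only (assumed rule)."""
    total = sum(d.num_samples for d in devices)
    dim = len(devices[0].global_params)
    return [
        sum(d.num_samples * d.global_params[i] for d in devices) / total
        for i in range(dim)
    ]


def communication_round(devices: List[Device]) -> int:
    """Average the global parameters, broadcast the result to every device,
    and return how many parameters were uploaded in this round.
    Local parameters are left untouched."""
    averaged = aggregate_global(devices)
    for d in devices:
        d.global_params = list(averaged)
    return len(averaged) * len(devices)


# Toy setup: 4 local parameters and 2 global parameters per device.
devices = [
    Device(local_params=[0.1, 0.2, 0.3, 0.4], global_params=[1.0, 3.0], num_samples=10),
    Device(local_params=[0.9, 0.8, 0.7, 0.6], global_params=[3.0, 5.0], num_samples=30),
]

uploaded = communication_round(devices)
# Upload cost if every device sent its full model instead.
full_model_upload = sum(len(d.local_params) + len(d.global_params) for d in devices)
print(uploaded, full_model_upload)
```

In this toy setup each device uploads 2 of its 6 parameters per round, so the upload cost is 4 parameters instead of 12. The actual saving in practice depends on how the model is split between its local and global parts.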
