Federated Learning on Virtual Heterogeneous Data with Local-global Distillation
Chun-Yin Huang, Ruinan Jin, Can Zhao, Daguang Xu, Xiaoxiao Li
TL;DR
FedLGD tackles data heterogeneity in federated learning by embedding dataset distillation into the FL loop using virtual data at both client and server. It combines Local Data Distillation via Iterative Distribution Matching in feature space with Global Data Distillation through Federated Gradient Matching to distill representative global information and harmonize domain shifts, complemented by a regularization term to align local and global representations. The approach is validated on DIGITS, CIFAR10C, and RETINA, showing improved accuracy over state-of-the-art heterogeneous FL methods and demonstrating robustness to varying numbers of images per class (IPC), client numbers, and domain shifts; ablations illustrate the importance of the regularizer and the number of distillation steps. FedLGD also suggests privacy advantages by relying on synthesized data and gradient-based anchors, offering a practical path toward more efficient and privacy-conscious FL in heterogeneous environments.
Abstract
While Federated Learning (FL) is gaining popularity for training machine learning models in a decentralized fashion, numerous challenges persist, such as asynchronization, computational expenses, data heterogeneity, and gradient and membership privacy attacks. Lately, dataset distillation has emerged as a promising solution for addressing the aforementioned challenges by generating a compact synthetic dataset that preserves a model's training efficacy. However, we discover that using distilled local datasets can amplify the heterogeneity issue in FL. To address this, we propose Federated Learning on Virtual Heterogeneous Data with Local-Global Dataset Distillation (FedLGD), where we seamlessly integrate dataset distillation algorithms into the FL pipeline and train FL using a smaller synthetic dataset (referred to as virtual data). Specifically, to harmonize the domain shifts, we propose iterative distribution matching to inpaint global information into local virtual data and use federated gradient matching to distill global virtual data that serve as anchor points to rectify heterogeneous local training, without compromising data privacy. We experiment on both benchmark and real-world datasets that contain heterogeneous data from different sources, and further scale up to an FL scenario that contains a large number of clients with heterogeneous and class-imbalanced data. Our method outperforms state-of-the-art heterogeneous FL algorithms under various settings. Our code is available at https://github.com/ubc-tea/FedLGD.
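The abstract names two matching objectives without defining them. As a rough illustration only, the sketch below shows one common form of each from the dataset distillation literature: a distribution matching loss that compares mean feature embeddings of real and virtual data, and a gradient matching loss that sums a cosine distance over per-layer gradients. The function names, the plain-list representation of features and gradients, and the exact loss forms are assumptions made for this example and are not taken from the FedLGD code.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def distribution_matching_loss(real_feats, virtual_feats):
    """Squared distance between the mean feature embeddings of two sets.

    Assumed form: a simple mean-embedding surrogate for distribution
    matching; the paper's exact objective may differ.
    """
    mu_real = mean_vector(real_feats)
    mu_virtual = mean_vector(virtual_feats)
    return sum((a - b) ** 2 for a, b in zip(mu_real, mu_virtual))


def gradient_matching_loss(grads_real, grads_virtual, eps=1e-12):
    """Sum over layers of (1 - cosine similarity) between flattened gradients.

    Assumed form: each argument is a list with one flat gradient vector
    per layer; eps guards against division by zero.
    """
    total = 0.0
    for g_r, g_v in zip(grads_real, grads_virtual):
        dot = sum(a * b for a, b in zip(g_r, g_v))
        norm_r = math.sqrt(sum(a * a for a in g_r))
        norm_v = math.sqrt(sum(b * b for b in g_v))
        total += 1.0 - dot / (norm_r * norm_v + eps)
    return total


# Hypothetical toy inputs: two real feature vectors, one virtual one.
real = [[1.0, 2.0], [3.0, 4.0]]
virtual = [[2.0, 3.0]]
dm = distribution_matching_loss(real, virtual)

# One "layer" whose real and virtual gradients are orthogonal.
gm = gradient_matching_loss([[1.0, 0.0]], [[0.0, 1.0]])
```

In a setup like the one the abstract describes, a loss of the first kind would be minimized with respect to the local virtual images on each client, and a loss of the second kind with respect to the global virtual images on the server; both would normally be computed on tensors with automatic differentiation rather than on Python lists.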
