DFDG: Data-Free Dual-Generator Adversarial Distillation for One-Shot Federated Learning

Kangyang Luo; Shuai Wang; Yexuan Fu; Renrong Shao; Xiang Li; Yunshi Lan; Ming Gao; Jinlong Shu

TL;DR

This work tackles one-shot federated learning under data and model heterogeneity without relying on public data. It introduces DFDG, a data-free framework that trains dual conditional generators adversarially and then uses dual-model distillation to update the global model. Each generator is trained for fidelity, transferability, and diversity, and a cross-divergence loss reduces the overlap between the two generators' output spaces so that together they cover more of the local models' training space. Experiments on multiple image classification datasets show significant accuracy gains over state-of-the-art baselines and support the use of dual generators and cross-divergence learning for improving generalization in heterogeneous one-shot FL. The approach is communication-efficient and privacy-conscious; privacy safeguards and scaling to more than two generators are noted as considerations for future work.

Abstract

Federated Learning (FL) is a distributed machine learning scheme in which clients collaboratively train a global model by sharing model information rather than their private datasets. In light of communication and privacy concerns, one-shot FL, which uses a single communication round, has emerged as a promising solution. However, existing one-shot FL methods either require public datasets, focus on model-homogeneous settings, or distill limited knowledge from local models, making it difficult or even impractical to train a robust global model. To address these limitations, we propose a new data-free dual-generator adversarial distillation method (namely DFDG) for one-shot FL, which explores a broader portion of the local models' training space by training dual generators. DFDG is executed in an adversarial manner and comprises two parts: dual-generator training and dual-model distillation. In dual-generator training, we train each generator with respect to fidelity, transferability, and diversity to ensure its utility, and additionally tailor a cross-divergence loss to lessen the overlap of the dual generators' output spaces. In dual-model distillation, the trained dual generators work together to provide the training data for updates of the global model. Finally, our extensive experiments on various image classification tasks show that DFDG achieves significant accuracy gains compared to SOTA baselines.
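To make the dual-model distillation step more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the teacher is a weighted average of the client models' logits and that the global model (student) is trained to minimize the KL divergence from the teacher on synthetic samples pooled from both generators. The abstract does not state the exact form of $\mathcal{L}_{dmd}$, so the KL objective and all function names here (`ensemble_logits`, `distillation_loss`) are illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_logits(client_logits, weights=None):
    """Weighted average of per-client logits for one sample.

    Assumption: the teacher is a logit-averaging ensemble of client models.
    Uniform weights are used when none are given.
    """
    n = len(client_logits)
    if weights is None:
        weights = [1.0 / n] * n
    num_classes = len(client_logits[0])
    return [
        sum(w * logits[c] for w, logits in zip(weights, client_logits))
        for c in range(num_classes)
    ]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def distillation_loss(samples_g1, samples_g2):
    """Mean KL(teacher || student) over samples pooled from both generators.

    Each sample is a pair (client_logits, student_logits), where
    client_logits is a list with one logit vector per client model.
    """
    pooled = list(samples_g1) + list(samples_g2)
    total = 0.0
    for client_logits, student_logits in pooled:
        teacher = softmax(ensemble_logits(client_logits))
        student = softmax(student_logits)
        total += kl_divergence(teacher, student)
    return total / len(pooled)
```

In this sketch the loss is close to zero when the student's predictive distribution matches the ensemble's on every pooled sample and grows as the two distributions diverge. In a real training loop the student logits would come from the global model and the loss would be minimized by gradient descent.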

Paper Structure (18 sections, 7 equations, 14 figures, 10 tables, 4 algorithms)

Introduction
Related Work
Notations
The Proposed Method
Dual-Generator Training
Dual-Model Distillation
Experiments
Experimental Settings
Results Comparison
Ablation Study
Discussion
Conclusion
Acknowledgments
Pseudocodes
Datasets
...and 3 more sections

Figures (14)

Figure 1: A sketch of real data space ($\Omega$) as well as output spaces ($O_1$, $O_2$) of dual generators ($G_1$, $G_2$) that mimic $\Omega$ using data-free knowledge distillation.
Figure 2: Illustration of DFDG. DFDG works on the server and contains two phases, dual-generator training and dual-model distillation, where $\mathcal{L}_{gen}$ and $\mathcal{L}_{dmd}$ are the loss objectives of the dual conditional generators and the global model, respectively.
Figure 3: A sketch of synthetic data and the decision boundaries of the global model (student) and the ensemble model (teacher). (a): synthetic data (red circles) lie far from the decision boundary $d_T$. (b): synthetic data (black circles) lie near the decision boundary $d_T$. (c): synthetic data (yellow and purple circles) cross the decision boundary $d_T$.
Figure 4: Selected accuracy curves of DFDG and baselines on FMNIST and CIFAR-10.
Figure 5: Selected accuracy curves of DFDG and baselines on SVHN and CINIC-10.
...and 9 more figures
