Enhancing One-Shot Federated Learning Through Data and Ensemble Co-Boosting

Rong Dai; Yonggang Zhang; Ang Li; Tongliang Liu; Xun Yang; Bo Han

TL;DR

This paper tackles the bottleneck of one-shot federated learning (OFL) by addressing the dependence of the server model on both the synthesized data and the ensemble of client models. It introduces Co-Boosting, an adversarial, mutually reinforcing framework in which hard samples generated from the current ensemble drive the reweighting of client logits, and the improved ensemble in turn yields higher-quality data for distillation to the server model $\theta_S$. Key contributions include a hard-sample generation loss $\mathcal{L}_{H}$, an adversarial loss $\mathcal{L}_{A}$, an on-the-fly diversification step for hard samples, a learnable ensemble weighting scheme $\bm{w}$ updated via a gradient-sign method, and a joint min-max objective $\min_{\bm{w}} \max_{\delta \in S} \ell_{CE}(\sum_{k} w_k f_k(x_s+\delta;\bm{\theta}_k), y_s)$. The method is architecture-agnostic and requires no changes to local training and no additional data transmission, making it practical for modern model-market scenarios. Experiments across five datasets and varied non-IID settings demonstrate substantial server accuracy gains over strong baselines, validating the efficacy and robustness of the mutual boosting paradigm.
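The outer minimization over the ensemble weights $\bm{w}$ can be illustrated with a minimal sketch. It is not the authors' implementation: the client logits are hard-coded toy values standing in for $f_k(x_s+\delta;\bm{\theta}_k)$ on a single hard sample, the inner maximization over $\delta$ is omitted, and the step size `mu`, the clipping, and the renormalization of the weights are assumptions made for illustration.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def ensemble_logits(w, client_logits):
    """Weighted sum of per-client logit vectors: sum_k w_k * f_k(x)."""
    n_classes = len(client_logits[0])
    return [sum(w[k] * client_logits[k][c] for k in range(len(w)))
            for c in range(n_classes)]

def cross_entropy(logits, y):
    return -math.log(softmax(logits)[y])

def weight_grad(w, client_logits, y):
    """Analytic gradient: dCE/dw_k = sum_c (p_c - 1[c == y]) * f_k(x)_c."""
    p = softmax(ensemble_logits(w, client_logits))
    return [sum((p[c] - (1.0 if c == y else 0.0)) * lk[c] for c in range(len(p)))
            for lk in client_logits]

def sign(v):
    return (v > 0) - (v < 0)

def sign_step(w, grad, mu):
    """One gradient-sign descent step on w.
    Clipping at zero and renormalizing to sum 1 are assumptions of this sketch."""
    stepped = [max(wk - mu * sign(g), 0.0) for wk, g in zip(w, grad)]
    total = sum(stepped)
    return [v / total for v in stepped]

# Toy hard sample with label 0: client 0 is right, client 1 is wrong.
client_logits = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
y = 0
w = [0.5, 0.5]  # start from the averaged ensemble

loss_before = cross_entropy(ensemble_logits(w, client_logits), y)
w_new = sign_step(w, weight_grad(w, client_logits, y), mu=0.1)
loss_after = cross_entropy(ensemble_logits(w_new, client_logits), y)
```

In this toy case the step shifts weight toward the client that classifies the hard sample correctly, which lowers the ensemble's cross-entropy on that sample. Using only the sign of the gradient makes the step size independent of the gradient's magnitude.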

Abstract

One-shot Federated Learning (OFL) has become a promising learning paradigm, enabling the training of a global server model via a single communication round. In OFL, the server model is aggregated by distilling knowledge from all client models (the ensemble), which are also responsible for synthesizing samples for distillation. In this regard, advanced works show that the performance of the server model is intrinsically related to the quality of the synthesized data and the ensemble model. To promote OFL, we introduce a novel framework, Co-Boosting, in which synthesized data and the ensemble model mutually enhance each other progressively. Specifically, Co-Boosting leverages the current ensemble model to synthesize higher-quality samples in an adversarial manner. These hard samples are then employed to promote the quality of the ensemble model by adjusting the ensembling weights for each client model. Consequently, Co-Boosting periodically achieves high-quality data and ensemble models. Extensive experiments demonstrate that Co-Boosting can substantially outperform existing baselines under various settings. Moreover, Co-Boosting eliminates the need for adjustments to the client's local training, requires no additional data or model transmission, and allows client models to have heterogeneous architectures.

Paper Structure (28 sections, 13 equations, 5 figures, 19 tables, 1 algorithm)

Introduction
Related Works
One-shot Federated learning
Knowledge Distillation
Methodology
One-Shot Federated Learning
Boosting the data quality
Boosting the ensemble quality
Co-boosting the data and the ensemble
Experiments
Experimental Details
General Results
In-depth Study
Conclusion
More discussions about related works
...and 13 more sections

Figures (5)

Figure 1: Co-Boosting Framework and Experimental Comparison. (a) illustrates the core concept of our approach. In each epoch, high-quality samples are first generated from the previous epoch's ensemble and server model; these samples are then used to adjust the client weights, yielding a better ensemble. The server model is then updated by distilling knowledge from the enriched data and the refined ensemble. (b) shows the test accuracy of the ensemble with averaged weights, with weights learned on real data, and with weights learned on hard samples. (c) shows the test accuracy of the server model obtained through distillation on real, synthetic, and hard samples with the same ensemble. (d) presents an overall comparison. DENSE (Zhang et al., 2022) is the current state of the art, and FedENS denotes the averaged ensemble. The experiments in (b), (c), and (d) are all conducted on CIFAR-10 with 10 clients under a $Dir(0.1)$ partition.
Figure 2: Test accuracy of server
Figure 3: Comparison with multi-round federated learning methods.
Figure 4: Visualization of synthetic data on MNIST dataset.
Figure 5: Visualization of synthetic data on CIFAR-10 dataset.
