FedCross: Towards Accurate Federated Learning via Multi-Model Cross-Aggregation

Ming Hu, Peiheng Zhou, Zhihao Yue, Zhiwei Ling, Yihao Huang, Anran Li, Yang Liu, Xiang Lian, Mingsong Chen

TL;DR

FedCross tackles gradient divergence in non-IID federated learning by introducing a multi-model cross-aggregation framework that trains multiple middleware models per round and fuses them via cross-aggregation. It replaces the single global-model paradigm with a fine-grained, multi-to-multi training scheme, including collaborative-model selection, cross-aggregation with a tunable warm-start parameter $\alpha$, and asynchronous global-model deployment. The paper provides a convergence analysis showing sublinear convergence and proposes acceleration methods (propeller models and dynamic $\alpha$) to speed up training. Extensive experiments on CIFAR-10/100, FEMNIST, Shakespeare, and Sent140 demonstrate significant accuracy improvements over FedAvg and other baselines across IID and non-IID settings, with no additional communication overhead, highlighting FedCross’s practical impact for scalable AIoT FL deployments.

Abstract

As a promising distributed machine learning paradigm, Federated Learning (FL) has attracted increasing attention as a way to address data silo problems without compromising user privacy. By adopting the classic one-to-multi training scheme (i.e., FedAvg), where the cloud server dispatches a single global model to multiple involved clients, conventional FL methods can achieve collaborative model training without data sharing. However, since a single global model cannot always accommodate all the incompatible convergence directions of local models, existing FL approaches suffer from inferior classification accuracy. To address this issue, we present an efficient FL framework named FedCross, which uses a novel multi-to-multi FL training scheme based on our proposed multi-model cross-aggregation approach. Unlike traditional FL methods, in each round of FL training, FedCross uses multiple middleware models to conduct weighted fusion individually. Since the middleware models used by FedCross can quickly converge into the same flat valley of the loss landscape, the generated global model generalizes well. Experimental results on various well-known datasets show that, compared with state-of-the-art FL methods, FedCross can significantly improve FL accuracy in both IID and non-IID scenarios without causing additional communication overhead.

Paper Structure

This paper contains 38 sections, 3 theorems, 36 equations, 9 figures, 3 tables, and 1 algorithm.

Key Result

Lemma 3.4

Let $w_{r}^i= \alpha v_{r}^i + (1-\alpha)v_r^{i^\prime}$, $\alpha\in [0,1]$, and $\overline{w}_r = \sum_{i=1}^N w_{r}^i$. We have [inequality missing from this extract], where $w^\star$ denotes the optimal parameters of the global loss function $F(\cdot)$. In other words, $\forall w$, $F^\star\leq F(w)$, where $F^\star$ denotes $F(w^\star)$.
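The fusion rule in the lemma, $w_{r}^i= \alpha v_{r}^i + (1-\alpha)v_r^{i^\prime}$, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `cross_aggregate` and `mean_model`, the flat-list model representation, and the cyclic collaborator rule $i^\prime = (i + \text{shift}) \bmod N$ are all assumptions made for illustration. Averaging the fused models with a $1/N$ factor is likewise an assumption.

```python
from typing import List

Vector = List[float]  # a model flattened into a list of parameters (assumption)


def cross_aggregate(models: List[Vector], alpha: float, shift: int = 1) -> List[Vector]:
    """Fuse each uploaded model v_i with one collaborator v_{i'}.

    Implements w_i = alpha * v_i + (1 - alpha) * v_{i'} for every i.
    The collaborator index i' = (i + shift) mod N is a hypothetical
    selection rule chosen only to keep the sketch short.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    n = len(models)
    fused = []
    for i, v in enumerate(models):
        partner = models[(i + shift) % n]
        fused.append([alpha * a + (1.0 - alpha) * b for a, b in zip(v, partner)])
    return fused


def mean_model(models: List[Vector]) -> Vector:
    """Coordinate-wise average of the models (1/N scaling is an assumption)."""
    n = len(models)
    return [sum(column) / n for column in zip(*models)]


# Three toy "uploaded" models with two parameters each.
uploaded = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
fused = cross_aggregate(uploaded, alpha=0.9)
print(fused)
print(mean_model(fused), mean_model(uploaded))
```

Because the cyclic rule assigns every model as a collaborator exactly once, the fused models have the same coordinate-wise mean as the uploaded ones. Each fused model moves only a fraction $1-\alpha$ of the way toward its collaborator.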

Figures (9)

  • Figure 1: A motivating example of FedAvg and FedCross training.
  • Figure 2: The FedCross Framework.
  • Figure 3: Data distributions of selected clients with different non-IID settings.
  • Figure 4: Comparison between loss landscapes of FedAvg and FedCross.
  • Figure 5: Learning curves of different FL methods on CIFAR-10 dataset.
  • ...and 4 more figures

Theorems & Definitions (3)

  • Lemma 3.4
  • Lemma 3.5
  • Lemma 3.6