Towards Optimal Customized Architecture for Heterogeneous Federated Learning with Contrastive Cloud-Edge Model Decoupling

Xingyan Chen; Tian Du; Mu Wang; Tiancheng Gu; Yu Zhao; Gang Kou; Changqiao Xu; Dapeng Oliver Wu

TL;DR

The paper tackles non-IID data heterogeneity in federated learning by proposing FedCMD, a cloud-edge framework that decouples models into a shared body $\omega$ and a personalized head $\phi$, and dynamically selects the personalized layer $l^*$ using a Wasserstein-based feature distribution transfer metric. It introduces a two-phase approach: (1) personalized layer selection via a contrastive layer selection mechanism and (2) heterogeneous FL with a weighted aggregation guided by layer similarities, maintaining $\phi_i$ locally while updating $\omega$ across clients. The authors demonstrate, through extensive experiments on ten benchmarks and comparisons to nine baselines, that FedCMD achieves superior accuracy and robustness to non-IID data, with favorable scalability and manageable communication overhead. The work advances personalized layer selection by quantifying cross-client data distribution shifts and provides practical algorithms and complexity analyses, offering meaningful improvements for real-world cloud-edge federated systems.
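The body/head split described above can be illustrated with a minimal sketch. It assumes that every client uses the same set of personalized (head) layers, that parameters are stored as flat lists keyed by layer name, and that the aggregation weights are already given; the names `aggregate_body`, `apply_global_body`, and the example layer names are hypothetical and not taken from the FedCMD code base. How FedCMD actually derives its weights from layer similarities is not shown here.

```python
from typing import Dict, List, Set

# Hypothetical representation: layer name -> flattened parameter values.
Params = Dict[str, List[float]]


def aggregate_body(client_params: List[Params],
                   weights: List[float],
                   head_layers: Set[str]) -> Params:
    """Weighted average of the shared body layers; head layers are skipped."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("aggregation weights must sum to a positive value")
    body: Params = {}
    for name in client_params[0]:
        if name in head_layers:
            continue  # the personalized head never leaves the client
        size = len(client_params[0][name])
        body[name] = [
            sum(w * p[name][k] for w, p in zip(weights, client_params)) / total
            for k in range(size)
        ]
    return body


def apply_global_body(local: Params, body: Params) -> Params:
    """Overwrite a client's body layers with the aggregate; keep its own head."""
    merged = dict(local)
    merged.update(body)
    return merged


# Two toy clients; "fc2" plays the role of the personalized head (an assumption).
clients = [
    {"conv1": [1.0, 2.0], "fc2": [10.0]},
    {"conv1": [3.0, 4.0], "fc2": [20.0]},
]
global_body = aggregate_body(clients, weights=[1.0, 3.0], head_layers={"fc2"})
client0 = apply_global_body(clients[0], global_body)
```

In this sketch only `conv1` is averaged, while each client's `fc2` values stay untouched, which mirrors the idea of keeping $\phi_i$ local and updating $\omega$ across clients.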

Abstract

Federated learning, a promising distributed learning paradigm, enables collaborative training of a global model across multiple network edge clients without centralized data collection. However, the heterogeneity of edge data distributions drags each local model towards local minima that can be distant from the global optimum. Such heterogeneity often leads to slow convergence and substantial communication overhead. To address these issues, we propose FedCMD, a novel model-decoupling framework tailored to cloud-edge supported federated learning. It separates a deep neural network into a body that captures shared representations in the cloud and a personalized head that mitigates data heterogeneity. Our motivation comes from an in-depth investigation of how performance changes when different neural network layers are chosen as the personalized head: we found that rigidly assigning the last layer as the personalized head, as current studies do, is not always optimal. Instead, the personalized layer should be selected dynamically to maximize training performance, taking the representation difference between neighboring layers into account. To find the optimal personalized layer, we use the low-dimensional representation of each layer to contrast feature distribution transfer and introduce a Wasserstein-based layer selection method that identifies the best-matching layer for personalization. Additionally, we propose a weighted global aggregation algorithm based on the selected personalized layer for the practical application of FedCMD. Extensive experiments on ten benchmarks demonstrate the efficiency and superior performance of our solution compared with nine state-of-the-art solutions. All code and results are available at https://github.com/elegy112138/FedCMD.
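To make the Wasserstein-based comparison of layer features more concrete, the following is a minimal sketch under strong assumptions: each layer's features have already been reduced to a one-dimensional sample, both samples have the same size, and the layer with the largest distribution shift is picked as the personalization candidate. That selection rule, the function names, and the toy numbers are illustrative assumptions only; the paper's actual metric and criterion may differ.

```python
from typing import Dict, Sequence


def wasserstein_1d(xs: Sequence[float], ys: Sequence[float]) -> float:
    """1-Wasserstein distance between two equal-size empirical 1-D samples.

    For equal-size samples this equals the mean absolute difference
    between the sorted values.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def layer_shift_scores(local_feats: Dict[str, Sequence[float]],
                       reference_feats: Dict[str, Sequence[float]]) -> Dict[str, float]:
    """Per-layer distance between a client's features and a reference sample."""
    return {
        name: wasserstein_1d(local_feats[name], reference_feats[name])
        for name in local_feats
    }


def pick_layer(scores: Dict[str, float]) -> str:
    """Hypothetical rule: choose the layer whose features shift the most."""
    return max(scores, key=scores.get)


# Toy 1-D projections of per-layer features (made-up illustrative values).
local = {"conv1": [3.0, 1.0, 2.0], "fc1": [0.0, 1.0, 2.0]}
reference = {"conv1": [1.0, 2.0, 3.0], "fc1": [1.0, 2.0, 3.0]}
scores = layer_shift_scores(local, reference)
chosen = pick_layer(scores)
```

Here `conv1` has identical sorted samples on both sides, so its distance is zero, while `fc1` is shifted by one unit and would be the layer selected under the assumed rule.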

Paper Structure (41 sections, 8 equations, 8 figures, 13 tables, 3 algorithms)

Introduction
Related Works
Heterogeneous Federated Learning
Model Decoupling
Motivation
Performance comparison with different personalized layer
Multi-layer personalization evaluation
Heterogeneity of clients in personalized layer selection
Preliminaries
System Model
Federated Learning with Model Decoupling
Federated learning workflow
Model decoupling
Wasserstein Distance
Framework with contrastive layer selection mechanism
...and 26 more sections

Figures (8)

Figure 1: Model decoupling based on Centered Kernel Alignment.
Figure 2: Performance comparison of the mixed personalized layer approach and the fc2 personalized layer method across ten datasets.
Figure 3: The framework of FedCMD with dynamic personalized layer selection.
Figure 4: The feature distribution of different layers for LeNet5 (LeCun et al., 1998).
Figure 5: Accuracy comparison of FedCMD, FedCMD without weighted aggregation, and original FedAvg across ten datasets.
...and 3 more figures
