FedTSA: A Cluster-based Two-Stage Aggregation Method for Model-heterogeneous Federated Learning

Boyu Fan, Chenrui Wu, Xiang Su, Pan Hui

TL;DR

The paper tackles system heterogeneity in federated learning by allowing heterogeneous architectures across clients and avoiding a single global model. It introduces FedTSA, which clusters clients by resource using kernel density estimation, applies Stage 1 in-cluster weight averaging with pruning, and performs Stage 2 server-side deep mutual learning guided by data generated via diffusion with a prompt pool; the global knowledge is computed as $z_{avg}=\frac{1}{M}\sum_{r=1}^M z_r$, and updates rely on a KL divergence loss against the averaged logits on synthetic data $\mathcal{X}$. Key contributions include a KDE-based resource-aware clustering with pruning rates, a diffusion-generated data pipeline for cross-cluster knowledge transfer, and comprehensive experiments showing FedTSA outperforms baselines on CIFAR-10/100 and Tiny-ImageNet under both IID and non-IID distributions. The work demonstrates practical impact for deploying FL in heterogeneous hardware environments and provides insights on hyperparameters such as prompts, temperature $T$, and synthetic data quantity, while highlighting trade-offs related to diffusion overhead and task focus.
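The Stage 2 quantities named above can be illustrated with a minimal sketch, not the authors' implementation. It averages the logits of M cluster models on one synthetic sample to obtain $z_{avg}$, then scores a single cluster model against that average with a temperature-scaled KL divergence. The function names, the direction of the KL term (averaged distribution as the target), and the toy logits are assumptions made for illustration only.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [v / temperature for v in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def average_logits(cluster_logits):
    """z_avg = (1/M) * sum_r z_r, computed class by class for one sample."""
    m = len(cluster_logits)
    return [sum(column) / m for column in zip(*cluster_logits)]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def dml_loss(own_logits, cluster_logits, temperature=1.0):
    """KL between the averaged-logit distribution and one model's prediction.

    Assumption: the averaged distribution is treated as the target.
    """
    z_avg = average_logits(cluster_logits)
    target = softmax(z_avg, temperature)
    prediction = softmax(own_logits, temperature)
    return kl_divergence(target, prediction)


# Hypothetical logits from M = 2 cluster models on one synthetic image.
logits_per_cluster = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
z_avg = average_logits(logits_per_cluster)
loss_for_first_model = dml_loss(logits_per_cluster[0], logits_per_cluster, temperature=2.0)
```

In this sketch a model whose logits already match $z_{avg}$ incurs zero loss, and a larger temperature $T$ softens both distributions before they are compared.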

Abstract

Despite extensive research into data heterogeneity in federated learning (FL), system heterogeneity remains a significant yet often overlooked challenge. Traditional FL approaches typically assume homogeneous hardware resources across FL clients, implying that clients can train a global model within a comparable time frame. However, in practical FL systems, clients often have heterogeneous resources, which impacts their training capacity. This discrepancy underscores the importance of exploring model-heterogeneous FL, a paradigm allowing clients to train different models based on their resource capabilities. To address this challenge, we introduce FedTSA, a cluster-based two-stage aggregation method tailored for system heterogeneity in FL. FedTSA begins by clustering clients based on their capabilities, then performs a two-stage aggregation: conventional weight averaging for homogeneous models in Stage 1, and deep mutual learning with a diffusion model for aggregating heterogeneous models in Stage 2. Extensive experiments demonstrate that FedTSA not only outperforms the baselines but also explores various factors influencing model performance, validating FedTSA as a promising approach for model-heterogeneous FL.
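As a rough illustration of Stage 1, the following sketch averages the parameters of clients that belong to the same cluster and therefore share one architecture. It is an assumption-laden toy: parameters are plain lists of floats keyed by hypothetical layer names, every client is weighted equally, and the pruning step mentioned in the TL;DR is omitted.

```python
def average_weights(client_weights):
    """Element-wise mean of identically shaped parameter dicts from one cluster."""
    n = len(client_weights)
    names = client_weights[0].keys()
    return {
        name: [
            sum(values) / n
            for values in zip(*(weights[name] for weights in client_weights))
        ]
        for name in names
    }


# Two hypothetical clients in the same cluster (same layer names and shapes).
client_a = {"w": [1.0, 3.0], "b": [0.0]}
client_b = {"w": [3.0, 5.0], "b": [2.0]}
cluster_model = average_weights([client_a, client_b])
```

Averaging of this kind is only well defined inside a cluster, because models in different clusters have different dimensions; that is why the cross-cluster step relies on exchanging predictions rather than weights.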

Paper Structure

This paper contains 17 sections, 9 equations, 6 figures, 6 tables, 1 algorithm.

Figures (6)

  • Figure 1: Left: FL in homogeneous settings. All clients share the same global model and the server performs weight averaging to aggregate them. Right: FL in heterogeneous settings. Due to the resource constraints, some clients can only train simple models, leading to heterogeneous models. As these models have different dimensions, how to aggregate them becomes the core challenge.
  • Figure 1: Comparison of image generation quality: CGAN vs. Diffusion Model.
  • Figure 2: FedTSA framework with two-stage aggregation.
  • Figure 3: Test accuracy (%) with different global epochs in the DML process.
  • Figure 4: Test accuracy (%) on CIFAR-10 among different loss functions in the DML process.
  • ...and 1 more figures