FedConv: A Learning-on-Model Paradigm for Heterogeneous Federated Clients

Leming Shen, Qiang Yang, Kaiyan Cui, Yuanqing Zheng, Xiao-Yong Wei, Jianwei Liu, Jinsong Han

TL;DR

FedConv addresses resource heterogeneity among federated clients with a learning-on-model paradigm: the server compresses the global model with convolutional layers to produce heterogeneous sub-models sized to each client's resources. For aggregation, the server applies transposed convolution (TC) dilation to restore the sub-models to large models of a unified size, then combines the clients' personalized contributions with a learned weighting scheme guided by server-side data. These three components (convolutional compression, TC-based dilation, and weighted aggregation) improve global accuracy and client personalization while reducing memory, computation, and communication overhead. Across six datasets, FedConv outperforms state-of-the-art baselines in model accuracy by more than 35% on average and reduces computation and communication overhead by 33% and 25%, respectively, and it remains robust under diverse heterogeneity scenarios.
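
The weighted aggregation step can be pictured with a minimal sketch. It assumes the client models have already been dilated to a common size and flattened into equal-length parameter lists. The function name `weighted_aggregate`, the toy parameters, and the fixed weights are all hypothetical; in FedConv the weights are learned using server-side data, which this sketch does not attempt to reproduce.

```python
def weighted_aggregate(models, weights):
    """Combine equal-length parameter lists using normalized per-client weights."""
    total = sum(weights)
    normalized = [w / total for w in weights]
    num_params = len(models[0])
    return [
        sum(normalized[c] * models[c][p] for c in range(len(models)))
        for p in range(num_params)
    ]


# Hypothetical dilated client models, each flattened to two parameters.
dilated_models = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# Hypothetical aggregation weights (fixed here, learned on the server in FedConv).
agg_weights = [2.0, 1.0, 1.0]

global_update = weighted_aggregate(dilated_models, agg_weights)
```

Normalizing the weights keeps the aggregate on the same scale as the individual client models, whatever raw weight values are supplied.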

Abstract

Federated Learning (FL) facilitates collaborative training of a shared global model without exposing clients' private data. In practical FL systems, clients (e.g., edge servers, smartphones, and wearables) typically have disparate system resources. Conventional FL, however, adopts a one-size-fits-all solution, where a homogeneous large global model is transmitted to and trained on each client, resulting in an overwhelming workload for less capable clients and starvation for other clients. To address this issue, we propose FedConv, a client-friendly FL framework, which minimizes the computation and memory burden on resource-constrained clients by providing heterogeneous customized sub-models. FedConv features a novel learning-on-model paradigm that learns the parameters of the heterogeneous sub-models via convolutional compression. Unlike traditional compression methods, the compressed models in FedConv can be directly trained on clients without decompression. To aggregate the heterogeneous sub-models, we propose transposed convolutional dilation to convert them back to large models with a unified size while retaining personalized information from clients. The compression and dilation processes, transparent to clients, are optimized on the server leveraging a small public dataset. Extensive experiments on six datasets demonstrate that FedConv outperforms state-of-the-art FL systems in terms of model accuracy (by more than 35% on average), computation and communication overhead (with 33% and 25% reduction, respectively).
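
To make the size arithmetic behind compression and dilation concrete, here is a minimal sketch in plain Python. It assumes the parameters of one layer can be viewed as a 2D matrix, and it uses fixed 2x2 kernels with stride 2 purely for illustration. In FedConv the compression and dilation parameters are optimized on the server with a small public dataset; that training is not shown here. The names `conv2d`, `transposed_conv2d`, `global_layer`, and both kernels are hypothetical.

```python
def conv2d(weights, kernel, stride):
    """Valid (no-padding) 2D convolution of a weight matrix with a square kernel."""
    k = len(kernel)
    out_rows = (len(weights) - k) // stride + 1
    out_cols = (len(weights[0]) - k) // stride + 1
    out = [[0.0] * out_cols for _ in range(out_rows)]
    for i in range(out_rows):
        for j in range(out_cols):
            out[i][j] = sum(
                weights[i * stride + a][j * stride + b] * kernel[a][b]
                for a in range(k)
                for b in range(k)
            )
    return out


def transposed_conv2d(small, kernel, stride):
    """Transposed 2D convolution: scatter each entry of `small` through the kernel."""
    k = len(kernel)
    out_rows = (len(small) - 1) * stride + k
    out_cols = (len(small[0]) - 1) * stride + k
    out = [[0.0] * out_cols for _ in range(out_rows)]
    for i, row in enumerate(small):
        for j, value in enumerate(row):
            for a in range(k):
                for b in range(k):
                    out[i * stride + a][j * stride + b] += value * kernel[a][b]
    return out


# Hypothetical 8x8 weight matrix standing in for one layer of the global model.
global_layer = [[float(r * 8 + c) for c in range(8)] for r in range(8)]

# Fixed kernels for illustration only (assumption: FedConv learns these instead).
compress_kernel = [[0.25, 0.25], [0.25, 0.25]]
dilate_kernel = [[1.0, 1.0], [1.0, 1.0]]

# Compression: (8 - 2) // 2 + 1 = 4, so the sub-model layer is 4x4.
sub_layer = conv2d(global_layer, compress_kernel, stride=2)

# Dilation: (4 - 1) * 2 + 2 = 8, so the layer returns to the unified 8x8 size.
restored = transposed_conv2d(sub_layer, dilate_kernel, stride=2)
```

The 4x4 `sub_layer` is itself an ordinary weight matrix, which mirrors the abstract's point that the compressed model can be trained directly on a client without decompression. With these fixed kernels `restored` only approximates `global_layer`, because each 2x2 block is replaced by its average.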

Paper Structure

This paper contains 35 sections, 5 equations, 16 figures, 3 tables, and 1 algorithm.

Figures (16)

  • Figure 1: Heterogeneous models in federated learning.
  • Figure 2: The parameter sharing and pruning scheme with limitations (the pruned part is colored blue in (c)).
  • Figure 3: Convolutional compression process.
  • Figure 4: Framework architecture of FedConv.
  • Figure 5: An example of the convolution/TC process (black arrow: forward, blue arrow: backward, blue box: larger parameters, grey box: smaller parameters, orange: Conv/TC).
  • ...and 11 more figures