FedGMR: Federated Learning with Gradual Model Restoration under Asynchrony and Model Heterogeneity

Chengjie Ma; Seungeun Oh; Jihong Park; Seong-Lyun Kim

TL;DR

FedGMR tackles the late-stage capacity bottleneck in model-heterogeneous federated learning by gradually restoring sub-model density for bandwidth-constrained clients. It combines a two-stage density strategy with a mask-aware, buffered aggregation rule that keeps updates stable under asynchrony and evolving model structures, and it comes with convergence guarantees. Experiments on FEMNIST, CIFAR-10, and ImageNet-100 show faster convergence and higher accuracy than the baselines, especially under high heterogeneity and non-IID data. The paper also gives a theoretical analysis of convergence under mask-aware aggregation, and its ablations indicate that both dynamic restoration and the aggregation rule are central to the reported gains. Overall, FedGMR offers a practical, theoretically grounded framework for using heterogeneous client capabilities in large-scale FL without sacrificing convergence or accuracy.

Abstract

Federated learning (FL) holds strong potential for distributed machine learning, but in heterogeneous environments, Bandwidth-Constrained Clients (BCCs) often struggle to participate effectively because of their limited communication capacity. Their small sub-models learn quickly at first but become under-parameterized in later stages, which slows convergence and degrades generalization. We propose FedGMR (Federated Learning with Gradual Model Restoration under Asynchrony and Model Heterogeneity), which progressively increases each client's sub-model density during training so that BCCs remain effective contributors throughout the process. In addition, we develop a mask-aware aggregation rule tailored for asynchronous model-heterogeneous FL (MHFL) and provide convergence guarantees showing that the aggregated error scales with the average sub-model density across clients and rounds, while GMR provably shrinks this gap toward full-model FL. Extensive experiments on FEMNIST, CIFAR-10, and ImageNet-100 demonstrate that FedGMR achieves faster convergence and higher accuracy, especially under high heterogeneity and non-IID settings.
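To make the two ideas in the abstract concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a linear density ramp (the paper describes a two-stage strategy whose details are not given here) and a simple per-coordinate average over the clients whose mask covers each parameter. Buffering and asynchrony are left out. The names `density_schedule` and `mask_aware_average` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def density_schedule(round_idx: int, total_rounds: int,
                     start: float, end: float = 1.0) -> float:
    """Hypothetical linear ramp of sub-model density from `start` to `end`.

    Assumption: a linear ramp stands in for the paper's two-stage strategy.
    """
    if total_rounds <= 1:
        return end
    # Clamp progress to [0, 1] so out-of-range rounds stay within bounds.
    frac = min(max(round_idx / (total_rounds - 1), 0.0), 1.0)
    return start + (end - start) * frac


def mask_aware_average(global_params: Sequence[float],
                       client_params: Sequence[Sequence[float]],
                       masks: Sequence[Sequence[int]]) -> List[float]:
    """Average each coordinate over the clients whose mask covers it.

    Coordinates that no client covers keep their current global value,
    so a missing parameter is never treated as a zero-valued update.
    """
    new_params = []
    for j, current in enumerate(global_params):
        covered = [p[j] for p, m in zip(client_params, masks) if m[j]]
        new_params.append(sum(covered) / len(covered) if covered else current)
    return new_params


# Two clients with different sub-model masks over a 4-parameter model.
global_params = [0.0, 0.0, 0.0, 5.0]
client_params = [[1.0, 2.0, 0.0, 0.0], [3.0, 0.0, 4.0, 0.0]]
masks = [[1, 1, 0, 0], [1, 0, 1, 0]]
aggregated = mask_aware_average(global_params, client_params, masks)
```

In this toy example the first coordinate is averaged over both clients, the second and third each come from the single client that holds them, and the fourth keeps its global value because no client covers it. As the density returned by `density_schedule` approaches 1, more coordinates are covered by every client and the rule reduces to ordinary averaging over full models.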
