Bridging Generalization Gap of Heterogeneous Federated Clients Using Generative Models
Ziru Niu, Hai Dong, A. K. Qin
TL;DR
This paper tackles the generalization gap in Federated Learning when clients have heterogeneous model architectures and therefore cannot aggregate model parameters. It introduces FedVTC, a data-free framework in which clients exchange feature-distribution statistics and train a Variational Transposed Convolution (VTC) to generate synthetic data for fine-tuning local models, improving generalization to unseen data without parameter aggregation. A Distribution Matching loss regularizes the VTC to produce high-quality samples, and an alternating training scheme keeps memory usage low. Experiments on MNIST, CIFAR, and Tiny-ImageNet show that FedVTC generalizes better than existing model-heterogeneous FL methods while reducing communication and memory overhead and preserving privacy through non-reversible prototypes and synthetic data generation.
Abstract
Federated Learning (FL) is a privacy-preserving machine learning framework that facilitates collaborative training across distributed clients. However, its performance is often compromised by data heterogeneity among participants, which can leave local models with limited generalization capability. Traditional model-homogeneous approaches address this issue primarily by regularizing local training or by dynamically adjusting client weights during aggregation. These methods, however, do not apply when clients have heterogeneous model architectures. In this paper, we propose a model-heterogeneous FL framework that enhances clients' generalization performance on unseen data without relying on parameter aggregation. Instead of model parameters, clients share feature distribution statistics (mean and covariance) with the server. Each client then trains a variational transposed convolutional neural network using Gaussian latent variables sampled from these distributions, and uses it to generate synthetic data. By fine-tuning their local models on the synthetic data, clients significantly improve their generalization ability. Experimental results demonstrate that our approach not only attains higher generalization accuracy than existing model-heterogeneous FL frameworks, but also reduces communication costs and memory consumption.
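As a rough illustration of the statistics-sharing step described above, the following minimal sketch computes the mean and covariance of a feature matrix and then draws Gaussian latent variables from the resulting distribution. It is not the authors' implementation: the function names, the feature dimension, the sample counts, and the use of random stand-in features are all assumptions made for illustration, and the abstract does not specify how the server combines statistics from different clients.

```python
import numpy as np


def feature_statistics(features):
    """Return the mean (d,) and covariance (d, d) of an (n, d) feature matrix.

    Hypothetical helper: stands in for the statistics a client would share.
    """
    mean = features.mean(axis=0)
    # rowvar=False treats each column as one feature dimension.
    cov = np.cov(features, rowvar=False)
    return mean, cov


def sample_latents(mean, cov, num_samples, rng):
    """Draw Gaussian latent variables from N(mean, cov).

    Hypothetical helper: the latents would be the generator's inputs.
    """
    return rng.multivariate_normal(mean, cov, size=num_samples)


rng = np.random.default_rng(0)

# Assumption: random values stand in for features extracted by a local model.
features = rng.normal(size=(200, 8))

mean, cov = feature_statistics(features)
latents = sample_latents(mean, cov, num_samples=16, rng=rng)
```

Only the mean vector and covariance matrix leave the client in this sketch, so the communicated payload depends on the feature dimension rather than on the size of the local model.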
