Bridging Generalization Gap of Heterogeneous Federated Clients Using Generative Models

Ziru Niu, Hai Dong, A. K. Qin

TL;DR

This paper tackles the generalization gap in Federated Learning when clients have heterogeneous model architectures and limited ability to share parameters. It introduces FedVTC, a data-free framework in which clients exchange feature-distribution statistics and train a Variational Transposed Convolution to generate synthetic data for fine-tuning local models, thereby improving unseen-data generalization without parameter aggregation. A Distribution Matching loss regularizes the VTC to produce high-quality samples, and an alternating training scheme keeps memory usage low. Experiments on MNIST, CIFAR, and Tiny-ImageNet show FedVTC outperforms existing model-heterogeneous FL methods in generalization, while reducing communication and memory overhead and preserving privacy through non-reversible prototypes and synthetic data generation.

Abstract

Federated Learning (FL) is a privacy-preserving machine learning framework facilitating collaborative training across distributed clients. However, its performance is often compromised by data heterogeneity among participants, which can result in local models with limited generalization capability. Traditional model-homogeneous approaches address this issue primarily by regularizing local training procedures or dynamically adjusting client weights during aggregation. Nevertheless, these methods become unsuitable in scenarios involving clients with heterogeneous model architectures. In this paper, we propose a model-heterogeneous FL framework that enhances clients' generalization performance on unseen data without relying on parameter aggregation. Instead of model parameters, clients share feature distribution statistics (mean and covariance) with the server. Each client then trains a variational transposed convolutional neural network using Gaussian latent variables sampled from these distributions, and uses it to generate synthetic data. By fine-tuning local models with the synthetic data, clients significantly improve their generalization ability. Experimental results demonstrate that our approach not only attains higher generalization accuracy than existing model-heterogeneous FL frameworks, but also reduces communication costs and memory consumption.
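The statistics-sharing step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `feature_statistics` and `sample_latents`, the feature dimension, the `jitter` term, and the use of a Cholesky factor for sampling are all assumptions made for illustration, and random data stands in for the output of a client's feature extractor.

```python
import numpy as np

def feature_statistics(features):
    """Return the mean vector and covariance matrix of an (n, d) feature array."""
    mean = features.mean(axis=0)
    cov = np.cov(features, rowvar=False)  # columns are feature dimensions
    return mean, cov

def sample_latents(mean, cov, num_samples, rng, jitter=1e-6):
    """Draw Gaussian latents z = mean + L @ eps, where L @ L.T approximates cov."""
    d = mean.shape[0]
    # Small diagonal jitter (an assumption) keeps the factorization stable.
    chol = np.linalg.cholesky(cov + jitter * np.eye(d))
    eps = rng.standard_normal((num_samples, d))
    return mean + eps @ chol.T

rng = np.random.default_rng(0)
# Hypothetical stand-in for features produced by a client's local model.
local_features = rng.standard_normal((256, 8))
mean, cov = feature_statistics(local_features)
latents = sample_latents(mean, cov, num_samples=16, rng=rng)
```

In this sketch only `mean` and `cov` would leave the client, while `latents` would serve as input to the generative network. How the statistics are combined on the server is not specified in the abstract and is left out here.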

Paper Structure

This paper contains 24 sections, 6 equations, 8 figures, 6 tables, 1 algorithm.

Figures (8)

  • Figure 1: A comparison between (a) a standard FL client and (b) a client in FedVTC. In (b), "$\oplus$" and "$\odot$" stand for element-wise addition and element-wise product, respectively.
  • Figure 2: VTC is trained with loss function $\mathcal{L}_{e}=\mathcal{L}_{rc} + D_{KL}$. $\mathcal{L}_{rc}$ (blue) is the reconstruction loss between the original samples and the generated samples, and $D_{KL}$ (red) is the KL-divergence between the distributions of the local and global latent variables.
  • Figure 3: FedVTC (green) has a memory requirement equal to or lower than that of the other methods.
  • Figure 4: A fraction of raw images (top) and synthetic images (bottom) on MNIST and CIFAR10.
  • Figure 5: The sub-model aggregation scheme for sub-model-based model-heterogeneous FL ("WA" stands for weighted average).
  • ...and 3 more figures