One-Shot Heterogeneous Federated Learning with Local Model-Guided Diffusion Models

Mingzhao Yang, Shangchao Su, Bin Li, Xiangyang Xue

TL;DR

The paper tackles the practicality gap in one-shot federated learning with diffusion models by removing the need for foundation models on clients. It introduces FedLMG, which uses locally trained client models to guide a server diffusion model via classification and BN-based losses, generating synthetic data that align with client distributions. The approach enables three aggregation strategies (Fine-tuning, Multi-teacher Distillation, Specific-teacher Distillation) and provides theoretical KL-divergence bounds to connect server-generated data with client distributions. Empirical results on OpenImage, DomainNet, and NICO++ show FedLMG often surpasses traditional FL baselines and, in some cases, even centralized training ceilings, underscoring the potential of diffusion models in practical OSFL for heterogeneous clients.

Abstract

In recent years, One-shot Federated Learning (OSFL) methods based on Diffusion Models (DMs) have garnered increasing attention due to their remarkable performance. However, most of these methods require the deployment of foundation models on client devices, which significantly raises the computational requirements and reduces their adaptability to heterogeneous client models compared to traditional FL methods. In this paper, we propose FedLMG, a heterogeneous one-shot Federated Learning method with Local Model-Guided diffusion models. In FedLMG, clients do not need access to any foundation models; they only train and upload their local models, which is consistent with traditional FL methods. On the clients, we employ a classification loss and a batch normalization (BN) loss to capture the broad category features and the detailed contextual features of the client distributions. On the server, based on the uploaded client models, we use backpropagation to guide the server's DM in generating synthetic datasets that comply with the client distributions, which are then used to train the aggregated model. By using the locally trained client models as a medium to transfer client knowledge, our method significantly reduces the computational requirements on client devices and effectively adapts to scenarios with heterogeneous clients. Extensive quantitative and visualization experiments on three large-scale real-world datasets, along with theoretical analysis, demonstrate that the synthetic datasets generated by FedLMG exhibit quality and diversity comparable to the client datasets, which leads to an aggregated model that outperforms all compared methods and even the performance ceiling, further demonstrating the significant potential of DMs in FL.
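
The guidance signal described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the exact form of either loss, so the code assumes a standard cross-entropy for the classification term and a squared distance between batch feature statistics and a client model's stored BN running statistics for the BN term. All function names and the `bn_weight` parameter are hypothetical. In an actual pipeline the gradient of this scalar would be backpropagated through the client model to steer the server's DM; here only the scalar is computed.

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy of one sample, using a numerically stable log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[target]

def bn_statistics_loss(features, running_mean, running_var):
    """Assumed BN term: squared gap between the batch mean/variance of the
    features and the running mean/variance stored in a client BN layer."""
    n = len(features)
    loss = 0.0
    for d in range(len(running_mean)):
        column = [f[d] for f in features]
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / n  # biased variance
        loss += (mean - running_mean[d]) ** 2 + (var - running_var[d]) ** 2
    return loss

def guidance_loss(logits_batch, targets, features,
                  running_mean, running_var, bn_weight=1.0):
    """Mean classification loss plus a weighted BN statistics loss.
    bn_weight is a hypothetical balancing hyperparameter."""
    cls = sum(cross_entropy(l, t) for l, t in zip(logits_batch, targets))
    cls /= len(targets)
    bn = bn_statistics_loss(features, running_mean, running_var)
    return cls + bn_weight * bn

# Toy batch: two samples, two classes, two feature channels.
toy_logits = [[0.0, 0.0], [0.0, 0.0]]
toy_targets = [0, 1]
toy_features = [[1.0, 2.0], [3.0, 4.0]]
toy_loss = guidance_loss(toy_logits, toy_targets, toy_features,
                         running_mean=[2.0, 3.0], running_var=[1.0, 1.0])
```

In this toy batch the feature statistics match the stored BN statistics exactly, so the BN term vanishes and only the classification term remains. A generated batch whose features drift from the client statistics would be penalized by the second term.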

Paper Structure (15 sections, 13 equations, 4 figures, 4 tables)


Figures (4)

  • Figure 1: Overview of FedLMG. Our method consists of three steps: Local Training, Image Generation, and Model Aggregation. First, each client independently trains its model on its private data and uploads it to the server. Guided by these client models, our method leverages the powerful DM to obtain a synthetic dataset that complies with the different client distributions. Based on the synthetic dataset, three strategies are provided to obtain the aggregated model.
  • Figure 2: Visualization of generated samples on different datasets.
  • Figure 3: Visualization of the effects of different loss functions.
  • Figure 4: Visualization of categories related to privacy-sensitive information.