Federated Generative Learning with Foundation Models

Jie Zhang, Xiaohua Qi, Bo Zhao

TL;DR

Federated Generative Learning (FGL) reframes federated training: instead of model parameters or gradients, clients upload text embeddings that describe their local data, and a foundation diffusion model on the server synthesizes a substitute training set for centralized model training. This approach reduces communication rounds, mitigates data heterogeneity, and strengthens privacy protection; it is evaluated on 12 datasets, including ImageNet subsets, DomainNet, medical, and satellite data. The paper shows that one-shot training on synthetic data can outperform traditional FedAvg run for hundreds of rounds in many settings, while a limited multi-round variant that fine-tunes the synthetic-data model further improves performance on highly skewed distributions. Overall, FGL offers a practical path to scalable, privacy-preserving FL through prompt-driven data synthesis with foundation models, supported by ablations on prompts, generators, and challenging data domains.

Abstract

Existing approaches in Federated Learning (FL) mainly focus on sending model parameters or gradients from clients to a server. However, these methods are plagued by significant inefficiency, privacy, and security concerns. Thanks to the emerging foundation generative models, we propose a novel federated learning framework, namely Federated Generative Learning. In this framework, each client creates text embeddings tailored to its local data and sends them to the server. Informative training data can then be synthesized remotely on the server by feeding these embeddings to foundation generative models, which benefits FL tasks. Our proposed framework offers several advantages, including increased communication efficiency, robustness to data heterogeneity, substantial performance improvements, and enhanced privacy protection. We validate these benefits through extensive experiments conducted on 12 datasets. For example, on the ImageNet100 dataset with a highly skewed data distribution, our method with a single communication round outperforms FedAvg trained for 200 communication rounds by 12%. We have released the code for all experiments conducted in this study.
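
The data flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' released code: plain prompt strings stand in for the text embeddings, and the names `Prompt`, `client_prompts`, `aggregate`, `synthesize`, `stub_generator`, and the template "A photo of a {}" are all hypothetical choices made for illustration. The generator is a stub that returns a placeholder string where a real system would call a text-to-image foundation model.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Prompt:
    """A class-level text description plus the label it stands for."""
    text: str
    label: str


def client_prompts(local_labels, template="A photo of a {}"):
    """Client side: summarize local data as prompts with counts.

    Only text descriptions leave the client, never raw samples.
    The template is an assumed example, not taken from the paper.
    """
    counts = Counter(local_labels)
    return {Prompt(template.format(lbl), lbl): n for lbl, n in counts.items()}


def aggregate(client_uploads):
    """Server side: merge the prompt counts uploaded by all clients."""
    merged = Counter()
    for upload in client_uploads:
        merged.update(upload)  # adds counts for prompts seen on several clients
    return merged


def synthesize(prompts, generator, per_prompt=2):
    """Server side: build a labelled synthetic set, one generator call per sample."""
    return [(generator(p.text, seed), p.label)
            for p in prompts
            for seed in range(per_prompt)]


def stub_generator(text, seed):
    """Placeholder for a text-to-image model (assumption: returns a string tag)."""
    return f"<image:{text}#{seed}>"


# Two clients with skewed, partly overlapping label distributions.
uploads = [client_prompts(["dog", "dog", "cat"]),
           client_prompts(["cat", "plane"])]
merged = aggregate(uploads)
synthetic = synthesize(sorted(merged, key=lambda p: p.label), stub_generator)
```

In this sketch the server ends up with a synthetic set covering every class that any client holds, which is the intuition behind the claimed robustness to skewed client data; a global model would then be trained on `synthetic` in the usual centralized way.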

Paper Structure

This paper contains 40 sections, 1 equation, 14 figures, 9 tables, 1 algorithm.

Figures (14)

  • Figure 1: For datasets such as subsets of ImageNet or DomainNet, our proposed method can achieve superior accuracy with only a single round of communication. In scenarios involving inherently challenging domains, including medical datasets and satellite imagery, our approach can still attain comparable performance with only five rounds of communication.
  • Figure 2: Training pipeline of FGL. First, the text embeddings from clients are uploaded and aggregated on the server. Then, Stable Diffusion is used to generate synthetic data to train the global model. Finally, the updated model is distributed to all clients. The subfigure on the right details the process of generating text embeddings.
  • Figure 3: Accuracy gap when loading a pretrained model or training from scratch.
  • Figure 4: Accuracy on highly skewed data.
  • Figure 5: Varying synthetic data volume.
  • ...and 9 more figures