GPT-FL: Generative Pre-trained Model-Assisted Federated Learning

Tuo Zhang; Tiantian Feng; Samiul Alam; Dimitrios Dimitriadis; Sunwoo Lee; Mi Zhang; Shrikanth S. Narayanan; Salman Avestimehr

TL;DR

GPT-FL is a decoupled federated learning (FL) framework. It builds prompts from label names, uses them to generate diverse synthetic data with pre-trained generative models, trains a downstream model on that data on the server, and then fine-tunes the model on private client data with standard FL. The approach improves test accuracy, lowers communication cost, and improves client-sampling efficiency across image and audio modalities, while remaining compatible with secure aggregation and requiring no extra FL hyperparameters. Theoretical analysis shows that pre-training on synthetic data biases gradients toward the synthetic distribution, which reduces gradient variance and accelerates convergence; empirical results corroborate the faster training and better generalization. Overall, GPT-FL is a practical enhancement to FL that uses foundation models for synthetic data generation and server-side pre-training, and it applies across diverse data modalities and tasks.

Abstract

In this work, we propose GPT-FL, a generative pre-trained model-assisted federated learning (FL) framework. At its core, GPT-FL leverages generative pre-trained models to generate diversified synthetic data. These generated data are used to train a downstream model on the server, which is then fine-tuned with private client data under the standard FL framework. We show that GPT-FL consistently outperforms state-of-the-art FL methods in terms of model test accuracy, communication efficiency, and client sampling efficiency. Through comprehensive ablation analysis across various data modalities, we discover that the downstream model trained on synthetic data plays a crucial role in controlling the direction of gradient diversity during FL training, which enhances convergence speed and contributes to the notable accuracy boost observed with GPT-FL. Moreover, regardless of whether the target data falls within or outside the domain of the pre-trained generative model, GPT-FL consistently achieves significant performance gains, surpassing the results obtained by models trained solely with FL or solely with synthetic data. The code is available at https://github.com/AvestimehrResearchGroup/GPT-FL.
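
The four stages described in the abstract (prompts from label names, synthetic data generation, server-side training, federated fine-tuning) can be illustrated with a toy sketch. This is not the authors' implementation: the prompt template, the `toy_generate` stand-in for a generative model, the 1-D logistic model, and all function names are assumptions made for illustration, and plain FedAvg is used as one example of a standard FL aggregator.

```python
import math
import random


def make_prompts(label_names, template="a photo of a {}"):
    """Stage 1: one text prompt per label name (the template is an assumption)."""
    return {name: template.format(name) for name in label_names}


def toy_generate(prompt, label_id, n, rng):
    """Stage 2 stand-in: a real system would query a pre-trained generative
    model with `prompt`. Here we ignore the prompt text and sample 1-D
    features around a per-class center."""
    center = -2.0 if label_id == 0 else 2.0
    return [(rng.gauss(center, 1.0), label_id) for _ in range(n)]


def sgd(weights, data, lr=0.1, epochs=5):
    """Logistic-regression SGD on (feature, label) pairs; returns new (w, b)."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            w -= lr * (p - y) * x
            b -= lr * (p - y)
    return (w, b)


def fedavg(client_weights, client_sizes):
    """Average client models, weighted by local dataset size."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def accuracy(weights, data):
    """Fraction of points whose predicted class matches the label."""
    w, b = weights
    return sum((w * x + b > 0) == (y == 1) for x, y in data) / len(data)


def gpt_fl_sketch(label_names, client_datasets, rounds=3, seed=0):
    rng = random.Random(seed)
    prompts = make_prompts(label_names)
    synthetic = []
    for label_id, name in enumerate(label_names):
        synthetic += toy_generate(prompts[name], label_id, 50, rng)
    rng.shuffle(synthetic)
    # Stage 3: train the downstream model on synthetic data at the server.
    global_w = sgd((0.0, 0.0), synthetic)
    # Stage 4: fine-tune on private client data; only model weights leave clients.
    for _ in range(rounds):
        local = [sgd(global_w, d, epochs=1) for d in client_datasets]
        global_w = fedavg(local, [len(d) for d in client_datasets])
    return prompts, global_w
```

In this sketch the server never sees client data: the synthetic set alone initializes the model, and the federated rounds exchange only the pair `(w, b)`, which is why this style of pipeline stays compatible with aggregation-based FL.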

Paper Structure (27 sections, 4 equations, 10 figures, 13 tables, 1 algorithm)

Introduction
Related Work
GPT-FL: Generative Pre-Trained Model-Assisted Federated Learning
Create Prompts based on Label Names
Generate Synthetic Data from Generative Pre-trained Model
Train Downstream Model on Generated Synthetic Data
Finetune Trained Downstream Model on Private Client Data with FL
Connection to Theory
Experiments
Performance Comparison with State-of-the-Art FL Methods
Understanding GPT-FL
Conclusion
Appendix
Algorithm Overview
Integration of Invertible Bloom Lookup Tables (IBLT) to Enhance Label Privacy
...and 12 more sections

Figures (10)

Figure 1: Overview of the proposed GPT-FL Framework.
Figure 2: Communication costs of standard FL methods, public data-based methods and GPT-FL to achieve the target test accuracy.
Figure 3: Communication costs of generated data-based methods and GPT-FL to achieve the target test accuracy.
Figure 4: Test accuracy of GPT-FL for CIFAR-10/100 under different client sampling rates.
Figure 5: Impact of the number of synthetic data samples on the generated downstream model.
...and 5 more figures
