WarmFed: Federated Learning with Warm-Start for Globalization and Personalization Via Personalized Diffusion Models
Tao Feng, Jie Zhang, Xiangjian Li, Rong Huang, Huashan Liu, Zhijie Wang
TL;DR
This work addresses privacy-preserving Federated Learning by balancing globalization and personalization without sacrificing client data privacy. It introduces WarmFed, which starts from a pre-trained initialization and creates client-specific warm-start diffusion models via LoRA-fine-tuning, transmitting only compact parameter matrices. At the server, synthetic data generated from these models supports globalization through targeted fine-tuning, while Dynamic Self-Distillation selects personalized knowledge to distill into the global model, enhancing personalization. The approach delivers strong performance in one-shot and five-round communications across diverse datasets with low transmission costs and robust privacy, offering a practical pathway to jointly global and personalized FL.
Abstract
Federated Learning (FL) is a prominent distributed learning paradigm in which multiple clients collaborate to train a unified global model without leaking private data. In contrast, personalized federated learning aims to provide each client with a personalized model. Previous FL frameworks have therefore faced a dilemma: either develop a single global model at the server to strengthen globalization, or cultivate personalized models at the clients to accommodate personalization. Instead of making this trade-off, this paper starts from a pre-trained initialization, which supplies robust global information and facilitates the development of both global and personalized models. Specifically, we propose a novel method called WarmFed. WarmFed customizes the Warm-Start through personalized diffusion models, which are obtained by efficient local fine-tuning with LoRA. Building upon the Warm-Start, we introduce a server-side fine-tuning strategy to derive the global model, and propose dynamic self-distillation (DSD) to obtain more robust personalized models at the same time. Comprehensive experiments show substantial gains for both the global and the personalized models, achieved in one-shot and five-round communication settings.
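To make the "compact parameter matrices" idea concrete, the following is a minimal sketch of the standard LoRA update, W = W0 + scale * (B A), and of the per-layer parameter count a client would need to send. It is not the authors' implementation: the helper names (`matmul`, `merge_lora`, `lora_param_count`), the layer sizes, and the rank are all hypothetical choices made for illustration, and the abstract does not state which layers or ranks WarmFed uses.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product, adequate for a toy example."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def merge_lora(w0: Matrix, b: Matrix, a: Matrix, scale: float = 1.0) -> Matrix:
    """Return W0 + scale * (B @ A), the standard LoRA weight update."""
    delta = matmul(b, a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(w0, delta)]


def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Values in B (d_out x rank) plus A (rank x d_in) for one layer."""
    return rank * (d_out + d_in)


# Toy layer (hypothetical sizes): frozen 4x4 identity weight, rank-1 adapter.
w0 = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
b = [[1.0], [0.0], [0.0], [0.0]]  # shape: d_out x rank
a = [[0.0, 2.0, 0.0, 0.0]]        # shape: rank x d_in

w = merge_lora(w0, b, a)

full_params = 4 * 4                        # sending the whole layer
lora_params = lora_param_count(4, 4, 1)    # sending only B and A
```

In this toy case only the low-rank factors `b` and `a` would be transmitted, and the receiver reconstructs the adapted weight with `merge_lora`. The saving grows with layer size: for a hypothetical 768 x 768 layer at rank 4, the factors hold 6,144 values against 589,824 for the full matrix.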
