Using Diffusion Models as Generative Replay in Continual Federated Learning -- What will Happen?

Yongsheng Mei, Liangqi Yuan, Dong-Jun Han, Kevin S. Chan, Christopher G. Brinton, Tian Lan

TL;DR

The approach harnesses the power of the conditional diffusion model to generate synthetic historical data at each local device during communication, effectively mitigating latent shifts in dynamic data distribution inputs.

Abstract

Federated learning (FL) has become a cornerstone of decentralized learning. In many scenarios, the incoming data distribution changes dynamically over time, which introduces continual learning (CL) problems. This continual federated learning (CFL) task presents unique challenges, particularly catastrophic forgetting and non-IID input data. Existing solutions either store historical data in a replay buffer or rely on generative adversarial networks. Motivated by recent advances in diffusion models for generative tasks, this paper introduces DCFL, a framework tailored to the challenges of CFL in dynamic distributed learning environments. Our approach uses a conditional diffusion model to generate synthetic historical data at each local device during communication, mitigating latent shifts in dynamic data distribution inputs. We provide a convergence bound for the proposed CFL framework and demonstrate promising performance across multiple datasets.

Paper Structure

This paper contains 27 sections, 4 theorems, 16 equations, 7 figures, 5 tables, 2 algorithms.

Key Result

Lemma 1

Let Assumptions assumption:smooth to assumption:sgd_norm hold and $L, \mu, \sigma_k, G$ be defined therein. Choose $\kappa = \frac{L}{\mu}$, $\gamma = \max\{8\kappa, E\}$ and the learning rate $\eta_t = \frac{2}{\mu (\gamma+t)}$. Then the convergence of FedAvg with full device participation to the optimal $F^*$ satisfies [...], where $B = \sum_{k=1}^K p_k^2 \sigma_k^2 + 6L \Gamma + 8 (E-1)^2G^2$ and $\Gamma = F^* - \sum^K_{k} \cdots$ [...]
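
The constants named in Lemma 1 can be evaluated directly from their definitions. The following is a minimal sketch, not code from the paper: the function names and the example values are hypothetical, and because the definition of $\Gamma$ is truncated above, it is taken as an input rather than computed.

```python
def fedavg_constants(L, mu, E, sigmas, weights, G, Gamma):
    """Evaluate kappa, gamma and B as defined in Lemma 1.

    L, mu   : smoothness and strong-convexity constants
    E       : number of local steps between aggregations
    sigmas  : per-client gradient-noise bounds sigma_k
    weights : per-client aggregation weights p_k
    G       : bound on the stochastic gradient norm
    Gamma   : heterogeneity term (passed in; its definition is truncated in the text)
    """
    kappa = L / mu
    gamma = max(8 * kappa, E)
    B = (
        sum(p ** 2 * s ** 2 for p, s in zip(weights, sigmas))
        + 6 * L * Gamma
        + 8 * (E - 1) ** 2 * G ** 2
    )
    return kappa, gamma, B


def learning_rate(t, mu, gamma):
    """Decaying step size eta_t = 2 / (mu * (gamma + t))."""
    return 2.0 / (mu * (gamma + t))


# Hypothetical example values, chosen only to exercise the formulas.
kappa, gamma, B = fedavg_constants(
    L=4.0, mu=2.0, E=5, sigmas=[1.0, 1.0], weights=[0.5, 0.5], G=1.0, Gamma=0.1
)
etas = [learning_rate(t, mu=2.0, gamma=gamma) for t in range(3)]
```

Note that the $8(E-1)^2G^2$ term grows quadratically in the number of local steps $E$, so in this sketch it quickly dominates $B$ unless $G$ is small.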

Figures (7)

  • Figure 1: Three Continual Federated Learning Scenarios. Class Incremental IID: Clients have an identical class distribution, with classes incrementing over time. Class Incremental Non-IID: Clients have a non-identical class distribution, with classes incrementing over time. Domain Incremental: Clients' data domains change over time.
  • Figure 2: Proposed DCFL Framework. Each client has a target model and a diffusion model, both trained on the same dataset, consisting of the previous time period's real and synthetic data. The target model is uploaded to the server for aggregation, while the diffusion model remains local to prevent privacy leakage. The trained diffusion model will generate synthetic data encompassing all previously acquired knowledge.
  • Figure 3: Main Result - Comparison of Model Convergence with Baselines. Refer to Figure 5 for the Domain Incremental scenario.
  • Figure 4: Data Distribution of PACS Dataset. The figure shows the number of samples held by each individual client.
  • Figure 5: Main Result - Domain Incremental Scenario. Tested on the complete test set.
  • ...and 2 more figures
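
The data flow described in the Figure 2 caption can be illustrated with a toy sketch. This is an assumption-laden illustration, not the paper's implementation: `ToyGenerator` is a hypothetical stand-in that memorises a per-class Gaussian instead of training a conditional diffusion model, the target model is a nearest-centroid classifier, and the server step is a plain average. Only the structure follows the caption: each client trains both models on new real data plus synthetic past data, uploads the target model, and keeps the generator local.

```python
import random
from statistics import mean, pstdev


class ToyGenerator:
    """Hypothetical stand-in for a conditional diffusion model.

    It memorises one Gaussian per class over 1-D features, which is
    enough to illustrate label-conditioned replay.
    """

    def __init__(self):
        self.stats = {}  # label -> (mean, std)

    def fit(self, data):
        by_label = {}
        for x, y in data:
            by_label.setdefault(y, []).append(x)
        self.stats = {y: (mean(xs), pstdev(xs)) for y, xs in by_label.items()}

    def sample(self, n_per_class, rng):
        return [(rng.gauss(m, s), y)
                for y, (m, s) in sorted(self.stats.items())
                for _ in range(n_per_class)]


def train_target(data):
    """Toy target model: one centroid per class."""
    by_label = {}
    for x, y in data:
        by_label.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_label.items()}


def client_round(generator, new_real_data, n_replay, rng):
    """One time period on one client."""
    synthetic_past = generator.sample(n_replay, rng)  # empty in the first period
    train_set = new_real_data + synthetic_past
    generator.fit(train_set)        # the generator never leaves the device
    return train_target(train_set)  # only the target model is uploaded


def aggregate(client_models):
    """Server side: per-class average of the uploaded centroids."""
    labels = {y for m in client_models for y in m}
    return {y: mean(m[y] for m in client_models if y in m) for y in sorted(labels)}


def predict(model, x):
    return min(model, key=lambda y: abs(model[y] - x))


rng = random.Random(0)
generators = [ToyGenerator(), ToyGenerator()]  # one per client
# Class-incremental stream: class 0 (near 0.0) arrives first, then class 1 (near 5.0).
for label, centre in [(0, 0.0), (1, 5.0)]:
    uploads = []
    for gen in generators:
        real = [(rng.gauss(centre, 0.5), label) for _ in range(50)]
        uploads.append(client_round(gen, real, n_replay=50, rng=rng))
    global_model = aggregate(uploads)
```

In the second period no client sees real class-0 samples, yet the aggregated model still carries a class-0 centroid because each client replays synthetic class-0 data from its local generator. Without the replay step, the class-0 entry would vanish from every upload, which is the toy analogue of catastrophic forgetting.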

Theorems & Definitions (10)

  • Lemma 1: FedAvg convergence bound (li2019convergence)
  • proof
  • Theorem 1: Data distribution deviation measurement
  • proof
  • Lemma 2: Convergence of data generation via the diffusion model (benton2023linear)
  • proof
  • Theorem 2: Convergence of DCFL
  • proof
  • proof
  • proof