Diffusion-Driven Data Replay: A Novel Approach to Combat Forgetting in Federated Class Continual Learning

Jinglin Liang, Jin Zhong, Hanlin Gu, Zhongqi Lu, Xingxing Tang, Gang Dai, Shuangping Huang, Lixin Fan, Qiang Yang

TL;DR

Federated Class Continual Learning (FCCL) suffers from catastrophic forgetting, and privacy constraints limit the use of experience replay. The authors introduce Diffusion-Driven Data Replay (DDDR), a two-phase framework. In the first phase, Federated Class Inversion (FCI) uses a frozen pre-trained Latent Diffusion Model to learn a compact embedding for each class. In the second phase, Replay-Augmented Training regenerates high-quality replay data from those embeddings and trains the classifier on generated and real data together, using contrastive learning and knowledge distillation. Clients transmit only class embeddings, which are aggregated with FedAvg. DDDR achieves state-of-the-art performance on CIFAR-100 and Tiny-ImageNet in both IID and non-IID settings while reducing data leakage risk and computational overhead. The classifier also generalizes well across the real and generated data domains, suggesting practical viability for privacy-preserving FCCL deployments.
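To make the aggregation step concrete, the following is a minimal sketch of FedAvg applied to per-class embeddings. It is not taken from the authors' code: the function name `fedavg_class_embeddings`, the plain-list embedding type, and the choice to weight each client by its local sample count (the usual FedAvg convention) are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Embedding = List[float]


def fedavg_class_embeddings(
    client_updates: List[Tuple[int, Dict[str, Embedding]]],
) -> Dict[str, Embedding]:
    """Weighted average of per-class embeddings across clients.

    Each update is (num_local_samples, {class_name: embedding}).
    Hypothetical helper: the weighting scheme is an assumption,
    not the paper's confirmed procedure.
    """
    sums: Dict[str, Embedding] = {}
    weights: Dict[str, int] = {}
    for num_samples, embeddings in client_updates:
        if num_samples <= 0:
            continue  # a client without data contributes nothing
        for cls, vec in embeddings.items():
            if cls not in sums:
                sums[cls] = [0.0] * len(vec)
                weights[cls] = 0
            # Accumulate the sample-weighted embedding for this class.
            sums[cls] = [s + num_samples * v for s, v in zip(sums[cls], vec)]
            weights[cls] += num_samples
    # Normalize by the total weight of the clients that reported each class.
    return {cls: [s / weights[cls] for s in sums[cls]] for cls in sums}


if __name__ == "__main__":
    updates = [
        (1, {"tractor": [0.0, 2.0]}),
        (3, {"tractor": [4.0, 2.0]}),
    ]
    print(fedavg_class_embeddings(updates))
```

In this sketch a class is averaged only over the clients that actually report an embedding for it, which matters under non-IID splits where some clients never see some classes.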

Abstract

Federated Class Continual Learning (FCCL) merges the challenges of distributed client learning with the need for seamless adaptation to new classes without forgetting old ones. The key challenge in FCCL is catastrophic forgetting, an issue that has been explored to some extent in Continual Learning (CL). However, due to privacy preservation requirements, some conventional methods, such as experience replay, are not directly applicable to FCCL. Existing FCCL methods mitigate forgetting by generating historical data through federated training of GANs or data-free knowledge distillation. However, these approaches often suffer from unstable training of generators or low-quality generated data, limiting their guidance for the model. To address this challenge, we propose a novel method of data replay based on diffusion models. Instead of training a diffusion model, we employ a pre-trained conditional diffusion model to reverse-engineer each class, searching for the corresponding input conditions for each class within the model's input space, significantly reducing computational resources and time consumption while ensuring effective generation. Furthermore, we enhance the classifier's domain generalization ability on generated and real data through contrastive learning, indirectly improving the representational capability of generated data for real data. Comprehensive experiments demonstrate that our method significantly outperforms existing baselines. Code is available at https://github.com/jinglin-liang/DDDR.

Paper Structure (33 sections, 8 equations, 10 figures, 6 tables)

This paper contains 33 sections, 8 equations, 10 figures, 6 tables.

Figures (10)

  • Figure 1: Overview of the DDDR Framework. (a) Federated Class Inversion Phase, in which a pre-trained diffusion model is utilized to reverse-engineer an embedding for each class. This embedding serves as a condensed representation of all images within the class, efficiently encapsulating the essence of the class in a compact vector. (b) Replay-Augmented Training Phase, in which clients employ the diffusion model along with previously obtained embeddings to regenerate data. Subsequently, clients train classifiers using the generated data and real data from new tasks.
  • Figure 1: Variations in the average accuracy of DDDR across different noise intensities, on the CIFAR-100 dataset with 5 tasks and non-IID data distribution. $\sigma_c$ and $\sigma_g$ denote the standard deviations of Gaussian noise introduced to classifier parameters and class embeddings, respectively. (a) With $\sigma_g$ set to 0, observing the effect of $\sigma_c$ on average accuracy. (b) With $\sigma_c$ set to 0, examining the impact of $\sigma_g$ on average accuracy. (c) Introducing noise to both class embeddings and classifier parameters to assess their collective influence on average accuracy.
  • Figure 2: Illustration of Local Class Inversion, using the tractor class as an example. First, a tractor image is sampled from the client's local dataset and fed into the encoder $\mathcal{E}$, yielding the latent code. In parallel, the word embedding of a frozen prompt is concatenated with a learnable class embedding to form a guiding condition. This condition, together with the noise-added latent code, is fed into the diffusion model to compute the loss. The class embedding is then optimized using this loss.
  • Figure 2: Showcase of DDDR-generated samples under different noise intensities. $\sigma_g$ denotes the standard deviation of noise added to the class embeddings uploaded by clients.
  • Figure 3: Average accuracy of different methods as the number of learned tasks increases, on the CIFAR-100 and Tiny-ImageNet datasets with a non-IID data distribution.
  • ...and 5 more figures
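
The noise-robustness captions above describe adding zero-mean Gaussian noise with standard deviation $\sigma_g$ to the class embeddings that clients upload (and $\sigma_c$ to classifier parameters). A minimal sketch of that perturbation might look as follows. The helper name `add_gaussian_noise`, the plain-list vector type, and the seeded generator are illustrative assumptions, not the authors' implementation.

```python
import random
from typing import List, Optional


def add_gaussian_noise(
    vector: List[float],
    sigma: float,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return a copy of `vector` with i.i.d. N(0, sigma^2) noise added.

    Hypothetical helper: `sigma` plays the role of sigma_g (embeddings)
    or sigma_c (classifier parameters) from the figure captions.
    """
    if sigma < 0:
        raise ValueError("sigma must be non-negative")
    if sigma == 0:
        return list(vector)  # no noise: plain copy
    rng = rng or random.Random()
    return [v + rng.gauss(0.0, sigma) for v in vector]


if __name__ == "__main__":
    embedding = [0.1, -0.3, 0.7]
    # A fixed seed makes the perturbation reproducible for this demo.
    print(add_gaussian_noise(embedding, sigma=0.05, rng=random.Random(0)))
```

Setting `sigma` to 0 for one of the two targets reproduces the single-factor settings in panels (a) and (b), while non-zero values for both correspond to panel (c).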