Data-Free Federated Class Incremental Learning with Diffusion-Based Generative Memory

Naibo Wang, Yuchen Deng, Wenjie Feng, Jianwei Yin, See-Kiong Ng

TL;DR

The paper tackles federated class incremental learning (FCIL) under strict privacy constraints with a data-free approach that uses diffusion-based generative memory to mitigate catastrophic forgetting. It introduces a balanced sampler for training the diffusion model on non-IID client data and an entropy-based filter that screens the generated replay samples on-device, and it combines knowledge distillation with a feature-distance regularizer to stabilize knowledge transfer across tasks. Experiments on EMNIST-Letters, CIFAR-100, and Tiny-ImageNet show consistent gains in average accuracy, including over 4% on Tiny-ImageNet, and reduced forgetting, with no communication cost beyond that of FedAvg. Because no raw data is shared or stored, the approach suits non-IID, privacy-sensitive FL deployments.

Abstract

Federated Class Incremental Learning (FCIL) is a critical yet largely underexplored issue that deals with the dynamic incorporation of new classes within federated learning (FL). Existing methods often employ generative adversarial networks (GANs) to produce synthetic images to address privacy concerns in FL. However, GANs exhibit inherent instability and high sensitivity, compromising the effectiveness of these methods. In this paper, we introduce a novel data-free federated class incremental learning framework with diffusion-based generative memory (DFedDGM) to mitigate catastrophic forgetting by generating stable, high-quality images through diffusion models. We design a new balanced sampler to help train the diffusion models to alleviate the common non-IID problem in FL, and introduce an entropy-based sample filtering technique from an information theory perspective to enhance the quality of generative samples. Finally, we integrate knowledge distillation with a feature-based regularization term for better knowledge transfer. Our framework does not incur additional communication costs compared to the baseline FedAvg method. Extensive experiments across multiple datasets demonstrate that our method significantly outperforms existing baselines, e.g., over a 4% improvement in average accuracy on the Tiny-ImageNet dataset.
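
The entropy-based filtering idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact criterion, so the rule used here is an assumption: a synthetic sample is kept only if the Shannon entropy of the previous global model's prediction for it is at most a threshold. The names `entropy_filter` and `max_entropy`, and the example logits, are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def shannon_entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def entropy_filter(batch_logits, max_entropy):
    """Keep samples whose predictive entropy is at most max_entropy.

    Assumption (not from the paper): low entropy means the labelling
    model is confident, so the sample is treated as higher quality.
    Returns (index, pseudo_label, entropy) for each kept sample.
    """
    kept = []
    for idx, logits in enumerate(batch_logits):
        probs = softmax(logits)
        h = shannon_entropy(probs)
        if h <= max_entropy:
            pseudo_label = max(range(len(probs)), key=probs.__getitem__)
            kept.append((idx, pseudo_label, h))
    return kept

# Hypothetical logits from the previous global model for three synthetic images.
batch_logits = [
    [8.0, 0.0, 0.0],   # confident: class 0
    [0.1, 0.0, 0.1],   # near-uniform: ambiguous
    [0.0, 5.0, 0.0],   # confident: class 1
]
kept = entropy_filter(batch_logits, max_entropy=0.5)
print([(i, y) for i, y, _ in kept])  # [(0, 0), (2, 1)]
```

The near-uniform sample has entropy close to ln 3 (about 1.10 nats) and is discarded, while the two confident samples pass. In practice the threshold would need tuning per dataset.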

Paper Structure

This paper contains 23 sections, 12 equations, 4 figures, 3 tables, 1 algorithm.

Figures (4)

  • Figure 1: Overview of our framework. A diffusion model is trained with the help of the Balanced Sampler (II). We utilize the diffusion model to generate synthetic images, and assign labels to these images by the global model $\theta_g^{t-1}$ from the previous task. An Entropy Filter subsequently screens these samples. We combine three losses ($\mathcal{L}_{CE}^t, \mathcal{L}_{KD}^t$ and $\mathcal{L}_{FD}^t$) to train the local model $\theta_i^t$ (III).
  • Figure 2: Test average accuracy vs. the number of observed tasks for (a) CIFAR-100 on $T=5$ tasks, (b) Tiny-ImageNet on $T=5$ tasks, (c) Tiny-ImageNet on $T=10$ tasks.
  • Figure 3: (a) Average accuracy under different data distribution settings on the CIFAR-100 dataset, (b) Ablation studies on the CIFAR-100 dataset.
  • Figure 4: Real vs synthetic data generated by the diffusion model for the Tiny-ImageNet dataset.
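
The Figure 1 caption says the local model is trained with three losses, $\mathcal{L}_{CE}^t$, $\mathcal{L}_{KD}^t$ and $\mathcal{L}_{FD}^t$. The following is a rough single-sample sketch of how such a combination might look. Everything beyond the existence of the three terms is an assumption: the weighted sum, the weights `lambda_kd` and `lambda_fd`, the temperature, the use of KL divergence for distillation, and the squared Euclidean distance for the feature term are illustrative choices, not the paper's stated definitions.

```python
import math

def log_softmax(logits, temperature=1.0):
    """Log-softmax of temperature-scaled logits, computed stably."""
    scaled = [v / temperature for v in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(v - m) for v in scaled))
    return [v - log_z for v in scaled]

def cross_entropy(logits, label):
    """Cross-entropy of one sample against an integer class label."""
    return -log_softmax(logits)[label]

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions (assumed form)."""
    t = log_softmax(teacher_logits, temperature)
    s = log_softmax(student_logits, temperature)
    return sum(math.exp(tl) * (tl - sl) for tl, sl in zip(t, s))

def feature_distance(student_feat, teacher_feat):
    """Squared Euclidean distance between feature vectors (assumed form)."""
    return sum((a - b) ** 2 for a, b in zip(student_feat, teacher_feat))

def total_loss(student_logits, label, teacher_logits,
               student_feat, teacher_feat, lambda_kd=1.0, lambda_fd=1.0):
    """Weighted sum of the three terms; the weights are hypothetical."""
    return (cross_entropy(student_logits, label)
            + lambda_kd * kd_loss(student_logits, teacher_logits)
            + lambda_fd * feature_distance(student_feat, teacher_feat))

# Hypothetical outputs: "teacher" is the previous global model,
# "student" is the local model currently being trained.
teacher_logits = [2.0, 0.5, -1.0]
student_logits = [1.0, 1.5, -0.5]
teacher_feat = [0.0, -0.1]
student_feat = [0.2, -0.4]

loss = total_loss(student_logits, 0, teacher_logits, student_feat, teacher_feat)
print(round(loss, 4))
```

When the student reproduces the teacher's logits and features exactly, the distillation and feature terms vanish and only the cross-entropy remains, which is the sense in which the two extra terms penalize drift away from previously learned knowledge.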