Overcoming Catastrophic Forgetting in Federated Class-Incremental Learning via Federated Global Twin Generator

Thinh Nguyen, Khoa D Doan, Binh T. Nguyen, Danh Le-Phuoc, Kok-Seng Wong

TL;DR

The paper tackles catastrophic forgetting in Federated Class-Incremental Learning under strict privacy constraints by proposing FedGTG, a framework in which the server trains a data generator and a feature generator on the global model and shares them with clients to synthesize information about previous classes. Client-side training combines distillation on synthetic logits, fine-tuning on real and synthetic data, and a novel Empirical Feature Matrix Loss to balance stability and plasticity, enabling continual learning without storing client data. Empirical results on Sequential F-CIFAR-10, Sequential F-CIFAR-100, and Sequential F-tiny-ImageNet show improved Average Incremental Accuracy and reduced forgetting, along with robustness to natural corruptions, flatter minima, better calibration, and resilience to varying client counts. The approach thus offers privacy-preserving, scalable continual learning for decentralized settings, and its empirical performance is also analyzed under robustness and calibration criteria.
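To make the "distillation on synthetic logits" idea concrete, here is a minimal sketch of a generic temperature-scaled knowledge-distillation loss for one sample. This is an assumption-laden illustration, not the paper's formulation: the summary does not give the exact loss, and the names `softmax`, `distillation_loss`, and `temperature`, as well as the reading that the teacher logits come from the previous global model evaluated on a synthetic sample, are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Assumption: teacher_logits come from the previous global model and
    student_logits from the local model, both on the same synthetic sample.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    eps = 1e-12  # guards against log(0) if a probability underflows
    kl = sum(
        pi * (math.log(max(pi, eps)) - math.log(max(qi, eps)))
        for pi, qi in zip(p, q)
    )
    # The T^2 factor is the usual convention that keeps gradient scale
    # comparable across temperatures.
    return kl * temperature ** 2


# Hypothetical logits for a 3-class problem.
same = distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
different = distillation_loss([2.0, 0.5, -1.0], [-1.0, 0.5, 2.0])
```

The loss is zero when the local model reproduces the old model's logits and grows as the two distributions diverge, which is what pushes the local model to retain knowledge of previous classes.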

Abstract

Federated Class-Incremental Learning (FCIL) is becoming increasingly important in the decentralized setting, where it enables multiple participants to collaboratively train a global model that performs well on a sequence of tasks without sharing their private data. In FCIL, conventional Federated Learning algorithms such as FedAVG often suffer from catastrophic forgetting, resulting in significant performance declines on earlier tasks. Recent works based on generative models produce synthetic images to help mitigate this issue across all classes, but their test accuracy on previous classes is still much lower than on recent classes, i.e., they have better plasticity than stability. To overcome these issues, this paper presents Federated Global Twin Generator (FedGTG), an FCIL framework that exploits privacy-preserving generative-model training on the global side without accessing client data. Specifically, the server trains a data generator and a feature generator to create two types of information from all seen classes, and then sends the synthetic data to the client side. The clients then use feature-direction-controlling losses so that the local models retain knowledge and learn new tasks well. We extensively analyze the robustness of FedGTG on natural images, as well as its ability to converge to flat local minima and achieve better-calibrated prediction confidence. Experimental results on CIFAR-10, CIFAR-100, and tiny-ImageNet demonstrate that FedGTG improves accuracy and forgetting measures compared to previous frameworks.
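The accuracy and forgetting measures mentioned above can be illustrated with a small sketch. It uses the definitions common in the continual-learning literature, which may differ in detail from the paper's. The accuracy matrix is hypothetical, and averaging over tasks equals accuracy over all seen classes only when every task has the same number of test samples.

```python
def average_incremental_accuracy(acc):
    """acc[t][i] is the accuracy on task i after training through task t.

    After each stage t, average the accuracy over the tasks seen so far,
    then average those per-stage values over all stages.
    """
    per_stage = []
    for t, row in enumerate(acc):
        seen = row[: t + 1]
        per_stage.append(sum(seen) / len(seen))
    return sum(per_stage) / len(per_stage)


def average_forgetting(acc):
    """Mean drop from the best earlier accuracy to the final accuracy,
    taken over every task except the last one."""
    last = len(acc) - 1
    if last < 1:
        return 0.0  # nothing can be forgotten with a single task
    drops = []
    for i in range(last):
        best = max(acc[t][i] for t in range(i, last))
        drops.append(best - acc[last][i])
    return sum(drops) / len(drops)


# Hypothetical accuracies for three tasks (not results from the paper).
acc = [
    [0.90, 0.00, 0.00],
    [0.70, 0.80, 0.00],
    [0.50, 0.60, 0.85],
]
aia = average_incremental_accuracy(acc)  # (0.90 + 0.75 + 0.65) / 3
forgetting = average_forgetting(acc)     # (0.40 + 0.20) / 2
```

A method with good stability keeps the forgetting value low, while good plasticity shows up as high accuracy on the most recent task. The trade-off between the two is what the abstract refers to.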

Paper Structure

This paper contains 42 sections, 13 equations, 9 figures, and 3 tables.

Figures (9)

  • Figure 1: Illustration of the real-world scenarios in the FCIL setting.
  • Figure 2: Confusion matrices of FCIL algorithms: (a) TARGET, (b) MFCL, (c) two generators applied to FL without further changes, and (d) FedGTG, tested on CIFAR-10 after training is completed. While TARGET and MFCL predict the initial classes poorly and the two-generator variant struggles to learn new tasks, FedGTG achieves a better stability-plasticity trade-off.
  • Figure 3: Illustration of the proposed framework. After completing one task, the server employs a data-free approach to train two generators. The clients then use two types of synthetic information from these generators to train their local models for retaining knowledge and learning new tasks well.
  • Figure 4: Average Accuracy per task of various algorithms on popular benchmarks.
  • Figure 5: Robustness to natural corruptions.
  • ...and 4 more figures