Federated Class-Incremental Learning with New-Class Augmented Self-Distillation

Zhiyuan Wu; Tianliu He; Sheng Sun; Yuwei Wang; Min Liu; Bo Gao; Xuefeng Jiang

TL;DR

The paper tackles catastrophic forgetting in federated class-incremental learning (FCIL) where data volume and class diversity grow over time. It proposes FedCLASS, a method that augments historical old-class logits with current new-class predictions to form a scale-aware self-distillation target, optimized via a joint loss $J^k_{Aug}=J_{CE}^k + \beta J_{KD-Aug}^k$. The authors provide a theoretical framework with assumptions and a theorem showing the augmented distillation aligns with conditional-probability modeling for old and new classes, establishing soundness. Empirically, FedCLASS achieves superior global accuracy and lower forgetting rates across multiple datasets and task settings, outperforming FedAvg and several FCIL baselines. This work enables more robust, privacy-preserving knowledge transfer in FL under evolving class distributions and memory constraints, with potential extensions to larger task horizons and long-term forgetting mitigation.

Abstract

Federated Learning (FL) enables collaborative model training among participants while guaranteeing the privacy of raw data. Mainstream FL methodologies overlook the dynamic nature of real-world data, particularly its tendency to grow in volume and diversify in classes over time. This oversight results in FL methods suffering from catastrophic forgetting, where the trained models inadvertently discard previously learned information upon assimilating new data. In response to this challenge, we propose a novel Federated Class-Incremental Learning (FCIL) method, named Federated Class-Incremental Learning with New-Class Augmented Self-Distillation (FedCLASS). The core of FedCLASS is to enrich the class scores of historical models with new class scores predicted by current models and utilize the combined knowledge for self-distillation, enabling a more sufficient and precise knowledge transfer from historical models to current models. Theoretical analyses demonstrate that FedCLASS stands on reliable foundations, considering scores of old classes predicted by historical models as conditional probabilities in the absence of new classes, and the scores of new classes predicted by current models as the conditional probabilities of class scores derived from historical models. Empirical experiments demonstrate the superiority of FedCLASS over four baseline algorithms in reducing average forgetting rate and boosting global accuracy.
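The new-class augmentation described above can be illustrated with a minimal sketch. It is one possible reading of the abstract, not the authors' implementation: the old-class probabilities of the historical model are treated as conditional on the sample not belonging to a new class, so they are rescaled by the probability mass that the current model leaves after its new-class predictions. The function names (`augmented_target`, `joint_loss`), the use of a KL divergence for the distillation term, and the class ordering (old classes first, then new) are all assumptions made for illustration. The joint loss follows the form $J^k_{Aug}=J_{CE}^k + \beta J_{KD-Aug}^k$ given in the TL;DR.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def augmented_target(old_logits, cur_logits, num_old):
    """Build a distillation target over all classes (old first, then new).

    Assumption: old-class scores from the historical model are read as
    probabilities conditional on "not a new class", so they are scaled
    by the mass the current model does not assign to new classes.
    """
    p_old = softmax(old_logits)   # historical model: old classes only
    p_cur = softmax(cur_logits)   # current model: old + new classes
    new_probs = p_cur[num_old:]
    new_mass = sum(new_probs)
    scaled_old = [(1.0 - new_mass) * p for p in p_old]
    return scaled_old + new_probs

def cross_entropy(probs, label):
    return -math.log(probs[label])

def kl_divergence(target, probs):
    # KL(target || probs); terms with zero target mass contribute nothing.
    return sum(t * math.log(t / q) for t, q in zip(target, probs) if t > 0)

def joint_loss(old_logits, cur_logits, label, num_old, beta):
    # J_Aug = J_CE + beta * J_KD-Aug, with KL standing in for the
    # distillation term (an assumption of this sketch). In a real
    # training loop the target would be held constant (no gradient).
    p_cur = softmax(cur_logits)
    target = augmented_target(old_logits, cur_logits, num_old)
    return cross_entropy(p_cur, label) + beta * kl_divergence(target, p_cur)

if __name__ == "__main__":
    old = [2.0, 0.5]        # hypothetical historical logits, 2 old classes
    cur = [1.0, 0.2, 0.7]   # hypothetical current logits, 2 old + 1 new
    print(augmented_target(old, cur, num_old=2))
    print(joint_loss(old, cur, label=2, num_old=2, beta=1.0))
```

Under this reading the target is a valid distribution over all classes: the new-class entries are copied from the current model, while the relative ordering and ratios among old classes are inherited from the historical model.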

Paper Structure (13 sections, 30 equations, 2 figures, 2 tables, 2 algorithms)

Introduction
Related Work
Federated Class-Incremental Learning
Federated Learning with Knowledge Distillation
Problem Definition
Methodology
New-Class Augmented Self-Distillation in Federated Class-Incremental Learning
Theoretical Analyses of FedCLASS
Experiments
Implementation Details
Performance Comparison
Ablation Studies
Conclusion and Future Work

Figures (2)

Figure 1: An overview of federated class-incremental learning. The private data of clients arrives continuously according to a series of incremental tasks, each introducing additional data that incorporates new classes.
Figure 2: Illustration of new-class augmented self-distillation on client $k$.
