Enhancing Federated Class-Incremental Learning via Spatial-Temporal Statistics Aggregation

Zenghao Guan, Guojun Zhu, Yucan Zhou, Wu Liu, Weiping Wang, Jiebo Luo, Xiaoyan Gu

TL;DR

This work tackles Federated Class-Incremental Learning by addressing spatial-temporal drift and expensive communication costs. The proposed STSA framework aggregates feature statistics across clients and across incremental stages, enabling a closed-form ridge update $W_t=(\boldsymbol{G}_{1:t}+\gamma\boldsymbol{I})^{-1}\boldsymbol{C}_{1:t}$ with a fixed backbone after the first stage. STSA-E further reduces communication by estimating the global Gram matrix from first-order statistics, while preserving statistical performance guarantees. Extensive experiments show STSA and STSA-E outperform state-of-the-art FCIL methods, offering strong accuracy gains and substantial efficiency improvements, and they demonstrate compatibility with privacy-preserving techniques like secure aggregation. The approach holds practical potential for privacy-conscious, resource-constrained distributed continual learning settings.
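
One way to read the symbols in the ridge update is sketched here. This summary does not define them, so the definitions are assumptions based on standard ridge regression: $\boldsymbol{X}_{i,j}$ and $\boldsymbol{Y}_{i,j}$ (hypothetical symbols) are the feature matrix and one-hot label matrix that client $j$ computes at stage $i$, and $K$ (also hypothetical) is the number of clients.

```latex
\begin{aligned}
\boldsymbol{G}_{t} &= \sum_{j=1}^{K} \boldsymbol{X}_{t,j}^{\top}\boldsymbol{X}_{t,j},
&\quad
\boldsymbol{C}_{t} &= \sum_{j=1}^{K} \boldsymbol{X}_{t,j}^{\top}\boldsymbol{Y}_{t,j}
&&\text{(spatial aggregation at stage } t\text{)} \\
\boldsymbol{G}_{1:t} &= \boldsymbol{G}_{1:t-1} + \boldsymbol{G}_{t},
&\quad
\boldsymbol{C}_{1:t} &= \boldsymbol{C}_{1:t-1} + \boldsymbol{C}_{t}
&&\text{(temporal aggregation across stages)} \\
W_t &= (\boldsymbol{G}_{1:t} + \gamma\boldsymbol{I})^{-1}\boldsymbol{C}_{1:t}
\end{aligned}
```

Under these assumptions, $\boldsymbol{G}_{1:t}$ and $\boldsymbol{C}_{1:t}$ are plain sums over all samples seen so far, so they do not depend on how the samples are split among clients or stages.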

Abstract

Federated Class-Incremental Learning (FCIL) enables Class-Incremental Learning (CIL) from distributed data. Existing FCIL methods typically integrate old knowledge preservation into local client training. However, these methods cannot avoid spatial-temporal client drift caused by data heterogeneity and often incur significant computational and communication overhead, limiting practical deployment. To address these challenges simultaneously, we propose a novel approach, Spatial-Temporal Statistics Aggregation (STSA), which provides a unified framework to aggregate feature statistics both spatially (across clients) and temporally (across stages). The aggregated feature statistics are unaffected by data heterogeneity and can be used to update the classifier in closed form at each stage. Additionally, we introduce STSA-E, a communication-efficient variant with theoretical guarantees, achieving similar performance to STSA with much lower communication overhead. Extensive experiments on three widely used FCIL datasets, with varying degrees of data heterogeneity, show that our method outperforms state-of-the-art FCIL methods in terms of performance, flexibility, and both communication and computation efficiency. The code is available at https://github.com/Yuqin-G/STSA.
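
The claim that aggregated statistics are unaffected by data heterogeneity can be checked numerically on toy data. The sketch below is not the authors' implementation. It assumes each client sends a Gram matrix `X.T @ X` and a feature-label cross term `X.T @ Y` computed on already-extracted features; the function names, random features, and sizes are all hypothetical. It compares the classifier built from summed statistics with one fitted on the pooled data.

```python
import numpy as np


def local_statistics(features, labels_onehot):
    """Statistics one client would send for one stage (assumed form)."""
    gram = features.T @ features        # (dim, dim) second-order statistic
    cross = features.T @ labels_onehot  # (dim, num_classes) feature-label term
    return gram, cross


def ridge_classifier(gram_total, cross_total, gamma):
    """Closed-form ridge solution W = (G + gamma * I)^{-1} C."""
    dim = gram_total.shape[0]
    return np.linalg.solve(gram_total + gamma * np.eye(dim), cross_total)


rng = np.random.default_rng(0)
num_stages, num_clients, dim, classes_per_stage = 2, 3, 8, 2
num_classes = num_stages * classes_per_stage
gamma = 1.0

gram_total = np.zeros((dim, dim))
cross_total = np.zeros((dim, num_classes))
pooled_x, pooled_y = [], []

for stage in range(num_stages):
    for client in range(num_clients):
        n = int(rng.integers(5, 20))  # clients hold different amounts of data
        x = rng.normal(size=(n, dim))  # stand-in for frozen-backbone features
        # Each stage only introduces its own classes (class-incremental).
        low = stage * classes_per_stage
        y = rng.integers(low, low + classes_per_stage, size=n)
        y_onehot = np.eye(num_classes)[y]

        gram, cross = local_statistics(x, y_onehot)
        gram_total += gram    # summed over clients and over stages
        cross_total += cross
        pooled_x.append(x)
        pooled_y.append(y_onehot)

w_fed = ridge_classifier(gram_total, cross_total, gamma)

# Reference: the same ridge solution computed on all data pooled centrally.
x_all = np.vstack(pooled_x)
y_all = np.vstack(pooled_y)
w_central = ridge_classifier(x_all.T @ x_all, x_all.T @ y_all, gamma)

max_gap = float(np.max(np.abs(w_fed - w_central)))
print("max |w_fed - w_central| =", max_gap)
```

Because the statistics are sums over samples, the two solutions agree up to floating-point error regardless of how samples are partitioned. The sketch leaves out backbone training in the first stage, the random mapping layer, and the STSA-E Gram estimate.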

Paper Structure

This paper contains 23 sections, 60 equations, 10 figures, 6 tables, 1 algorithm.

Figures (10)

  • Figure 1: Paradigms of existing FCIL methods and ours. $w$ and $\mathcal{S}$ denote the model weights and feature statistics, respectively.
  • Figure 2: Illustration of spatial-temporal client drift through loss landscapes. $W_t$ denotes the aggregated global model.
  • Figure 3: Framework of our algorithm. $\mathcal{D}_{i,j}$ denotes the local dataset of client $j$ for task $\mathcal{T}_i$. $F$ is the feature extractor trained in the first stage and is kept fixed afterward. $R$ is the random mapping layer. In our STSA, Aggregated Spatial Statistics $\boldsymbol{\mathcal{S}}_{t}$ at stage $t$ includes both $\boldsymbol{G}_{t}$ and $\boldsymbol{C}_{t}$, whereas for STSA-E, it refers to $\boldsymbol{C}_{t}$.
  • Figure 4: Performance curves on different datasets ($\beta=0.5$) with different backbones.
  • Figure 5: Final average accuracy $A_{T}$ with different $M$. The top-performing baselines (LANDER and PILoRA) are also marked in the figure.
  • ...and 5 more figures

Theorems & Definitions (6)

  • proof
  • proof
  • proof
  • proof
  • proof
  • proof