Few-Shot Class-Incremental Learning with Non-IID Decentralized Data

Cuiwei Liu; Siang Xu; Huaijun Qiu; Jing Zhang; Zhi Liu; Liang Zhao

TL;DR

This article introduces federated few-shot class-incremental learning (federated FSCIL), a decentralized machine learning paradigm tailored to progressively learn new classes from scarce data distributed across multiple clients, and presents a synthetic data-driven (SDD) framework that leverages replay buffer data to maintain existing knowledge and facilitate the acquisition of new knowledge.

Abstract

Few-shot class-incremental learning is crucial for developing scalable and adaptive intelligent systems, as it enables models to acquire new classes with minimal annotated data while safeguarding previously accumulated knowledge. Nonetheless, existing methods deal with continuous data streams in a centralized manner, limiting their applicability in scenarios that prioritize data privacy and security. To this end, this paper introduces federated few-shot class-incremental learning, a decentralized machine learning paradigm tailored to progressively learn new classes from scarce data distributed across multiple clients. In this learning paradigm, clients locally update their models with new classes while preserving data privacy, and then transmit the model updates to a central server where they are aggregated globally. However, this paradigm faces several issues, such as difficulties in few-shot learning, catastrophic forgetting, and data heterogeneity. To address these challenges, we present a synthetic data-driven framework that leverages replay buffer data to maintain existing knowledge and facilitate the acquisition of new knowledge. Within this framework, a noise-aware generative replay module is developed to fine-tune local models with a balance of new and replay data, while generating synthetic data of new classes to further expand the replay buffer for future tasks. Furthermore, a class-specific weighted aggregation strategy is designed to tackle data heterogeneity by adaptively aggregating class-specific parameters based on the local models' performance on synthetic data. This enables effective global model optimization without direct access to client data. Comprehensive experiments across three widely used datasets demonstrate the effectiveness and superiority of the introduced framework.

Paper Structure (28 sections, 18 equations, 9 figures, 5 tables, 2 algorithms)

Introduction
Related work
Few-shot class-incremental learning
Federated continual learning
Data-free replay method
Method
Problem formulation
The overview framework
Few-shot class-incremental learning
Data generation
Optimization of the generator
Optimization of the student model
Model aggregation
Baseline approach
Experiments
...and 13 more sections

Figures (9)

Figure 1: Illustration of the F2SCIL paradigm. In each incremental session, clients update their local models to learn new classes from scarce annotated data. The updated parameters are transmitted to a server where they are consolidated into a global model, which is then redistributed to clients for the subsequent task.
Figure 2: Illustration of the proposed SDD framework. In the base session, an initial model is trained with abundant data of base classes centrally. In an incremental session, clients employ the NAGR module to adapt to new classes. A conditional generator is trained based on client-side local models to produce synthetic data, which is then used to assess the performance of these models. This process creates a class-specific weight matrix for model aggregation. Continuously generated synthetic data are kept in a replay buffer to retain old knowledge for incremental learning in future sessions.
Figure 3: An illustration of training a conditional generator. We employ a teacher-student architecture to jointly optimize the conditional generator and student model through an adversarial learning mechanism, with guidance from the teacher model. The red and green lines denote the training loss for updating the generator and the student model, respectively.
Figure 4: An illustration of the synthetic data and decision boundaries of both teacher and student models. Left Panel: Without the loss $\mathcal{L}_{KL}$, the generated synthetic data (red circles) are far from the teacher's decision boundary. Right Panel: With $\mathcal{L}_{KL}$, the generator produces more challenging synthetic data (blue circles) closer to the decision boundary, aiding the student in better mimicking the teacher's decision boundary.
Figure 5: Performance of the proposed SDD with different numbers of synthetic replay samples on miniImageNet.
...and 4 more figures
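
The class-specific weight matrix mentioned in the Figure 2 caption can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes that each client's classifier holds one parameter vector per class, that the server scores every local model by its per-class accuracy on synthetic data, and that the weights for each class are these accuracies normalized across clients. The function names `class_weight_matrix` and `aggregate_class_params` and the uniform fallback are hypothetical choices made for illustration.

```python
from typing import List

Matrix = List[List[float]]


def class_weight_matrix(per_class_acc: Matrix, eps: float = 1e-8) -> Matrix:
    """Turn per-class accuracies into aggregation weights (assumed scheme).

    per_class_acc[k][c] is the accuracy of client k's local model on
    synthetic samples of class c. The returned weights[k][c] sum to 1
    over the clients k for every class c.
    """
    num_clients = len(per_class_acc)
    num_classes = len(per_class_acc[0])
    weights = [[0.0] * num_classes for _ in range(num_clients)]
    for c in range(num_classes):
        total = sum(per_class_acc[k][c] for k in range(num_clients))
        for k in range(num_clients):
            if total > eps:
                weights[k][c] = per_class_acc[k][c] / total
            else:
                # No client recognizes this class: fall back to uniform weights.
                weights[k][c] = 1.0 / num_clients
    return weights


def aggregate_class_params(client_heads: List[Matrix], weights: Matrix) -> Matrix:
    """Weighted average of class-specific parameter vectors.

    client_heads[k][c] is client k's parameter vector for class c.
    Returns one aggregated vector per class.
    """
    num_clients = len(client_heads)
    num_classes = len(client_heads[0])
    dim = len(client_heads[0][0])
    global_head = []
    for c in range(num_classes):
        vec = [0.0] * dim
        for k in range(num_clients):
            for d in range(dim):
                vec[d] += weights[k][c] * client_heads[k][c][d]
        global_head.append(vec)
    return global_head


# Toy example: two clients, two classes, two parameters per class.
acc = [[0.75, 0.5], [0.25, 0.5]]
heads = [
    [[1.0, 0.0], [2.0, 2.0]],  # client 0
    [[0.0, 1.0], [4.0, 0.0]],  # client 1
]
w = class_weight_matrix(acc)
global_head = aggregate_class_params(heads, w)
```

Under these assumptions, a client that classifies synthetic samples of a class more accurately contributes more to that class's aggregated parameters, which is one way to keep clients that barely saw a class under non-IID data from diluting it.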