Adversarial Pseudo-replay for Exemplar-free Class-incremental Learning

Hiroto Honda

TL;DR

This paper introduces adversarial pseudo-replay (APR), a method that perturbs new-task images with an adversarial attack to synthesize pseudo-replay images online, without storing any replay samples. It achieves state-of-the-art results on the challenging cold-start settings of the standard EFCIL benchmarks.

Abstract

Exemplar-free class-incremental learning (EFCIL) aims to retain knowledge acquired in previous tasks while learning new classes, without storing previous images because of storage constraints or privacy concerns. In EFCIL, the plasticity-stability dilemma (learning new tasks versus catastrophic forgetting) is a significant challenge, primarily because images from earlier tasks are unavailable. In this paper, we introduce adversarial pseudo-replay (APR), a method that perturbs the images of the new task with an adversarial attack to synthesize pseudo-replay images online, without storing any replay samples. During new-task training, the adversarial attack is conducted on the new-task images with augmented old-class mean prototypes as targets, and the resulting images are used for knowledge distillation to prevent semantic drift. Moreover, we calibrate the covariance matrices to compensate for the semantic drift after each task by learning a transfer matrix on the pseudo-replay samples. Our method reconciles stability and plasticity, achieving state-of-the-art results on the challenging cold-start settings of the standard EFCIL benchmarks.
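
The core idea in the abstract, attacking a new-task image so that its feature under the old extractor moves toward an old-class prototype, can be illustrated with a toy sketch. This is not the paper's implementation. It assumes a fixed linear map `W` in place of the frozen extractor $f^{t-1}$, a plain (non-augmented) prototype as the target, a squared Euclidean objective, and signed-gradient steps clipped to an L-infinity ball. The function names and the parameters `eps`, `step`, and `n_steps` are hypothetical.

```python
# Toy sketch only: a fixed linear map stands in for the frozen old extractor.
# All names and hyperparameters here are illustrative assumptions.

def extract(W, x):
    """Return the feature vector W @ x for a linear 'extractor' W (list of rows)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def targeted_attack(W, x, prototype, eps=0.5, step=0.05, n_steps=40):
    """Perturb x so that its feature moves toward `prototype`.

    Runs iterative signed-gradient descent on ||W x' - prototype||^2 and
    keeps x' inside an L-infinity ball of radius eps around the original x.
    """
    x_adv = list(x)
    for _ in range(n_steps):
        residual = [f - p for f, p in zip(extract(W, x_adv), prototype)]
        # Gradient of the squared distance w.r.t. the input: 2 * W^T residual.
        grad = [2.0 * sum(W[i][j] * residual[i] for i in range(len(W)))
                for j in range(len(x_adv))]
        for j, g in enumerate(grad):
            sign = (g > 0) - (g < 0)
            moved = x_adv[j] - step * sign
            # Project back into the eps-ball around the original input.
            x_adv[j] = min(max(moved, x[j] - eps), x[j] + eps)
    return x_adv


W = [[1.0, 0.5, 0.0],
     [0.0, 1.0, -0.5]]
new_task_image = [0.2, 0.8, 0.4]   # stand-in for a flattened new-task image
old_prototype = [1.0, 0.0]         # stand-in for an old-class mean prototype

pseudo_replay = targeted_attack(W, new_task_image, old_prototype)
before = sq_dist(extract(W, new_task_image), old_prototype)
after = sq_dist(extract(W, pseudo_replay), old_prototype)
print(f"squared distance to prototype: before={before:.3f}, after={after:.3f}")
```

In this toy setting the perturbed input stays within `eps` of the original while its feature ends up closer to the target prototype. In the method described above, such perturbed images would then serve as inputs for knowledge distillation against the preserved network.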

Paper Structure

This paper contains 26 sections, 12 equations, 6 figures, 12 tables, 1 algorithm.

Figures (6)

  • Figure 1: Adversarial Pseudo Replay. The images from the new task are transformed into old-task data via adversarial attack in an online manner. Local (logits-based) knowledge distillation using the pseudo-replay images and the preserved network protects the target extractor from semantic drift.
  • Figure 2: Accuracy transition across all tasks on the cold-start settings. All the results are averaged over three random seeds using our implementation. Best viewed in color.
  • Figure 3: Examples of pseudo-replay samples (top: before attack, bottom: after attack) from CIFAR100 (Krizhevsky, 2009).
  • Figure 4: t-SNE analysis of online adversarial attacks during $t=1$ training on CIFAR100. The features extracted by $f^{t-1}$ and prototypes of two classes are shown. Best viewed in color.
  • Figure 5: Euclidean distance distributions between target prototype and image features extracted by $f^{t-1}$ before and after attack. Best viewed in color.
  • ...and 1 more figure