Out-of-distribution forgetting: vulnerability of continual learning to intra-class distribution shift

Liangxuan Guo, Yang Chen, Shan Yu

TL;DR

This work identifies out-of-distribution forgetting (OODF), an under-attended risk in continual learning (CL), and verifies that CL methods that do not dedicate subnetworks to individual tasks are all vulnerable to it, highlighting the importance of developing approaches that can overcome OODF.

Abstract

Continual learning (CL) is an important technique that allows artificial neural networks to work in open environments. CL enables a system to learn new tasks without severe interference with its performance on old tasks, i.e., to overcome catastrophic forgetting. In joint learning, it is well known that the out-of-distribution (OOD) problem, caused by intentional attacks or environmental perturbations, severely impairs the ability of networks to generalize. In this work, we report a special form of catastrophic forgetting caused by the OOD problem in continual learning settings, which we name out-of-distribution forgetting (OODF). In continual image classification tasks, we found that, for a given category, introducing an intra-class distribution shift significantly impaired the recognition accuracy of CL methods for that category during subsequent learning. Interestingly, this phenomenon is specific to CL: the same level of distribution shift had only negligible effects in the joint learning scenario. We verified that CL methods that do not dedicate subnetworks to individual tasks are all vulnerable to OODF. Moreover, OODF does not depend on any specific way of shifting the distribution, suggesting that it is a risk for CL in a wide range of circumstances. Taken together, our work identifies an under-attended risk in CL and highlights the importance of developing approaches that can overcome OODF. Code available: https://github.com/Hiroid/OODF

Paper Structure (22 sections, 4 equations, 7 figures, 8 tables, 1 algorithm)

Figures (7)

  • Figure 1: Illustration of out-of-distribution forgetting. Two continual learning scenarios are shown: the top row is the standard continual learning paradigm, and the bottom row is a continual learning paradigm with an intra-class distribution shift on task 1. At time 1 in the OODF paradigm, generalization on task 1 is as good as in the standard CL setting, but the protection provided by CL methods mainly covers the out-of-distribution samples of task 1, leading to severe deficits in performing task 1 after task 2 is learned.
  • Figure 2: Distribution shift. The red rectangles mark the pixels that were modified in (a) and (b). Panel (c) is discussed in a later section.
  • Figure 3: Properties of out-of-distribution forgetting.
  • Figure 4: Comparison of non-target tasks’ accuracies between standard and shift experiments. The results were obtained by averaging the accuracies over all tasks except task S after the whole CL procedure was completed. In each panel, the left bar shows the control group and the right bar shows the shift group.
  • Figure 5: Comparison between joint learning and CL under the same distribution shift, with the corresponding network backbone, tested on SplitMNIST-10. In each panel, the leftmost (pink) bar of each subgroup indicates training without shifts, the adjacent green bar indicates learning with shifts, and the horizontal axis lists the different learning strategies.
  • ...and 2 more figures
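
The Figure 2 caption describes an intra-class distribution shift produced by modifying the pixels inside a small box. The following is a minimal sketch of one way such a shift could be applied to the samples of a single class, not the authors' implementation. The function name `apply_intra_class_shift`, the box position and size, and the fill value are all hypothetical choices made for illustration; the released code at the URL above is the authoritative reference.

```python
import numpy as np


def apply_intra_class_shift(images, labels, target_class, box=(0, 0, 3, 3), value=1.0):
    """Return a copy of `images` where every sample of `target_class`
    has the pixels inside `box` overwritten with `value`.

    images: float array of shape (N, H, W)
    labels: int array of shape (N,)
    box:    (top, left, height, width); a hypothetical patch location
    """
    top, left, height, width = box
    shifted = images.copy()              # leave the caller's data untouched
    mask = labels == target_class        # select only the shifted class
    shifted[mask, top:top + height, left:left + width] = value
    return shifted


# Toy data standing in for a real image dataset (assumption for illustration).
rng = np.random.default_rng(0)
images = rng.random((6, 28, 28))
labels = np.array([0, 1, 2, 0, 1, 2])

shifted = apply_intra_class_shift(images, labels, target_class=1)
```

Only samples of class 1 are altered here, and only inside the chosen box, so the other classes and the rest of each image remain identical to the originals.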