Toward Understanding Catastrophic Forgetting in Continual Learning

Cuong V. Nguyen; Alessandro Achille; Michael Lam; Tal Hassner; Vijay Mahadevan; Stefano Soatto

TL;DR

The paper investigates why continual learning models forget previously learned tasks by examining properties of task sequences. It introduces a general procedure that uses Task2Vec task-space embeddings to define two sequence properties, total complexity and sequential heterogeneity, and then correlates each with sequence hardness, measured as final error. Empirically, total complexity shows a strong positive correlation with forgetting on some benchmarks (notably CIFAR-10), while sequential heterogeneity is weakly or even negatively correlated in several settings, suggesting that task dissimilarity can sometimes aid continual learning. The results highlight the need to consider task complexity when designing benchmarks and algorithms, and they motivate customizing transfer between specific task pairs. The methodology provides a framework for studying how task structure affects forgetting and could guide future improvements in benchmarks and continual learning methods.

Abstract

We study the relationship between catastrophic forgetting and properties of task sequences. In particular, given a sequence of tasks, we would like to understand which properties of this sequence influence the error rates of continual learning algorithms trained on the sequence. To this end, we propose a new procedure that makes use of recent developments in task space modeling as well as correlation analysis to specify and analyze the properties we are interested in. As an application, we use our procedure to study two properties of a task sequence: (1) total complexity and (2) sequential heterogeneity. We show that error rates are strongly and positively correlated with a task sequence's total complexity for some state-of-the-art algorithms. We also show that, surprisingly, in some cases the error rates have no correlation, or even a negative correlation, with sequential heterogeneity. Our findings suggest directions for improving continual learning benchmarks and methods.
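The procedure described above can be illustrated with a minimal sketch: embed each task, reduce a task sequence to a scalar property, and correlate that property with the final error across many sequences. Everything below is an assumption made for illustration, not the authors' implementation. In particular, the plain cosine distance, the two property definitions (summed distance to a reference "trivial task" embedding, and summed distance between consecutive tasks), and all function names are hypothetical stand-ins. The embeddings are expected to come from a task-embedding method such as Task2Vec, which is not reproduced here.

```python
import numpy as np


def cosine_distance(a, b):
    """Plain cosine distance between two non-zero vectors.

    Assumption: a stand-in for whatever distance the task embedding defines.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return 1.0 - float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))


def total_complexity(embeddings, reference):
    """Assumed definition: sum of each task's distance to a reference embedding."""
    return sum(cosine_distance(e, reference) for e in embeddings)


def sequential_heterogeneity(embeddings):
    """Assumed definition: sum of distances between consecutive tasks."""
    pairs = zip(embeddings[:-1], embeddings[1:])
    return sum(cosine_distance(a, b) for a, b in pairs)


def pearson(x, y):
    """Pearson correlation between a sequence property and the final error."""
    return float(np.corrcoef(x, y)[0, 1])
```

To use the sketch, compute one property value per task sequence, collect the final error of a continual learning algorithm on each sequence, and pass the two lists to `pearson`. A rank correlation could be substituted if a monotonic rather than linear relationship is of interest.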
