Complementary Learning for Overcoming Catastrophic Forgetting Using Experience Replay

Mohammad Rostami, Soheil Kolouri, Praveen K. Pilly

TL;DR

This work tackles catastrophic forgetting in sequential multitask learning by embedding all tasks into a shared discriminative space and using a generative autoencoder to produce pseudo-data for experience replay. By modeling a common embedding distribution with a GMM and aligning current task representations to this distribution via a sliced Wasserstein discrepancy, the method mitigates forgetting without storing past data. Theoretical bounds grounded in optimal transport support the approach, and empirical results on permuted MNIST and related-domain digit tasks demonstrate reduced forgetting and effective knowledge integration across tasks. The CLEER framework offers a memory-efficient alternative to full replay and complements weight-consolidation methods for robust lifelong learning.
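To make the alignment step more concrete, here is a minimal NumPy sketch of a sliced Wasserstein discrepancy between two sets of embedded samples. It is an illustration under stated assumptions, not the authors' implementation: the function name `sliced_wasserstein`, the number of random projections, the exponent `p`, and the requirement that both sample sets have the same size are all choices made for this example.

```python
import numpy as np

def sliced_wasserstein(x, y, n_projections=50, p=2, seed=0):
    """Monte Carlo estimate of the sliced p-Wasserstein distance.

    x, y: arrays of shape (n, d) holding n samples each (equal n assumed
    here so that sorted samples can be matched one to one).
    """
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equal sample counts and dimensions")
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    # Draw random directions and normalize them onto the unit sphere.
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project both sample sets onto every direction: shape (n, n_projections).
    x_proj = x @ theta.T
    y_proj = y @ theta.T
    # In one dimension, optimal transport between equal-size empirical
    # distributions pairs the sorted samples.
    x_sorted = np.sort(x_proj, axis=0)
    y_sorted = np.sort(y_proj, axis=0)
    # Average the p-th power cost over samples and projections.
    return float(np.mean(np.abs(x_sorted - y_sorted) ** p) ** (1.0 / p))
```

In a training loop, a differentiable version of this quantity (written in an autodiff framework) would typically be added to the task loss so that embeddings of the current task are pulled toward samples from the shared embedding distribution.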

Abstract

Despite their huge success, deep networks are unable to learn effectively in sequential multitask learning settings because they forget previously learned tasks after learning new ones. Inspired by complementary learning systems theory, we address this challenge by learning a generative model that couples the current task to previously learned tasks through a discriminative embedding space. We learn an abstract-level generative distribution in the embedding that allows the generation of data points to represent past experience. We sample from this distribution and use experience replay to avoid forgetting, while simultaneously accumulating new knowledge into the abstract distribution in order to couple the current task with past experience. We demonstrate theoretically and empirically that our framework learns a distribution in the embedding that is shared across all tasks and, as a result, tackles catastrophic forgetting.
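The idea of sampling from a learned embedding distribution for replay can be sketched as follows. This is a simplified illustration, not the paper's procedure: it assumes one Gaussian component per class estimated directly from labeled embeddings, and the helper names `fit_class_gaussians` and `sample_pseudo_embeddings` are hypothetical. The decoder that would map sampled embeddings back to the input space is omitted.

```python
import numpy as np

def fit_class_gaussians(z, labels):
    """Estimate (weight, mean, covariance) per class from embeddings z of shape (n, d)."""
    components = {}
    for c in np.unique(labels):
        zc = z[labels == c]
        mean = zc.mean(axis=0)
        # A small ridge keeps the covariance positive definite.
        cov = np.cov(zc, rowvar=False) + 1e-6 * np.eye(z.shape[1])
        components[int(c)] = (len(zc) / len(z), mean, cov)
    return components

def sample_pseudo_embeddings(components, n_samples, seed=0):
    """Draw labeled samples from the fitted mixture for use in replay."""
    rng = np.random.default_rng(seed)
    classes = list(components)
    weights = np.array([components[c][0] for c in classes])
    # Pick a component per sample according to the mixture weights.
    drawn = rng.choice(classes, size=n_samples, p=weights)
    samples = np.stack([
        rng.multivariate_normal(components[int(c)][1], components[int(c)][2])
        for c in drawn
    ])
    return samples, drawn
```

Under these assumptions, the sampled embeddings and their labels would be passed through a decoder to form pseudo-data, which is then mixed with the current task's data during training so that earlier tasks are rehearsed without storing their original samples.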

Paper Structure

This paper contains 10 sections, 7 equations, 5 figures, 1 algorithm.

Figures (5)

  • Figure 1: Architecture of the proposed framework.
  • Figure 2: Performance results for permuted MNIST tasks. (Best viewed in color.)
  • Figure 3: UMAP visualization of CLEER versus full replay (FR) for MNIST permutation tasks. (Best viewed in color.)
  • Figure 4: Performance results on MNIST and USPS digit recognition tasks. (Best viewed in color.)
  • Figure 5: UMAP visualization for $\mathcal{M}\rightarrow \mathcal{U}$ and $\mathcal{U}\rightarrow \mathcal{M}$ tasks. (Best viewed in color.)