Learning to Learn without Forgetting by Maximizing Transfer and Minimizing Interference

Matthew Riemer, Ignacio Cases, Robert Ajemian, Miao Liu, Irina Rish, Yuhai Tu, Gerald Tesauro

TL;DR

The paper reframes continual learning as a temporally symmetric transfer-interference trade-off and introduces Meta-Experience Replay (MER), a method that blends experience replay with optimization-based meta-learning to promote gradient alignment across past and future data. By dynamically shaping weight sharing via meta-updates, MER increases transfer while suppressing interference, achieving stronger retained performance than baselines on both supervised lifelong learning benchmarks and non-stationary continual RL tasks. MER demonstrates robustness to small replay buffers and shows that gradient dynamics shift toward transfer rather than interference as learning progresses. This approach offers a general, scalable strategy for preventing forgetting in non-stationary environments without explicit task delineations.

Abstract

Poor performance in continual learning over non-stationary distributions of data remains a major challenge in scaling neural network learning to more human-realistic settings. In this work we propose a new conceptualization of the continual learning problem in terms of a temporally symmetric trade-off between transfer and interference that can be optimized by enforcing gradient alignment across examples. We then propose a new algorithm, Meta-Experience Replay (MER), that directly exploits this view by combining experience replay with optimization-based meta-learning. This method learns parameters that make interference based on future gradients less likely and transfer based on future gradients more likely. We conduct experiments across continual lifelong supervised learning benchmarks and non-stationary reinforcement learning environments, demonstrating that our approach consistently outperforms recently proposed baselines for continual learning. Our experiments show that the gap between the performance of MER and baseline algorithms grows both as the environment gets more non-stationary and as the fraction of the total experiences stored gets smaller.
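To make the idea of gradient alignment across examples concrete, here is a minimal sketch. It reads alignment as the sign of the dot product between two per-example loss gradients: a positive value means a step on one example also lowers the loss on the other (transfer), and a negative value means it raises it (interference). The linear model, the squared-error loss, and the names `squared_error_grad`, `alignment`, and `classify` are assumptions chosen for illustration. This is not the MER algorithm itself.

```python
import numpy as np

def squared_error_grad(theta, x, y):
    """Gradient of 0.5 * (theta @ x - y)**2 with respect to theta."""
    return (theta @ x - y) * x

def alignment(theta, example_i, example_j):
    """Dot product of the two per-example gradients evaluated at theta."""
    g_i = squared_error_grad(theta, *example_i)
    g_j = squared_error_grad(theta, *example_j)
    return float(g_i @ g_j)

def classify(dot, tol=1e-12):
    """Label a gradient dot product as transfer, interference, or neutral."""
    if dot > tol:
        return "transfer"
    if dot < -tol:
        return "interference"
    return "neutral"

# Illustrative toy data (hypothetical, not from the paper).
theta = np.zeros(2)
a = (np.array([1.0, 0.0]), 1.0)
b = (np.array([1.0, 1.0]), 2.0)   # pulls theta in a direction compatible with a
c = (np.array([1.0, 0.0]), -1.0)  # same input as a, opposite target

print(classify(alignment(theta, a, b)))  # gradients point the same way
print(classify(alignment(theta, a, c)))  # gradients point in opposite directions
```

In this toy setting, examples `a` and `b` share a descent direction, while `a` and `c` demand opposite updates to the same weight. That is the kind of conflict a method that encourages gradient alignment aims to make less likely.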

Paper Structure

This paper contains 28 sections, 39 equations, 6 figures, 9 tables, 8 algorithms.

Figures (6)

  • Figure 1: A) The stability-plasticity dilemma considers plasticity with respect to the current learning and how it degrades old learning. The transfer-interference trade-off considers the stability-plasticity dilemma and its dependence on weight sharing in both forward and backward directions. This symmetric view is crucial as solutions that purely focus on reducing the degree of weight-sharing are unlikely to produce transfer in the future. B) A depiction of transfer in weight space. C) A depiction of interference in weight space.
  • Figure 2: Left: a sequence of frames from Catcher and Flappy Bird respectively. The goal in Catcher is to capture the falling pellet by moving the racket on the bottom of the screen. In Flappy Bird, the goal is to navigate the bird through as many pipes as possible by making it go up or letting it fall. Right: average score in Catcher (above) and Flappy Bird (below) for evaluation on the first task which has slower falling pellets and a larger pipe gap.
  • Figure 3: Further details on Omniglot performance characteristics for each model.
  • Figure 4: Continual learning performance for a non-stationary version of Catcher. Graphs show averaged values over ten validation episodes across five different seeds. Vertical grid lines on the x-axis indicate a task switch.
  • Figure 5: Continual learning for a non-stationary version of Flappy Bird.
  • ...and 1 more figures