Progress & Compress: A scalable framework for continual learning

Jonathan Schwarz, Jelena Luketina, Wojciech M. Czarnecki, Agnieszka Grabska-Barwinska, Yee Whye Teh, Razvan Pascanu, Raia Hadsell

TL;DR

Progress & Compress is a scalable continual learning framework built from two fixed-size neural columns, a knowledge base and an active column, that learn tasks sequentially by alternating progress and compress phases. Lateral adapters encourage positive transfer, while an online consolidation mechanism inspired by EWC preserves prior skills in the knowledge base. The approach achieves competitive or superior performance across diverse domains (Omniglot, Atari, and 3D navigation) without storing data from past tasks or growing the network architecture. By combining distillation with a memory-efficient regularizer, it mitigates catastrophic forgetting while keeping continual learning scalable in complex environments.

Abstract

We introduce a conceptually simple and scalable framework for continual learning domains where tasks are learned sequentially. Our method is constant in the number of parameters and is designed to preserve performance on previously encountered tasks while accelerating learning progress on subsequent problems. This is achieved by training a network with two components: a knowledge base, capable of solving previously encountered problems, which is connected to an active column that is employed to efficiently learn the current task. After learning a new task, the active column is distilled into the knowledge base, taking care to protect any previously acquired skills. This cycle of active learning (progression) followed by consolidation (compression) requires no architecture growth, no access to or storing of previous data or tasks, and no task-specific parameters. We demonstrate the progress & compress approach on sequential classification of handwritten alphabets as well as two reinforcement learning domains: Atari games and 3D maze navigation.
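
The compress step described above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation. It assumes a diagonal Fisher estimate, a single running Fisher that is decayed by a factor `gamma` when a new task is folded in, and a KL-divergence distillation term from the active column to the knowledge base. All function names, the `gamma` and `strength` parameters, and the flat-list parameter representation are hypothetical choices made for readability.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def update_running_fisher(running_fisher, new_fisher, gamma):
    """Fold the newest diagonal Fisher estimate into one running estimate.

    Assumption: older tasks are down-weighted by gamma, so only a single
    Fisher vector is stored no matter how many tasks have been seen.
    """
    return [gamma * old + new for old, new in zip(running_fisher, new_fisher)]


def ewc_penalty(params, anchor_params, fisher):
    """Quadratic penalty that discourages moving parameters the Fisher marks as important."""
    return 0.5 * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher)
    )


def compress_loss(active_logits, kb_logits, kb_params, kb_anchor, fisher, strength=1.0):
    """Distillation term plus consolidation penalty for one input.

    active_logits: outputs of the active column (the teacher, held fixed)
    kb_logits:     outputs of the knowledge base (the student being trained)
    kb_anchor:     knowledge-base parameters saved before this compress phase
    """
    distill = kl_divergence(softmax(active_logits), softmax(kb_logits))
    return distill + strength * ewc_penalty(kb_params, kb_anchor, fisher)


if __name__ == "__main__":
    fisher = update_running_fisher([1.0, 2.0], [0.5, 0.5], gamma=0.5)
    loss = compress_loss(
        active_logits=[2.0, 0.5, -1.0],
        kb_logits=[1.0, 1.0, 0.0],
        kb_params=[0.3, -0.2],
        kb_anchor=[0.0, 0.0],
        fisher=fisher,
    )
    print(fisher, loss)
```

Because the penalty depends only on one anchor and one running Fisher vector, the memory cost of this sketch does not grow with the number of tasks, which matches the abstract's claim of a constant parameter count and no stored data from earlier tasks.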

Paper Structure

This paper contains 16 sections, 8 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Illustration of the Progress & Compress learning process. In the compress phases (C), the policy learnt most recently by the active column (green) is distilled to the knowledge base (blue) while protecting previous contents with EWC (Elastic Weight Consolidation). In the progress phases (P), new tasks are learnt by the active column while reusing features from the knowledge base via lateral, layerwise connections.
  • Figure 2: Results on Omniglot. Performance normalised by training a single model on each task. Best viewed in colour.
  • Figure 3: Positive transfer on random mazes. Shown is the learning progress on the final task after sequential training. Results averaged over 4 different final mazes. All rewards are normalised by the performance a dedicated model achieves on each task when training from scratch. Best viewed in colour.
  • Figure 4: Learning curves on Atari games. Each game is visited 5 times, with training on 50M environment frames per visit. Games are learned top to bottom, left to right. KB denotes the knowledge base. Dashed vertical bars indicate re-visits to the task. Results averaged over random seeds. Best viewed in colour.
  • Figure 5: Performance retention on permuted MNIST. Shown is the test accuracy on an initial permutation (Task A) over the course of training on the remaining set of tasks (Tasks B-E).
  • ...and 2 more figures