Continual Learning via Neural Pruning

Siavash Golkar; Michael Kagan; Kyunghyun Cho

TL;DR

This work tackles continual learning under a fixed capacity constraint by introducing CLNP, which uses activation-based neural pruning to free up capacity within a network. Subsequent tasks are trained on the inactive neurons and filters without degrading performance on earlier tasks, and the authors formalize graceful forgetting to manage the trade-off between sparsity and accuracy. The method provides two simple diagnostics, the number of free neurons remaining and the number of neurons being reused, and shows significant gains over weight-elasticity-based baselines on permuted MNIST and CIFAR-10/100, with evidence that early-layer features transfer more readily. The approach offers a scalable, SGD-friendly route to fixed-capacity lifelong learning, with insight into when and where to widen a network to sustain many tasks.

Abstract

We introduce Continual Learning via Neural Pruning (CLNP), a new method aimed at lifelong learning in fixed-capacity models, based on neuronal model sparsification. In this method, subsequent tasks are trained using the inactive neurons and filters of the sparsified network and cause zero deterioration to the performance of previous tasks. To deal with the possible compromise between model sparsity and performance, we formalize and incorporate the concept of graceful forgetting: the idea that it is preferable to suffer a small amount of forgetting in a controlled manner if it helps regain network capacity and prevents uncontrolled loss of performance during the training of future tasks. CLNP also provides simple continual learning diagnostic tools in terms of the number of free neurons left for the training of future tasks as well as the number of neurons that are being reused. In particular, we see in experiments that CLNP verifies and automatically takes advantage of the fact that the features of earlier layers are more transferable. We show empirically that CLNP leads to significantly improved results over current weight-elasticity-based methods.
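
The mechanism described above can be illustrated with a minimal sketch. This is not the authors' implementation: the one-hidden-layer ReLU network, the weights, the data, the threshold of 0.01, and the helper names (`split_neurons`, `hidden`, `output`) are all hypothetical, chosen only to show the idea on a toy scale. Neurons with low average activity on the old task are marked free, the old task's head stops reading from them (a small, controlled drift, in the spirit of graceful forgetting), and later changes confined to the free neurons leave the old task's outputs unchanged.

```python
def relu(z):
    return z if z > 0.0 else 0.0

def hidden(W, x):
    # One ReLU layer: row j of W holds the incoming weights of hidden neuron j.
    return [relu(sum(w * xi for w, xi in zip(row, x))) for row in W]

def output(W, head, x):
    # Scalar task output: a linear head reading from the hidden layer.
    return sum(v * h for v, h in zip(head, hidden(W, x)))

def split_neurons(mean_activity, threshold):
    # Neurons whose average activity exceeds the threshold stay with the old task;
    # the rest are treated as free capacity for future tasks.
    active = [j for j, a in enumerate(mean_activity) if a > threshold]
    free = [j for j, a in enumerate(mean_activity) if a <= threshold]
    return active, free

# Hypothetical toy network, standing in for a model trained sparsely on task 1.
W = [[1.0, -0.5, 0.2], [0.3, 0.8, -0.1], [0.0, 0.0, 0.0], [0.001, 0.0, 0.0]]
head = [0.5, -1.0, 0.7, 0.3]
data = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 1.0], [0.5, -0.5, 1.0]]

# Average activity of each hidden neuron over the task-1 data.
acts = [hidden(W, x) for x in data]
mean_activity = [sum(a[j] for a in acts) / len(acts) for j in range(len(W))]
active, free = split_neurons(mean_activity, threshold=0.01)  # assumed threshold

# Pruning: the task-1 head no longer reads from free neurons.
pruned_head = [v if j in active else 0.0 for j, v in enumerate(head)]

# Controlled change in task-1 outputs caused by the pruning step itself.
drift = max(abs(output(W, head, x) - output(W, pruned_head, x)) for x in data)

# Stand-in for training task 2: only incoming weights of free neurons change.
before = [output(W, pruned_head, x) for x in data]
for j in free:
    W[j] = [0.9, -0.4, 0.6]
after = [output(W, pruned_head, x) for x in data]

n_free = len(free)  # diagnostic: capacity left for future tasks
```

In this toy setting `active` is `[0, 1]` and `free` is `[2, 3]`, and `before == after` holds exactly because the pruned head ignores every neuron that was modified. A deeper network would additionally need the weights from free neurons into active neurons of the next layer held at zero; that detail is omitted here for brevity.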
