Eidetic Learning: an Efficient and Provable Solution to Catastrophic Forgetting
Nicholas Dronen, Randall Balestriero
TL;DR
Catastrophic forgetting remains a major hurdle for sequential task learning in neural networks. Eidetic Learning builds EideticNets that guarantee immunity to forgetting: after each task, the neurons important to that task are kept and frozen, and the unimportant ones are pruned and recycled for later tasks, enabling nested feature reuse without rehearsal. The approach supports common architectures, provides data-conditional routing through per-task heads, and uses a task classifier at inference time to select the appropriate head. It achieves competitive results on Permuted MNIST, sequential CIFAR-100, and Imagenette, with time and space complexity linear in the number of parameters. Overall, EideticNets offer a principled, scalable solution for continual learning, and the work outlines clear avenues for extending the method to backward transfer and to more challenging class-incremental settings.
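The prune, freeze, and recycle cycle summarized above can be illustrated with a minimal NumPy sketch on a single dense layer. This is not the authors' implementation: the importance score (row L2 norm), the `keep_fraction` parameter, the random stand-in gradients, and the helper names `select_important` and `masked_step` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def select_important(W, free, keep_fraction):
    """Return a boolean mask over neurons (rows) to freeze for the current task.

    Importance is the row L2 norm, an assumption made for this sketch.
    """
    scores = np.linalg.norm(W, axis=1)
    scores[~free] = -np.inf                 # already-frozen neurons are not candidates
    n_keep = max(1, int(keep_fraction * free.sum()))
    keep = np.zeros_like(free)
    keep[np.argsort(scores)[-n_keep:]] = True
    return keep

def masked_step(W, grad, free, lr=0.1):
    """Gradient step that only updates the rows of neurons that are still free."""
    return W - lr * grad * free[:, None]

n_neurons, n_inputs = 8, 4
W = rng.normal(size=(n_neurons, n_inputs))
free = np.ones(n_neurons, dtype=bool)
owner = np.full(n_neurons, -1)              # task that froze each neuron, -1 if free

for task in range(2):
    # Stand-in for training on this task: random "gradients", masked so that
    # frozen rows are never modified.
    for _ in range(5):
        W = masked_step(W, rng.normal(size=W.shape), free)
    keep = select_important(W, free, keep_fraction=0.5)
    owner[keep] = task
    free &= ~keep
    # Recycle: re-initialize the neurons that were pruned for this task.
    W[free] = rng.normal(size=(free.sum(), n_inputs))
    if task == 0:
        frozen_after_task0 = W[owner == 0].copy()

# Weights frozen for task 0 are bit-for-bit unchanged after training task 1.
assert np.array_equal(W[owner == 0], frozen_after_task0)
```

In a multi-layer network, freezing weights alone would not be enough: a frozen neuron must also not read from neurons that later tasks are still free to change. The single-layer sketch sidesteps that issue by construction.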
Abstract
Catastrophic forgetting -- the phenomenon of a neural network learning a task t1 and losing the ability to perform it after being trained on some other task t2 -- is a long-standing problem for neural networks [McCloskey and Cohen, 1989]. We present a method, Eidetic Learning, that provably solves catastrophic forgetting. A network trained with Eidetic Learning -- here, an EideticNet -- requires no rehearsal or replay. We consider successive discrete tasks and show how at inference time an EideticNet automatically routes new instances without auxiliary task information. An EideticNet bears a family resemblance to the sparsely-gated Mixture-of-Experts layer [Shazeer et al., 2016] in that network capacity is partitioned across tasks and the network itself performs data-conditional routing. An EideticNet is easy to implement and train, is efficient, and has time and space complexity linear in the number of parameters. The guarantee of our method holds for normalization layers of modern neural networks during both pre-training and fine-tuning. We show with a variety of network architectures and sets of tasks that EideticNets are immune to forgetting. While the practical benefits of EideticNets are substantial, we believe they can benefit practitioners and theorists alike. The code for training EideticNets is available at https://github.com/amazon-science/eideticnet-training.
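To make routing without auxiliary task information concrete, the following is a minimal sketch in which a task classifier picks a per-task head from the input features alone. It is an illustration under stated assumptions, not the released code: the linear `task_classifier`, the linear `heads`, the dimensions, and the function `route_and_predict` are all hypothetical, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_tasks, n_classes = 6, 3, 4

# Hypothetical parameters: one linear task classifier and one linear head per task.
task_classifier = rng.normal(size=(n_tasks, n_features))
heads = [rng.normal(size=(n_classes, n_features)) for _ in range(n_tasks)]

def route_and_predict(x):
    """Infer the task from the features, then classify with that task's head."""
    task = int(np.argmax(task_classifier @ x))   # no task label is supplied by the caller
    label = int(np.argmax(heads[task] @ x))
    return task, label

x = rng.normal(size=n_features)
task, label = route_and_predict(x)
```

The caller passes only `x`; the task index is inferred by the network and never supplied as side information.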
