Continual Lifelong Learning with Neural Networks: A Review

German I. Parisi, Ronald Kemker, Jose L. Part, Christopher Kanan, Stefan Wermter

TL;DR

Defines lifelong learning as the continual acquisition of knowledge from non-stationary data streams and identifies catastrophic forgetting as the central challenge. Surveys neural network strategies that mitigate forgetting, grouped into regularization, dynamic architectures, and memory-replay/CLS-inspired approaches, with links to neuroscience concepts such as the complementary learning systems (CLS) theory. Discusses developmentally inspired methods, transfer learning, curiosity, and multisensory integration as components for autonomous agents, and emphasizes the need for robust benchmarks and evaluation protocols. Concludes that a hybrid, biologically informed approach combining multiple mechanisms offers a promising path toward scalable, autonomous lifelong learning in real-world environments.

Abstract

Humans and animals have the ability to continually acquire, fine-tune, and transfer knowledge and skills throughout their lifespan. This ability, referred to as lifelong learning, is mediated by a rich set of neurocognitive mechanisms that together contribute to the development and specialization of our sensorimotor skills as well as to long-term memory consolidation and retrieval. Consequently, lifelong learning capabilities are crucial for autonomous agents interacting in the real world and processing continuous streams of information. However, lifelong learning remains a long-standing challenge for machine learning and neural network models since the continual acquisition of incrementally available information from non-stationary data distributions generally leads to catastrophic forgetting or interference. This limitation represents a major drawback for state-of-the-art deep neural network models that typically learn representations from stationary batches of training data, thus without accounting for situations in which information becomes incrementally available over time. In this review, we critically summarize the main challenges linked to lifelong learning for artificial learning systems and compare existing neural network approaches that alleviate, to different extents, catastrophic forgetting. We discuss well-established and emerging research motivated by lifelong learning factors in biological systems such as structural plasticity, memory replay, curriculum and transfer learning, intrinsic motivation, and multisensory integration.

Paper Structure

This paper contains 19 sections, 8 equations, 5 figures.

Figures (5)

  • Figure 1: Schematic view of two aspects of neurosynaptic adaptation: a) Hebbian learning with homeostatic plasticity as a compensatory mechanism that uses observations to compute a feedback control signal (adapted with permission from Zenke2017a). b) The complementary learning systems (CLS) theory [McClelland1995], comprising the hippocampus for the fast learning of episodic information and the neocortex for the slow learning of structured knowledge.
  • Figure 2: Schematic view of neural network approaches for lifelong learning: a) retraining while regularizing to prevent catastrophic forgetting with previously learned tasks, b) unchanged parameters with network extension for representing new tasks, and c) selective retraining with possible expansion.
  • Figure 3: Example images from benchmark datasets used for the evaluation of lifelong learning approaches: a) the MNIST dataset with 10 digit classes [LeCun1998], b) the Caltech-UCSD Birds-200 (CUB-200) dataset composed of 200 different bird species [Wah2011], and c) the CORe50 dataset containing 50 objects with variations in background, illumination, blurring, occlusion, pose, and scale (adapted with permission from Lomonaco2017).
  • Figure 4: Results of several lifelong learning approaches for incremental class learning. The mean-class test accuracy evaluated on the MNIST (a), CUB-200 (b), and AudioSet (c) is shown for the following approaches: FEL (red), MLP (yellow), GeppNet (green), GeppNet+STM (blue), EWC (pink), and offline model (dashed line). Adapted with permission from Kemker2018a.
  • Figure 5: Schematic view of the main components for the development of autonomous agents able to learn over long periods of time in complex environments: Developmental and curriculum learning (Sec. 4.2), transfer learning (Sec. 4.3), curiosity and intrinsic motivation (Sec. 4.4), and crossmodal learning (Sec. 4.5).