Semi-parametric Memory Consolidation: Towards Brain-like Deep Continual Learning

Geng Liu, Fei Zhu, Rong Feng, Zhiqiang Yi, Shiqi Wang, Gaofeng Meng, Zhaoxiang Zhang

TL;DR

This study demonstrates that emulating biological intelligence is a promising path to continual learning in deep neural networks: the proposed method retains high performance on novel tasks while maintaining prior knowledge in challenging real-world continual learning scenarios.

Abstract

Humans and most animals inherently possess a distinctive capacity to continually acquire novel experiences and accumulate worldly knowledge over time. This ability, termed continual learning, is also critical for deep neural networks (DNNs) to adapt to the dynamically evolving world in open environments. However, DNNs notoriously suffer from catastrophic forgetting of previously learned knowledge when trained on sequential tasks. In this work, inspired by the interactive human memory and learning system, we propose a novel biomimetic continual learning framework that integrates semi-parametric memory and the wake-sleep consolidation mechanism. For the first time, our method enables deep neural networks to retain high performance on novel tasks while maintaining prior knowledge in real-world challenging continual learning scenarios, e.g., class-incremental learning on ImageNet. This study demonstrates that emulating biological intelligence provides a promising path to enable deep neural networks with continual learning capabilities.

Paper Structure

This paper contains 19 sections, 6 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Brain-inspired continual learning system. a, An example of the continual learning scenario, where an autonomous car initially trained on urban data needs to adapt to a new environment with novel objects. b, Edge devices, robotic systems, and medical equipment could encounter dynamic tasks and data streams in operational environments. Due to constraints in storage and computational capacity, they cannot save abundant raw samples or frequently retrain the model to adapt to emerging tasks and data streams. BrainCL addresses this problem by enabling effective continual learning under strict storage constraints, and demonstrates near-optimal performance on diverse datasets spanning real-life, robotic vision, and medical scenarios. c, In the brain, when learning new tasks, context-separated memory is formed through pattern separation and completion in the hippocampus, i.e., the dentate gyrus (DG) is responsible for encoding new knowledge into compact episodic representations, which CA3 can use to construct complete memories. During sleep, episodic memories are replayed and consolidated into the cortex without any external input. d, Inspired by the above mechanism in the brain, we propose to incorporate context-separated memory and sleep replay into a continual learner to achieve the goal of learning without forgetting.
  • Figure 2: Overview of our approach. a, The memory module first processes raw samples through pattern separation to generate corresponding memory cues with low information entropy. Samples can be recalled from the memory cues through pattern completion. This operational mechanism mirrors the complementary functions of the dentate gyrus (DG) and CA3 regions in the hippocampus. b, BrainCL contains two distinct learning phases, namely the wake phase and the sleep phase. During the wake phase, samples of the new task are memorized into semi-parametric working memory, composed of non-parametric memory cues and a parametric pattern completion network. Only the classifier is trained to acquire classification abilities for the new task. During the sleep phase, working memory is selectively transferred into long-term memory, based on which the entire model is fine-tuned without external inputs to perform structural knowledge consolidation.
  • Figure 3: BrainCL enhances the performance of continual learning on large-scale and realistic datasets. Test accuracy for incrementally learned classes on a the natural image dataset ImageNet-100, b the robot vision dataset CoRe50, and c the medical dataset MedMNIST, presented as means over three random seeds with shaded areas indicating $\pm$ SEM. Average and last accuracy of class-incremental learning on d ImageNet-100, e CoRe50, and f MedMNIST. The distribution of test accuracy scores for all tasks on g ImageNet-100, h CoRe50, and i MedMNIST, with width representing probability density and overlaid scatter points indicating individual data points.
  • Figure 4: BrainCL enhances the robustness and reliability of continual learning in more practical scenarios. a Examples from the corrupted dataset, which consists of 15 types of corruptions from the noise, blur, weather, and digital categories. Test accuracy for incrementally learned classes on the corrupted ImageNet-C dataset with b Gaussian noise and c frost corruption. d-e Test accuracy on 15 types of corruptions at two different levels of severity. f Average and last accuracy of Class-IL on ImageNet-C. g Distribution of long-tailed classes and unknown classes that could be encountered in the open world. h Test accuracy for incrementally learned classes on ImageNet-LT. i Average and last accuracy of Class-IL on ImageNet-LT. Confidence distributions of known and unknown classes after Class-IL with the j Replay and k BrainCL approaches. l Comparison of reliability metrics, e.g., AUROC, among different methods.
  • Figure 5: Understanding BrainCL with further analysis. a Memory comparison: BrainCL achieves strong performance with limited memory cost, while DER incurs a large memory cost due to continual backbone expansion. b Class-wise fairness across all learned classes at the end of CL. c Continual learning can cause disparate impact across different classes in the same task (e.g., task 1), and BrainCL leads to less disparate impact. d EWC and replay lead to feature space collapse, while BrainCL maintains the feature space as well as the oracle, i.e., joint training. Comparison of attention regions during continual learning with e EWC and f BrainCL. Confusion matrix across the 100 classes in the ImageNet-100 dataset with g EWC, h Replay, and i BrainCL. t-SNE visualization of the continual models trained with j EWC, k Replay, and l BrainCL.
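
The wake and sleep phases described in the Figure 2 caption can be illustrated with a toy sketch. This is not the authors' implementation: the page gives no algorithmic details, so every concrete choice below is an assumption made only for illustration. Pattern separation is stood in for by coarse quantization (`separate`), pattern completion by a small learned codebook (`CompletionNet`), the "backbone" by a feature-centering vector, the classifier by nearest class prototypes, and selective transfer by a hypothetical per-class budget. All names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

LEVELS = 4  # assumed number of quantization levels per memory-cue entry


def separate(sample):
    """Pattern-separation stand-in: quantize values in [0, 1] to coarse levels."""
    return tuple(min(int(v * LEVELS), LEVELS - 1) for v in sample)


class CompletionNet:
    """Pattern-completion stand-in: a learned codebook mapping level -> value."""

    def __init__(self):
        # Start from bin centers; fit() replaces entries with observed means.
        self.codebook = [(i + 0.5) / LEVELS for i in range(LEVELS)]

    def fit(self, samples):
        buckets = defaultdict(list)
        for s in samples:
            for value, level in zip(s, separate(s)):
                buckets[level].append(value)
        for level, values in buckets.items():
            self.codebook[level] = mean(values)

    def recall(self, cue):
        return [self.codebook[level] for level in cue]


class WakeSleepLearner:
    def __init__(self, dim, budget_per_class=2):
        self.center = [0.0] * dim   # toy "backbone" parameters (feature centering)
        self.prototypes = {}        # toy "classifier" parameters: label -> prototype
        self.working = []           # working memory: (cue, label, completion net)
        self.long_term = []         # long-term memory, same record layout
        self.budget = budget_per_class  # assumed selection rule for transfer

    def features(self, sample):
        return [v - c for v, c in zip(sample, self.center)]

    def _fit_prototypes(self, labelled):
        by_label = defaultdict(list)
        for sample, label in labelled:
            by_label[label].append(self.features(sample))
        for label, feats in by_label.items():
            self.prototypes[label] = [mean(col) for col in zip(*feats)]

    def wake(self, samples, labels):
        # Memorize the new task as non-parametric cues plus a parametric decoder.
        net = CompletionNet()
        net.fit(samples)
        self.working = [(separate(s), y, net) for s, y in zip(samples, labels)]
        # Only the classifier is updated; the backbone (center) stays frozen.
        self._fit_prototypes(zip(samples, labels))

    def sleep(self):
        # Selectively transfer working memory into long-term memory.
        kept = defaultdict(int)
        for cue, label, net in self.working:
            if kept[label] < self.budget:
                self.long_term.append((cue, label, net))
                kept[label] += 1
        self.working = []
        # Replay: recall samples from cues only, with no external input.
        recalled = [(net.recall(cue), y) for cue, y, net in self.long_term]
        # Update the whole toy model (backbone and classifier) on recalled data.
        self.center = [mean(col) for col in zip(*(s for s, _ in recalled))]
        self._fit_prototypes(recalled)

    def predict(self, sample):
        f = self.features(sample)
        return min(
            self.prototypes,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(f, self.prototypes[y])),
        )


learner = WakeSleepLearner(dim=2)
# Task 1 introduces classes 0 and 1; task 2 introduces classes 2 and 3.
learner.wake([[0.1, 0.1], [0.15, 0.05], [0.9, 0.1], [0.85, 0.15]], [0, 0, 1, 1])
learner.sleep()
learner.wake([[0.1, 0.9], [0.15, 0.85], [0.9, 0.9], [0.85, 0.95]], [2, 2, 3, 3])
learner.sleep()
print([learner.predict(x) for x in ([0.12, 0.12], [0.88, 0.88])])
```

In this sketch, only the compact cues and the small codebooks persist between tasks, and the sleep step never touches raw samples. That mirrors the storage-constrained setting the captions describe, although the real system presumably uses learned deep networks for each component.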