Wake-Sleep Consolidated Learning

Amelia Sorrenti; Giovanni Bellitto; Federica Proietto Salanitri; Matteo Pennisi; Simone Palazzo; Concetto Spampinato

TL;DR

Wake-Sleep Consolidated Learning (WSCL) addresses continual learning in visual classification by integrating Complementary Learning Systems with wake-sleep brain dynamics. It uses a wake phase for fast adaptation and episodic memory collection, followed by a sleep phase consisting of NREM-based consolidation and REM-based dreaming with a separate dreaming dataset. Empirical results on CIFAR-10, Tiny-ImageNet, and FG-ImageNet show WSCL delivers strong improvements in final accuracy and enables positive forward transfer, outperforming baselines and prior CLS-inspired methods. The findings highlight the practical value of mimicking offline brain states for robust continual learning and suggest avenues for more realistic memory and dream modeling.

Abstract

We propose Wake-Sleep Consolidated Learning (WSCL), a learning strategy leveraging Complementary Learning System theory and the wake-sleep phases of the human brain to improve the performance of deep neural networks for visual classification tasks in continual learning settings. Our method learns continually via the synchronization between distinct wake and sleep phases. During the wake phase, the model is exposed to sensory input and adapts its representations, ensuring stability through a dynamic parameter freezing mechanism and storing episodic memories in a short-term temporary memory (similar to what happens in the hippocampus). During the sleep phase, the training process is split into NREM and REM stages. In the NREM stage, the model's synaptic weights are consolidated using replayed samples from the short-term and long-term memory, and the synaptic plasticity mechanism is activated, strengthening important connections and weakening unimportant ones. In the REM stage, the model is exposed to previously unseen realistic visual sensory experience, and the dreaming process is activated, which enables the model to explore the potential feature space, thus preparing synapses for future knowledge. We evaluate the effectiveness of our approach on three benchmark datasets: CIFAR-10, Tiny-ImageNet and FG-ImageNet. In all cases, our method outperforms the baselines and prior work, yielding a significant performance gain on continual visual classification tasks. Furthermore, we demonstrate the usefulness of all processing stages and the importance of dreaming to enable positive forward transfer.
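The phase schedule described in the abstract can be sketched schematically as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class `WakeSleepSchedule`, the `train_step` callback, the batch size, and the use of reservoir sampling to update the long-term memory are all hypothetical choices made for the example. The dynamic parameter freezing and synaptic plasticity mechanisms are not modeled; the sketch only shows which data each phase draws on.

```python
import random
from dataclasses import dataclass, field
from typing import Any, Callable, List, Sequence


@dataclass
class WakeSleepSchedule:
    """Hypothetical skeleton of a wake/sleep training schedule."""
    # Assumed callback: performs one training step for the given phase.
    train_step: Callable[[str, List[Any]], None]
    # Separate dreaming dataset used only in the REM stage.
    dream_data: Sequence[Any]
    long_term_capacity: int = 20   # assumed buffer size
    batch_size: int = 4            # assumed replay batch size
    rng: random.Random = field(default_factory=lambda: random.Random(0))
    short_term: List[Any] = field(default_factory=list)
    long_term: List[Any] = field(default_factory=list)
    seen: int = 0                  # samples offered to long-term memory

    def wake(self, task_stream):
        # Wake: adapt to new samples and store them as episodic memories.
        for batch in task_stream:
            self.train_step("wake", list(batch))
            self.short_term.extend(batch)

    def _nrem(self, consolidate):
        # NREM: replay recent (short-term) and past (long-term) samples.
        pool = self.short_term + self.long_term
        if pool:
            k = min(self.batch_size, len(pool))
            self.train_step("nrem", self.rng.sample(pool, k))
        if consolidate:
            self._update_long_term()

    def _rem(self):
        # REM: train on samples from the dreaming dataset.
        dreams = list(self.dream_data)
        if dreams:
            k = min(self.batch_size, len(dreams))
            self.train_step("rem", self.rng.sample(dreams, k))

    def _update_long_term(self):
        # Assumption: reservoir sampling keeps a bounded long-term memory.
        for sample in self.short_term:
            self.seen += 1
            if len(self.long_term) < self.long_term_capacity:
                self.long_term.append(sample)
            else:
                j = self.rng.randrange(self.seen)
                if j < self.long_term_capacity:
                    self.long_term[j] = sample
        self.short_term.clear()

    def sleep(self, cycles):
        # Sleep: alternate NREM and REM; memory is moved on the last NREM.
        for i in range(cycles):
            self._nrem(consolidate=(i == cycles - 1))
            self._rem()
```

In use, one would call `wake(...)` on the batches of a new task and then `sleep(cycles=...)` before the next task arrives, passing a `train_step` that wraps the actual model update for each phase.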

Paper Structure (12 sections, 13 equations, 5 figures, 4 tables)

Introduction
Related Work
Method
Wake phase
Sleep phase
Experimental Evaluation
Benchmarks
Training procedure
Results
Model Analysis
Conclusion
Acknowledgements

Figures (5)

Figure 1: Wake-Sleep Consolidated Learning: in the wake stage, the model (which emulates the neocortex) quickly adapts to the new sensory experience, storing episodic memories (as in the hippocampus) in the short-term memory to be replayed during sleep. The sleep phase comprises two alternating processes: 1) the NREM stage, where the DNN model consolidates its synapses based on the replayed (recent and past) samples and the long-term memory is updated; 2) the REM stage, where the DNN is trained with dreamed samples to prepare the model for future sensory inputs.
Figure 2: Impact of dreaming quality, in terms of noise (left) and image resolution (right). Results refer to ER-ACE and DER++ with WSCL (solid lines) and without it (dotted lines).
Figure 3: Impact of dreaming dataset dimension. Results refer to ER-ACE and DER++ with WSCL (solid lines) and without it (dotted lines).
Figure 4: WSCL model efficiency. Left: the most frequent automatically learned freezing scheme (values within bars are number of parameters) during the wake phase for ER-ACE on Tiny-ImageNet1/2. Right: number of parameter updates for the whole training of ER-ACE with and without WSCL on Tiny-ImageNet1/2 (from 10 epochs to 100 training epochs).
Figure 5: WSCL model efficiency. Left: the most frequent automatically learned freezing scheme (values within bars are number of parameters) during the wake phase for ER-ACE on FG-ImageNet. Right: number of parameter updates for the whole training of ER-ACE with and without WSCL on FG-ImageNet (from 10 epochs to 100 training epochs). The numbers above the green bars represent the improvement in percent points with respect to the baseline alone.
