On the Occurrence of Critical Learning Periods in Neural Networks
Stanisław Pawlak
TL;DR
The paper investigates critical learning periods in neural network training and their relation to warm-starting. Replicating and extending the core experiments of Achille et al. with a ResNet-18 trained on CIFAR-10 with CIFAR-C corruptions, it shows that deficits early in training reduce plasticity, but that a cyclic learning-rate schedule can restore adaptability and erase most of the resulting performance gap. It reinterprets warm-starting as a form of deficit pretraining, examines how corruption severity affects the critical learning period, and shows that targeted deficits induce class-dependent forgetting. The study connects plasticity phenomena with practical training dynamics and offers guidance for avoiding loss of plasticity during continual learning.
Abstract
This study examines the plasticity of neural networks and offers empirical support for the claim that critical learning periods and the performance loss from warm-starting can be avoided through simple adjustments to learning hyperparameters. The critical learning phenomenon emerges when training begins on deficit data: after enough deficit epochs, the network's plasticity wanes, and it can no longer match the accuracy of a model trained from scratch, even when extensive training on clean data follows. Building on the seminal research that introduced critical learning periods, we replicate its key findings and broaden the scope of its main experiment. We also consider warm-starting and show that it can be seen as a form of deficit pretraining. We demonstrate that both problems can be averted by employing a cyclic learning rate schedule. Our findings bear on neural network training practice and establish a link between critical learning periods and ongoing research on warm-starting neural network training.
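To make the two ingredients of the abstract concrete, the following is a minimal sketch of a deficit-then-clean training protocol paired with a cyclic learning rate. It is not the paper's implementation: the triangular shape of the cycle, the function names `uses_deficit_data` and `triangular_cyclic_lr`, and all default values (`base_lr`, `max_lr`, `half_cycle`) are assumptions chosen only for illustration, since the text above does not state which cyclic schedule or hyperparameters were used.

```python
def uses_deficit_data(epoch, deficit_epochs):
    """Return True while training should still see deficit (e.g. corrupted) data.

    Hypothetical helper: the first `deficit_epochs` epochs use deficit data,
    and every later epoch uses clean data.
    """
    return epoch < deficit_epochs


def triangular_cyclic_lr(step, base_lr=0.001, max_lr=0.1, half_cycle=2000):
    """Triangular cyclic learning rate (assumed form, not taken from the paper).

    The rate rises linearly from base_lr to max_lr over `half_cycle` steps,
    falls back to base_lr over the next `half_cycle` steps, then repeats.
    """
    if half_cycle <= 0:
        raise ValueError("half_cycle must be positive")
    # Position inside the current cycle, in [0, 2 * half_cycle).
    cycle_pos = step % (2 * half_cycle)
    # 0.0 at the start or end of a cycle, 1.0 at its peak.
    frac = 1.0 - abs(cycle_pos / half_cycle - 1.0)
    return base_lr + (max_lr - base_lr) * frac


if __name__ == "__main__":
    # Illustrative values only: 3 deficit epochs, then clean data.
    for epoch in range(6):
        phase = "deficit" if uses_deficit_data(epoch, deficit_epochs=3) else "clean"
        lr = triangular_cyclic_lr(step=epoch, base_lr=0.0, max_lr=1.0, half_cycle=2)
        print(f"epoch {epoch}: data={phase}, lr={lr:.2f}")
```

The point of the sketch is that the learning rate periodically returns to a high value after the switch to clean data, rather than decaying monotonically from the start of the deficit phase. In a PyTorch training loop, `torch.optim.lr_scheduler.CyclicLR` provides a comparable built-in schedule.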
