Multiple Descents in Deep Learning as a Sequence of Order-Chaos Transitions
Wenbo Wei, Nicholas Chong Jia Le, Choy Heng Lai, Ling Feng
TL;DR
This study investigates training dynamics in deep learning by applying asymptotic stability analysis to LSTMs trained on IMDb sentiment analysis. It reveals multiple descents (cycles in which the test loss rises and then drops sharply) that align with order-chaos transitions, with the globally optimal epoch at the first transition from order to chaos, where the edge of chaos is widest. The approach links dynamical systems concepts to training behavior and suggests epoch-level strategies that exploit chaotic regimes for improved generalization, potentially extending beyond LSTMs.
Abstract
We observe a novel 'multiple-descent' phenomenon during the training of LSTMs, in which the test loss goes through long cycles of rising and falling multiple times after the model is overtrained. By carrying out asymptotic stability analysis of the models, we find that the cycles in test loss are closely associated with phase transitions between order and chaos, and that the locally optimal epochs consistently occur at the critical transition point between the two phases. More importantly, the globally optimal epoch occurs at the first transition from order to chaos, where the 'width' of the 'edge of chaos' is the widest, allowing the broadest exploration of better weight configurations for learning.
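To make the idea of asymptotic stability analysis concrete, the following is a minimal sketch of the general technique, not the procedure used in this work. It assumes a toy recurrent map h <- tanh(W h) with random Gaussian weights as a stand-in for a trained LSTM, and all names (`make_weights`, `step`, `asymptotic_distance`, `gain`) are hypothetical. Two hidden-state trajectories start a small distance apart; if that distance decays to zero the map is in the ordered phase, and if it persists the map is in the chaotic phase.

```python
import math
import random


def make_weights(n, gain, seed=0):
    """Random recurrent matrix with entries drawn from N(0, gain^2 / n).

    Assumption: a random matrix stands in for trained recurrent weights.
    """
    rng = random.Random(seed)
    scale = gain / math.sqrt(n)
    return [[rng.gauss(0.0, scale) for _ in range(n)] for _ in range(n)]


def step(weights, h):
    """One update of the toy recurrent map h <- tanh(W h)."""
    return [math.tanh(sum(w * x for w, x in zip(row, h))) for row in weights]


def asymptotic_distance(gain, n=64, steps=150, eps=1e-3, seed=0):
    """Distance between two trajectories that start eps apart per coordinate.

    A value near zero suggests order; a persistent gap suggests chaos.
    """
    weights = make_weights(n, gain, seed)
    rng = random.Random(seed + 1)
    h_a = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    h_b = [x + eps for x in h_a]  # slightly perturbed copy of the initial state
    for _ in range(steps):
        h_a = step(weights, h_a)
        h_b = step(weights, h_b)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h_a, h_b)))


if __name__ == "__main__":
    # Small gain: the map is contracting, so the perturbation should die out.
    print("gain 0.3:", asymptotic_distance(0.3))
    # Large gain: the perturbation is expected to grow and then saturate.
    print("gain 3.0:", asymptotic_distance(3.0))
```

In this toy setting the `gain` parameter plays the role that the trained weights at a given epoch would play in the analysis of a real model: repeating the measurement across epochs would, under these assumptions, indicate where the model crosses between the two phases.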
