The Expanding Scope of the Stability Gap: Unveiling its Presence in Joint Incremental Learning of Homogeneous Tasks
Sandesh Kamath, Albin Soutif-Cormerais, Joost van de Weijer, Bogdan Raducanu
TL;DR
The paper demonstrates that the stability gap, a transient drop in performance on previously learned tasks at the start of a new task, also occurs during joint incremental learning on homogeneous task distributions. It shows there exists a low-loss linear path between task minima, defined by $\theta_{\lambda} = \lambda \theta_1 + (1-\lambda) \theta_2$, but SGD does not follow this path and initially traverses higher-loss regions; mini-batch analysis reveals per-batch improvements do not translate into better test performance. The findings hold across architectures and data splits, and removing rehearsal further amplifies the gap, highlighting that optimization dynamics, not just data or task heterogeneity, drive the phenomenon. This points to focusing on optimization strategies and path-aware training approaches to mitigate the stability gap in practical continual learning systems.
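To make the linear-path claim concrete, the following is a minimal sketch of how the loss along the path $\theta_{\lambda} = \lambda \theta_1 + (1-\lambda) \theta_2$ could be probed. It is not the authors' code: the helper names (`interpolate`, `loss_along_path`), the quadratic `toy_loss`, and the two parameter vectors are hypothetical stand-ins, and a real experiment would evaluate a network's test loss at each interpolated weight vector.

```python
from typing import Callable, List, Sequence, Tuple


def interpolate(theta_1: Sequence[float], theta_2: Sequence[float], lam: float) -> List[float]:
    """Return lam * theta_1 + (1 - lam) * theta_2, element-wise."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(theta_1, theta_2)]


def loss_along_path(
    loss_fn: Callable[[Sequence[float]], float],
    theta_1: Sequence[float],
    theta_2: Sequence[float],
    num_points: int = 11,
) -> List[Tuple[float, float]]:
    """Evaluate loss_fn at evenly spaced lam values in [0, 1]."""
    if num_points < 2:
        raise ValueError("num_points must be at least 2")
    lams = [i / (num_points - 1) for i in range(num_points)]
    return [(lam, loss_fn(interpolate(theta_1, theta_2, lam))) for lam in lams]


def toy_loss(theta: Sequence[float]) -> float:
    """Convex quadratic used purely for illustration (an assumption)."""
    return sum((t - 1.0) ** 2 for t in theta)


# Hypothetical parameter vectors standing in for two task minima.
theta_1 = [0.0, 2.0]
theta_2 = [1.0, 1.0]

path = loss_along_path(toy_loss, theta_1, theta_2, num_points=5)

# Loss barrier: how far the path rises above the worse of its two endpoints.
# A value near zero indicates a low-loss linear path.
barrier = max(loss for _, loss in path) - max(path[0][1], path[-1][1])
```

Because the toy loss is convex, the barrier here is zero by construction. The point of the sketch is only the measurement procedure: comparing such a path profile with the losses actually visited by SGD is one way to see whether the optimizer leaves the low-loss line.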
Abstract
Recent research identified a temporary performance drop on previously learned tasks when transitioning to a new one. This drop, called the stability gap, has significant consequences for continual learning: it complicates the direct deployment of continual learning because worst-case performance at task boundaries degrades dramatically, it limits the potential of continual learning as an energy-efficient training paradigm, and it could reduce the final performance of the algorithm. In this paper, we show that the stability gap also occurs under joint incremental training of homogeneous tasks. In this scenario, the learner continues training on the same data distribution and has access to all data from previous tasks. In addition, we show that in this scenario there exists a low-loss linear path to the next minimum, but that SGD optimization does not follow this path. We perform further analysis, including a finer batch-wise analysis, which could provide insight into potential solution directions.
