Overcoming Domain Drift in Online Continual Learning
Fan Lyu, Daofeng Liu, Linglan Zhao, Zhang Zhang, Fanhua Shang, Fuyuan Hu, Wei Feng, Liang Wang
TL;DR
This work tackles online continual learning (OCL) where models must learn from a stream of tasks without retraining on past data, a setting prone to catastrophic forgetting due to continual domain drift. It introduces Drift-Reducing Rehearsal (DRR), a rehearsal-based approach that anchors old-task domains using Centroid-based Online Selection (COS) and a Cross-Task Contrastive Margin Loss (CML), with an optional Centroid Distillation Loss (CDL) to further stabilize the feature space. DRR integrates a two-level angular margin framework ($m^\text{c}$, $m^\text{t}$) to tighten intra-class/task clusters while expanding inter-class/task separations, thereby reducing negative transfer between tasks. Empirical results on four standard OCL benchmarks show that DRR achieves state-of-the-art performance, effectively mitigating continual domain drift and preserving knowledge across tasks while maintaining competitive training efficiency. This approach offers a scalable and data-efficient solution for online continual learning in dynamic environments.
Abstract
Online Continual Learning (OCL) enables machine learning models to acquire new knowledge online across a sequence of tasks. However, OCL faces a significant challenge: catastrophic forgetting, wherein the knowledge learned on previous tasks is substantially overwritten upon encountering new tasks, leading to biased forgetting of prior knowledge. Moreover, the continual domain drift across sequentially learned tasks can gradually displace the decision boundaries in the learned feature space, rendering the learned knowledge susceptible to forgetting. To address this problem, we propose a novel rehearsal strategy, termed Drift-Reducing Rehearsal (DRR), to anchor the domain of old tasks and reduce negative transfer effects. First, we select more representative samples for the memory, guided by centroids constructed from the data stream. Then, to keep the model from domain chaos under drift, we propose a two-level angular cross-task Contrastive Margin Loss (CML) that encourages intra-class and intra-task compactness and increases inter-class and inter-task discrepancy. Finally, to further suppress the continual domain drift, we present an optional Centroid Distillation Loss (CDL) on the rehearsal memory to anchor the knowledge in feature space for each previous task. Extensive experimental results on four benchmark datasets validate that the proposed DRR effectively mitigates continual domain drift and achieves state-of-the-art (SOTA) performance in OCL.
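To make the idea of centroid-guided memory selection concrete, the following is a minimal sketch under stated assumptions, not the authors' COS algorithm. It assumes one running-mean centroid per class, Euclidean distance in feature space, and a fixed per-class budget; the class name `CentroidMemory` and its methods are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


class CentroidMemory:
    """Hypothetical per-class buffer that keeps samples nearest a running centroid."""

    def __init__(self, per_class_budget):
        self.budget = per_class_budget
        self.centroids = {}              # label -> running mean feature vector
        self.counts = defaultdict(int)   # label -> number of samples seen so far
        self.buffer = defaultdict(list)  # label -> stored feature vectors

    def _update_centroid(self, label, feat):
        # Incremental mean: c_n = c_{n-1} + (x_n - c_{n-1}) / n
        self.counts[label] += 1
        n = self.counts[label]
        if label not in self.centroids:
            self.centroids[label] = list(feat)
        else:
            c = self.centroids[label]
            self.centroids[label] = [ci + (fi - ci) / n for ci, fi in zip(c, feat)]

    def observe(self, feat, label):
        """Process one streamed sample (assumed already mapped to a feature vector)."""
        self._update_centroid(label, feat)
        slot = self.buffer[label]
        slot.append(list(feat))
        if len(slot) > self.budget:
            c = self.centroids[label]
            # Over budget: evict the stored sample farthest from the current centroid.
            farthest = max(range(len(slot)), key=lambda i: math.dist(slot[i], c))
            slot.pop(farthest)


memory = CentroidMemory(per_class_budget=2)
for feat in ([0.0, 0.0], [1.0, 0.0], [10.0, 0.0]):
    memory.observe(feat, label=0)
print(memory.buffer[0])
```

In this toy stream the outlying third sample is the one evicted, so the buffer retains the two samples closest to the class centroid. A real OCL pipeline would compute features with the current network, which itself drifts during training; how the paper handles that is not specified in the abstract.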
