Gradual Divergence for Seamless Adaptation: A Novel Domain Incremental Learning Method
Kishaan Jeeveswaran, Elahe Arani, Bahram Zonooz
TL;DR
This work tackles domain incremental learning (DIL) by introducing DARE, a three-stage training protocol (Divergence, Adaptation, Refinement) that gradually molds new-domain representations into the subspace spanned by prior domains, thereby reducing representation drift and catastrophic forgetting. It combines dual classifiers trained with cross-entropy and supervised contrastive objectives and leverages an Intermediary Reservoir Sampling (IRS) buffer to preserve dark knowledge across tasks. Empirical results on DN4IL and iCIFAR-20 show that DARE and its EMA variant consistently outperform strong rehearsal-based baselines, especially in memory-constrained settings, and analyses confirm reduced drift, better task balancing, and improved calibration. The approach offers a practical, memory-efficient path to robust DIL in real-world, shifting-domain environments, though it currently relies on task-id information for IRS and could benefit from automatic task-transition detection in future work.
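The Intermediary Reservoir Sampling (IRS) buffer mentioned above is not specified in detail here, so the sketch below shows only the standard reservoir-sampling buffer (Vitter's Algorithm R) that such a scheme builds on. The class name `ReservoirBuffer`, its `capacity` argument, and the `update`/`sample` interface are illustrative assumptions, not the paper's code.

```python
import random
import torch

class ReservoirBuffer:
    """Plain reservoir-sampling replay buffer (Vitter's Algorithm R).

    Every sample seen so far has an equal chance of residing in the buffer.
    DARE's Intermediary Reservoir Sampling changes when samples are admitted
    during training; those details are not given in this summary, so only the
    vanilla scheme is sketched.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.examples, self.labels = [], []
        self.num_seen = 0

    def __len__(self):
        return len(self.examples)

    def update(self, x_batch, y_batch):
        # Stream one mini-batch of (input, label) pairs into the reservoir.
        for x, y in zip(x_batch, y_batch):
            if len(self.examples) < self.capacity:
                self.examples.append(x.detach().clone())
                self.labels.append(y.detach().clone())
            else:
                # The (num_seen + 1)-th sample replaces a stored one with
                # probability capacity / (num_seen + 1).
                idx = random.randint(0, self.num_seen)
                if idx < self.capacity:
                    self.examples[idx] = x.detach().clone()
                    self.labels[idx] = y.detach().clone()
            self.num_seen += 1

    def sample(self, batch_size):
        # Draw a random replay mini-batch from the stored examples.
        idxs = random.sample(range(len(self.examples)),
                             min(batch_size, len(self.examples)))
        return (torch.stack([self.examples[i] for i in idxs]),
                torch.stack([self.labels[i] for i in idxs]))
```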
Abstract
Domain incremental learning (DIL) poses a significant challenge in real-world scenarios, as models need to be sequentially trained on diverse domains over time while avoiding catastrophic forgetting. Mitigating representation drift, the phenomenon of learned representations changing as the model adapts to new tasks, can help alleviate catastrophic forgetting. In this study, we propose a novel DIL method named DARE, featuring a three-stage training process: Divergence, Adaptation, and REfinement. This process gradually adapts the representations associated with new tasks into the feature space spanned by samples from previous tasks, while simultaneously integrating task-specific decision boundaries. Additionally, we introduce a novel strategy for buffer sampling and demonstrate the effectiveness of our proposed method, combined with this sampling strategy, in reducing representation drift within the feature encoder. This contribution effectively alleviates catastrophic forgetting across multiple DIL benchmarks. Furthermore, our approach prevents sudden representation drift at task boundaries, resulting in a well-calibrated DIL model that maintains performance on previous tasks.
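For concreteness, here is a rough skeleton of how a per-task Divergence, Adaptation, and Refinement schedule with dual classifier heads and a replay buffer could be wired together. It is a minimal sketch under several assumptions: the stage boundaries (`stage_splits`), the role assigned to each head, and the loss composition are illustrative guesses; the supervised contrastive term and the EMA variant are omitted; and `encoder`, `clf_main`, `clf_aux`, and the buffer interface (e.g. the `ReservoirBuffer` above) are hypothetical names. The paper's actual objectives and stage definitions differ.

```python
import torch
import torch.nn.functional as F

def train_task(encoder, clf_main, clf_aux, loader, buffer,
               epochs=50, stage_splits=(0.3, 0.6), lr=0.03):
    """Illustrative Divergence -> Adaptation -> Refinement schedule for one domain."""
    params = (list(encoder.parameters())
              + list(clf_main.parameters())
              + list(clf_aux.parameters()))
    opt = torch.optim.SGD(params, lr=lr)
    for epoch in range(epochs):
        frac = epoch / max(epochs - 1, 1)
        if frac < stage_splits[0]:
            stage = "divergence"
        elif frac < stage_splits[1]:
            stage = "adaptation"
        else:
            stage = "refinement"
        for x, y in loader:
            if stage == "divergence":
                # Let new-domain features form via the auxiliary head only.
                loss = F.cross_entropy(clf_aux(encoder(x)), y)
            elif stage == "adaptation" and len(buffer) > 0:
                # Pull new-domain features toward the subspace kept alive by replay.
                bx, by = buffer.sample(x.size(0))
                loss = (F.cross_entropy(clf_aux(encoder(x)), y)
                        + F.cross_entropy(clf_main(encoder(bx)), by))
            else:
                # Refinement (or first task): train the main head on new + buffered data.
                xs, ys = x, y
                if len(buffer) > 0:
                    bx, by = buffer.sample(x.size(0))
                    xs, ys = torch.cat([x, bx]), torch.cat([y, by])
                loss = F.cross_entropy(clf_main(encoder(xs)), ys)
            opt.zero_grad()
            loss.backward()
            opt.step()
            buffer.update(x, y)  # reservoir update over the incoming stream
```

A run over a sequence of domains would then call `train_task` once per domain, reusing the same encoder and buffer throughout so replay carries information across task boundaries.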
