Dual Consolidation for Pre-Trained Model-Based Domain-Incremental Learning
Da-Wei Zhou, Zi-Wen Cai, Han-Jia Ye, Lijun Zhang, De-Chuan Zhan
TL;DR
Domain-Incremental Learning (DIL) with pre-trained models (PTMs) suffers from forgetting in both the feature representation and the classifier as domains shift. Duct tackles this with two coordinated strategies: representation consolidation, which builds a unified embedding space by merging the task vectors of historical backbones, weighted by task similarity; and classifier consolidation, which realigns old classifiers to the consolidated space via optimal transport guided by class-wise semantic costs. The method is exemplar-free and maintains only two backbones; it achieves state-of-the-art results on four benchmarks and is robust across task orders and backbones. This dual consolidation enables stable, scalable continual adaptation of PTMs in dynamic environments. The key equations are φ^m_i = φ_0 + α_φ ∑_{k=1}^{i} Sim_{0,k} δφ_k for the merged backbone, where δφ_k is the task vector of stage k and Sim_{0,k} its task-similarity weight, and W_o^m = (1 − α_W) W_o + α_W W_n T for the consolidated old classifier, where the transport plan T solves an OT problem with costs Q_{i,j} = || c^0_i − c^0_j ||^2.
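The representation-consolidation formula φ^m_i = φ_0 + α_φ ∑_{k=1}^{i} Sim_{0,k} δφ_k can be illustrated with a minimal NumPy sketch. It rests on assumptions not stated above: backbone weights are flattened into 1-D arrays, the task vector is taken as δφ_k = φ_k − φ_0 (the usual task-vector definition), and the similarities Sim_{0,k} are supplied by the caller because their computation is not described here. The function name merge_task_vectors is hypothetical and is not taken from the released code.

```python
import numpy as np

def merge_task_vectors(phi_0, stage_weights, sims, alpha_phi):
    """Sketch of phi^m_i = phi_0 + alpha_phi * sum_k Sim_{0,k} * delta_phi_k.

    phi_0         : flattened weights of the initial backbone (1-D array)
    stage_weights : list of flattened weights phi_k, one per stage k = 1..i
    sims          : list of Sim_{0,k} values (assumed to be given)
    alpha_phi     : scalar merging coefficient
    """
    merged = np.asarray(phi_0, dtype=float).copy()
    for phi_k, sim in zip(stage_weights, sims):
        # Assumption: the task vector is the difference to the initial backbone.
        delta_k = np.asarray(phi_k, dtype=float) - phi_0
        merged += alpha_phi * sim * delta_k
    return merged

# Toy usage with three parameters and two stages.
phi_0 = np.array([1.0, 1.0, 1.0])
stages = [np.array([2.0, 2.0, 2.0]), np.array([3.0, 1.0, 1.0])]
merged = merge_task_vectors(phi_0, stages, sims=[1.0, 0.5], alpha_phi=0.5)
```

Setting alpha_phi to 0 returns the initial backbone unchanged, which makes the role of α_φ as the strength of the merge easy to check.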
Abstract
Domain-Incremental Learning (DIL) involves the progressive adaptation of a model to new concepts across different domains. While recent advances in pre-trained models provide a solid foundation for DIL, learning new concepts often results in the catastrophic forgetting of pre-trained knowledge. Specifically, sequential model updates can overwrite both the representation and the classifier with knowledge from the latest domain. Thus, it is crucial to develop a representation and corresponding classifier that accommodate all seen domains throughout the learning process. To this end, we propose DUal ConsolidaTion (Duct) to unify and consolidate historical knowledge at both the representation and classifier levels. By merging the backbones of different stages, we create a representation space suitable for multiple domains incrementally. The merged representation serves as a balanced intermediary that captures task-specific features from all seen domains. Additionally, to address the mismatch between consolidated embeddings and the classifier, we introduce an extra classifier consolidation process. Leveraging class-wise semantic information, we estimate the classifier weights of old domains within the latest embedding space. By merging historical and estimated classifiers, we align them with the consolidated embedding space, facilitating incremental classification. Extensive experimental results on four benchmark datasets demonstrate Duct's state-of-the-art performance. Code is available at https://github.com/Estrella-fugaz/CVPR25-Duct
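The classifier consolidation step, W_o^m = (1 − α_W) W_o + α_W W_n T with costs Q_{i,j} = || c^0_i − c^0_j ||^2, can be sketched as follows. Several choices here are assumptions made for illustration only: classifiers are stored as (feature_dim × num_classes) matrices, i indexes new-domain classes and j old-domain classes, the OT problem is solved with uniform marginals by entropic Sinkhorn iterations, the cost matrix is rescaled to [0, 1] for numerical stability, and the columns of T are normalized so that each estimated old-class weight is a convex combination of new-class weights. The names sinkhorn_plan and consolidate_classifier are hypothetical and do not come from the released code.

```python
import numpy as np

def sinkhorn_plan(Q, reg=0.1, n_iter=200):
    """Entropic OT plan between uniform marginals (a stand-in for the OT solver)."""
    n, m = Q.shape
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    # Rescale costs to [0, 1] so exp() does not underflow (illustrative choice).
    K = np.exp(-(Q / max(Q.max(), 1e-12)) / reg)
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iter):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

def consolidate_classifier(W_old, W_new, centers_new, centers_old,
                           alpha_w=0.5, reg=0.1):
    """Sketch of W_o^m = (1 - alpha_W) * W_o + alpha_W * W_n @ T."""
    # Q[i, j] = squared distance between class centers c^0_i and c^0_j.
    diff = centers_new[:, None, :] - centers_old[None, :, :]
    Q = (diff ** 2).sum(axis=-1)
    T = sinkhorn_plan(Q, reg)
    # Assumption: normalize columns so each old class gets weights summing to 1.
    T = T / T.sum(axis=0, keepdims=True)
    return (1.0 - alpha_w) * W_old + alpha_w * (W_new @ T)

# Toy usage: two new classes, two old classes, 3-dimensional features.
centers_new = np.array([[0.0, 0.0], [10.0, 10.0]])
centers_old = np.array([[0.0, 0.1], [10.0, 9.9]])
W_new = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
W_old = np.zeros((3, 2))
W_merged = consolidate_classifier(W_old, W_new, centers_new, centers_old)
```

In this toy case each old class lies next to exactly one new class, so the plan is nearly diagonal and the estimated old-class weights are close to the matching new-class weights.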
