Towards Cross-Domain Continual Learning

Marcus de Carvalho; Mahardhika Pratama; Jie Zhang; Chua Haoyan; Edward Yapp

TL;DR

CDCL tackles unsupervised cross-domain continual learning: it transfers knowledge across related unlabeled domains while preserving previously learned tasks. It introduces an inter- and intra-task cross-attention mechanism and an intra-task center-aware pseudo-labeling strategy, complemented by a rehearsal memory, to align features and mitigate forgetting over a sequence of tasks. Theoretical bounds link the target error to domain-discrepancy and memory-replay terms. The approach is validated on five UDA benchmarks (VisDA-2017, Office-Home, Office-31, DomainNet, and MNIST↔USPS), where CDCL outperforms state-of-the-art baselines in task-incremental settings, supporting the practical value of cross-domain attention for lifelong adaptation. The code is publicly available, and the authors point to fully class-incremental cross-domain continual learning as a future extension.
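The summary names a center-aware pseudo-labeling strategy without spelling out its steps. As a rough illustration only, the sketch below assumes one common center-based scheme: class centers are computed as prediction-weighted means of the unlabeled target features, and each sample is relabeled with its nearest center under cosine distance. The function names, the weighting, and the distance choice are assumptions, not details taken from the paper.

```python
import math
from typing import List


def cosine_distance(a: List[float], b: List[float]) -> float:
    """Return 1 - cosine similarity, guarding against zero-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a)) or 1e-12
    norm_b = math.sqrt(sum(y * y for y in b)) or 1e-12
    return 1.0 - dot / (norm_a * norm_b)


def center_aware_pseudo_labels(
    features: List[List[float]], probs: List[List[float]]
) -> List[int]:
    """Assign each sample the label of its nearest class center.

    Assumption: centers are means of the features weighted by the
    classifier's soft predictions (hypothetical scheme for illustration).
    """
    n_classes = len(probs[0])
    dim = len(features[0])
    centers = []
    for k in range(n_classes):
        weight = max(sum(p[k] for p in probs), 1e-12)
        centers.append(
            [sum(p[k] * f[d] for f, p in zip(features, probs)) / weight
             for d in range(dim)]
        )
    # Relabel every sample with the index of the closest center.
    return [
        min(range(n_classes), key=lambda k: cosine_distance(f, centers[k]))
        for f in features
    ]
```

In this sketch a sample whose classifier prediction is borderline can still receive the label of the cluster it actually sits in, which is the usual motivation for center-based relabeling.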

Abstract

Continual learning trains learning agents to sequentially master a stream of tasks or classes without revisiting past data. The challenge lies in leveraging previously acquired knowledge to learn new tasks efficiently while avoiding catastrophic forgetting. Existing methods primarily focus on single domains, which restricts their applicability to specific problems. In this work, we introduce a novel approach called Cross-Domain Continual Learning (CDCL) that lifts the restriction to a single supervised domain. Our method combines inter- and intra-task cross-attention mechanisms within a compact convolutional network. This integration enables the model to maintain alignment with features from previous tasks, thereby delaying the data drift that may occur between tasks, while performing unsupervised domain adaptation (UDA) between related domains. By leveraging an intra-task-specific pseudo-labeling method, we ensure accurate input pairs for both labeled and unlabeled samples, enhancing the learning process. To validate our approach, we conduct extensive experiments on public UDA datasets, showing favorable performance on cross-domain continual learning challenges. Additionally, our work introduces incremental ideas that contribute to the advancement of this field. We make our code and models available to encourage further exploration and reproduction of our results: https://github.com/Ivsucram/CDCL
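To make the attention idea in the abstract concrete, here is a minimal sketch of scaled dot-product attention used in two ways: self-attention over one set of tokens, and cross-attention where one domain queries the other. It is an illustration under stated assumptions, not the CDCL implementation: learned projections are omitted, and letting the source tokens supply the queries while the target tokens supply the keys and values is an arbitrary choice made here.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries: Matrix, keys: Matrix, values: Matrix) -> Matrix:
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        out.append(
            [sum(w * v[d] for w, v in zip(weights, values))
             for d in range(len(values[0]))]
        )
    return out


def self_attention(tokens: Matrix) -> Matrix:
    """Single-domain case: tokens attend to themselves."""
    return attention(tokens, tokens, tokens)


def cross_attention(source_tokens: Matrix, target_tokens: Matrix) -> Matrix:
    """Two-domain case (assumed direction): source queries, target keys/values."""
    return attention(source_tokens, target_tokens, target_tokens)
```

Each output row is a convex combination of the value rows, so in the cross-attention case every source token is re-expressed as a mixture of target features.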

Paper Structure (18 sections, 3 theorems, 34 equations, 2 figures, 4 tables, 1 algorithm)


Introduction
Related work
Unsupervised Domain-Adaptation
Continual learning
Problem formulation
Proposed method
Inter- intra-task cross-attention mechanism
Intra-task center-aware pseudo-label
Sample rehearsal
Time Complexity Analysis
Theoretical analysis
Experiments
Benchmarks
Baselines
Metrics
...and 3 more sections

Key Result

Theorem 1

Given two domain distributions $\mathcal{D}_\mathrm{S}(X_\mathrm{S})$ and $\mathcal{D}_\mathrm{T}(X_\mathrm{T})$, the target domain error $\varepsilon_\mathrm{T}$ is bounded, following Ben-David et al. (2010), in terms of $d_{\mathcal{H}\Delta\mathcal{H}}(X_\mathrm{S}, X_\mathrm{T})$, the $\mathcal{H}\Delta\mathcal{H}$ divergence, which relies on the capacity of the hypothesis class $\mathcal{H}$ to distingui…
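The inequality referenced in Theorem 1 is not shown in this excerpt. As a hedged reference point, the population form of the bound from Ben-David et al. (2010) is commonly written as below, in the notation used here. The source-domain error $\varepsilon_\mathrm{S}$, the joint-error term $\lambda$, and the factor $\tfrac{1}{2}$ are assumptions taken from that standard form; the paper's exact statement may differ.

```latex
\[
  \varepsilon_\mathrm{T}(h) \;\le\; \varepsilon_\mathrm{S}(h)
  + \tfrac{1}{2}\, d_{\mathcal{H}\Delta\mathcal{H}}(X_\mathrm{S}, X_\mathrm{T})
  + \lambda,
  \qquad
  \lambda = \min_{h' \in \mathcal{H}}
  \bigl[\varepsilon_\mathrm{S}(h') + \varepsilon_\mathrm{T}(h')\bigr]
\]
```

Read this way, the target error is small when the source error is small, the two domains are hard to tell apart under $\mathcal{H}$, and some single hypothesis performs well on both domains.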

Figures (2)

Figure 1: The proposed framework. The main contribution, the inter- and intra-task cross-attention (highlighted in purple), aligns the source and target feature domains and mitigates catastrophic forgetting of that feature alignment when new tasks arrive. When the network receives a single input at a time, either source (dashed blue line) or target (blocked red line), CDCL processes it with the self-attention mechanism (left block). When it receives source and target inputs together, CDCL processes them with the cross-attention mechanism (right block), which outputs a mixed signal (blocked green open arrow). $\mathbf{b}_i$ is omitted for simplicity. $f^\text{CIL}(\cdot)$ is a single-head output used in class-incremental (CIL) scenarios together with the most recently instantiated $\mathbf{K}_T$ and $\mathbf{b}_T$, while $f^\text{TIL}(\cdot)$ is a multi-head output used in task-incremental (TIL) scenarios with the respective $\mathbf{K}_i$ and $\mathbf{b}_i$, since the task identifier $t_i$ is provided.
Figure 2: The evolution of CDCL's ACC on VisDA-2017 for both TIL and CIL scenarios. The shaded area represents the standard deviation of $R_{i,j}, i \in [1, j]$, the accuracy on such a task by a model that learned only previous tasks.
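The Figure 1 caption describes two ways of picking the task-specific parameters: by task identifier when it is provided (TIL), and by falling back to the most recently instantiated pair otherwise (CIL). A minimal sketch of that selection logic follows. The class `TaskParameterStore` and its methods are hypothetical names, and $\mathbf{K}_i$ and $\mathbf{b}_i$ are treated as opaque objects because the caption does not give their shapes.

```python
from typing import Any, Dict, Optional, Tuple


class TaskParameterStore:
    """Hypothetical registry of the task-specific parameters (K_i, b_i)."""

    def __init__(self) -> None:
        # Dicts preserve insertion order, so the last key is the latest task.
        self._params: Dict[int, Tuple[Any, Any]] = {}

    def add_task(self, task_id: int, K: Any, b: Any) -> None:
        """Register the parameters instantiated for a newly arrived task."""
        self._params[task_id] = (K, b)

    def select(self, task_id: Optional[int] = None) -> Tuple[str, Any, Any]:
        """Return (mode, K, b) for the given task, or for the latest task.

        With a task identifier the lookup is task-specific (TIL); without
        one, the most recently added parameters are used (CIL).
        """
        if not self._params:
            raise ValueError("no task has been registered yet")
        if task_id is not None:
            K, b = self._params[task_id]
            return "TIL", K, b
        latest_id = list(self._params)[-1]
        K, b = self._params[latest_id]
        return "CIL", K, b
```

Keeping the lookup separate from the network makes the difference between the two scenarios explicit: only the availability of the task identifier changes which parameters are used.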

Theorems & Definitions (4)

Theorem 1
Theorem 2
Theorem 3
proof
