Consistent Assistant Domains Transformer for Source-free Domain Adaptation
Renrong Shao, Wei Zhang, Kangyang Luo, Qin Li, and Jun Wang
TL;DR
The paper tackles source-free domain adaptation (SFDA) by introducing CADTrans, which builds a plug-in assistant domain module (ADM) from the aggregated global attention of a Vision Transformer to derive invariant features. It then applies domain consistency strategies to separate easy, source-like samples from hard, target-specific samples, and uses a conditional multi-kernel maximum mean discrepancy (CMK-MMD) to align hard samples to easy samples of the same category. The approach reports strong performance gains on Office-31, Office-Home, VISDA-C, and DomainNet-126, which supports the value of an assistant domain, self-distillation, and sample-wise alignment in transformer-based SFDA. The framework reduces domain shift without access to source data and may extend to other vision tasks and to lighter-weight adaptation settings.
Abstract
Source-free domain adaptation (SFDA) aims to adapt a model to a target domain without direct access to the source domain. Because the source data are inaccessible, deterministic invariant features cannot be obtained. Current mainstream methods primarily estimate invariant features in the target domain that closely resemble those in the source domain, and then align the target domain with the source domain. However, these methods are susceptible to hard samples and are influenced by domain bias. In this paper, we propose a Consistent Assistant Domains Transformer for SFDA, abbreviated as CADTrans, which addresses this issue by constructing invariant feature representations based on domain consistency. Concretely, we develop an assistant domain module for CADTrans that obtains diversified representations from the intermediate aggregated global attentions, addressing the inability of existing methods to represent diversity adequately. Based on the assistant and target domains, invariant feature representations are obtained through multiple consistency strategies and are then used to distinguish easy samples from hard samples. Finally, to align the hard samples to the corresponding easy samples, we construct a conditional multi-kernel maximum mean discrepancy (CMK-MMD) strategy that distinguishes between samples of the same category and those of different categories. Extensive experiments on benchmarks including Office-31, Office-Home, VISDA-C, and DomainNet-126 demonstrate the significant performance improvements achieved by the proposed approach. Code is available at https://github.com/RoryShao/CADTrans.git.
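The abstract does not give the CMK-MMD formulation, so the following is only a minimal NumPy sketch of one plausible reading: a multi-kernel MMD computed separately within each category, between the features of hard samples and the features of easy samples that share the same (pseudo-)label. The function names, the Gaussian bandwidths, and the per-class averaging are all assumptions made for illustration and are not taken from the paper or its repository.

```python
import numpy as np

def gaussian_kernel_sum(x, y, bandwidths=(0.5, 1.0, 2.0)):
    """Sum of Gaussian kernels exp(-||a - b||^2 / (2 s^2)) over several bandwidths s.

    x has shape (n, d), y has shape (m, d); the result has shape (n, m).
    The bandwidth values are illustrative assumptions.
    """
    sq_dists = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    return sum(np.exp(-sq_dists / (2.0 * s ** 2)) for s in bandwidths)

def mk_mmd2(x, y, bandwidths=(0.5, 1.0, 2.0)):
    """Biased estimate of the squared multi-kernel MMD between samples x and y."""
    k_xx = gaussian_kernel_sum(x, x, bandwidths).mean()
    k_yy = gaussian_kernel_sum(y, y, bandwidths).mean()
    k_xy = gaussian_kernel_sum(x, y, bandwidths).mean()
    return k_xx + k_yy - 2.0 * k_xy

def conditional_mk_mmd2(easy_feats, easy_labels, hard_feats, hard_labels,
                        bandwidths=(0.5, 1.0, 2.0)):
    """Hypothetical class-conditional variant: average the per-class MMD^2
    between hard and easy features over the classes present in both groups."""
    classes = np.intersect1d(easy_labels, hard_labels)
    if classes.size == 0:
        return 0.0  # no shared class, so nothing to align
    per_class = [
        mk_mmd2(hard_feats[hard_labels == c], easy_feats[easy_labels == c], bandwidths)
        for c in classes
    ]
    return float(np.mean(per_class))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    easy = rng.normal(size=(20, 4))          # stand-in for easy-sample features
    labels = np.repeat([0, 1], 10)           # stand-in for pseudo-labels
    hard = easy + 3.0                        # hard samples shifted away from easy ones
    print("identical features:", conditional_mk_mmd2(easy, labels, easy, labels))
    print("shifted features:  ", conditional_mk_mmd2(easy, labels, hard, labels))
```

In this sketch the discrepancy is close to zero when hard and easy features of a class coincide and grows as they drift apart, which is the property an alignment loss of this kind relies on. A training implementation would compute the same quantity on differentiable tensors and minimize it with respect to the hard-sample features.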
