Prompt-based Distribution Alignment for Unsupervised Domain Adaptation
Shuanghao Bai, Min Zhang, Wanqi Zhou, Siteng Huang, Zhirong Luan, Donglin Wang, Badong Chen
TL;DR
This work investigates unsupervised domain adaptation (UDA) using vision-language models (VLMs) by introducing Prompt-based Distribution Alignment (PDA). PDA employs a two-branch design: a base branch that uses multi-modal prompt tuning to produce discriminative representations, and an alignment branch that builds source/target feature banks and applies image-guided feature tuning (IFT) to reduce domain discrepancy. The method optimizes a combined contrastive objective with both source and pseudo-labeled target data, achieving state-of-the-art performance on Office-Home, Office-31, and VisDA-2017 while maintaining efficiency through prompt-based adaptation. The results demonstrate that domain-aware prompt learning, coupled with feature-bank–guided alignment, yields robust cross-domain transfer with practical implications for real-world UDA tasks.
Abstract
Despite the unprecedented success of large pre-trained vision-language models (VLMs) on a wide range of downstream tasks, the real-world unsupervised domain adaptation (UDA) problem remains underexplored. In this paper, we first demonstrate experimentally that VLMs trained in an unsupervised manner can significantly reduce the distribution discrepancy between source and target domains, thereby improving UDA performance. However, a major challenge in directly deploying such models on downstream UDA tasks is prompt engineering, which requires aligning the domain knowledge of the source and target domains, since UDA performance depends heavily on a good domain-invariant representation. We therefore propose a Prompt-based Distribution Alignment (PDA) method that incorporates domain knowledge into prompt learning. Specifically, PDA employs a two-branch prompt-tuning paradigm consisting of a base branch and an alignment branch. The base branch integrates class-related representations into the prompts, ensuring discrimination among different classes. To further minimize domain discrepancy, in the alignment branch we construct feature banks for both the source and target domains and propose image-guided feature tuning (IFT), which makes the input attend to the feature banks and thereby integrates self-enhanced and cross-domain features into the model. In this way, the two branches promote each other and enhance the adaptation of VLMs for UDA. We conduct extensive experiments on three benchmarks to demonstrate that the proposed PDA achieves state-of-the-art performance. The code is available at https://github.com/BaiShuanghao/Prompt-based-Distribution-Alignment.
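To make the idea of "making the input attend to feature banks" concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the banks are plain matrices of stored features and that attention is a temperature-scaled softmax over cosine similarities; the function names (`attend_to_bank`, `image_guided_tuning`), the residual weight `beta`, and the `temperature` value are all hypothetical choices made for illustration. The actual IFT module in PDA may use learned projections and a different way of building the banks.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along the given axis."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend_to_bank(features, bank, temperature=0.1):
    """Let each input feature attend to every entry of a feature bank.

    features: (B, D) image features; bank: (K, D) stored features.
    Returns a (B, D) array of similarity-weighted averages of bank entries.
    """
    q = l2_normalize(features)
    k = l2_normalize(bank)
    weights = softmax(q @ k.T / temperature, axis=-1)  # (B, K), rows sum to 1
    return weights @ bank

def image_guided_tuning(features, source_bank, target_bank, beta=0.5):
    """Hypothetical residual fusion of the input with both bank read-outs."""
    from_source = attend_to_bank(features, source_bank)
    from_target = attend_to_bank(features, target_bank)
    return l2_normalize(features + beta * (from_source + from_target))

# Toy data standing in for encoder outputs (random, for illustration only).
rng = np.random.default_rng(0)
image_features = rng.normal(size=(4, 16))
source_bank = rng.normal(size=(10, 16))
target_bank = rng.normal(size=(10, 16))

tuned = image_guided_tuning(image_features, source_bank, target_bank)
print(tuned.shape)
```

In this sketch, attending to a bank from the input's own domain plays the role of the self-enhanced features, while attending to the other domain's bank supplies the cross-domain features; the residual connection keeps the original image feature dominant when `beta` is small.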
