A Tale of Two Experts: Cooperative Learning for Source-Free Unsupervised Domain Adaptation

Jiaping Yu, Muli Yang, Jiapeng Ji, Jiexi Yan, Cheng Deng

TL;DR

Source-Free Unsupervised Domain Adaptation (SFUDA) requires adapting a model trained on a labeled source domain while the source data is unavailable. The paper proposes Experts Cooperative Learning (EXCL), which combines a Dual Experts framework (frozen source model with Conv-Adapter and a Vision-Language Model with a trainable text prompt) with a Retrieval-Augmented-Interaction (RAIN) pipeline to learn from unlabeled target data. It introduces specific losses, including a Weiszfeld-style loss for style alignment ($L_{weisz}$), a Prompt Semantic Consistency loss ($L_{psc}$), and a Mutual Information loss ($L_{mi}$), to enable unsupervised cooperation between the two experts. Empirical results on four benchmarks show EXCL achieving state-of-the-art or competitive performance across SFUDA settings, demonstrating the value of cooperative, data-retrieval–driven adaptation without source data.
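The name of the Weiszfeld-style loss presumably alludes to the classical Weiszfeld iteration for the geometric median, a robust alternative to the mean. This summary does not define $L_{weisz}$, so the sketch below shows only the classical algorithm, not the authors' loss. The function name `geometric_median` and its parameters are illustrative assumptions.

```python
import math


def geometric_median(points, iters=200, eps=1e-9):
    """Classical Weiszfeld iteration (illustrative; not the paper's loss).

    points: list of equal-length tuples of floats.
    Returns an approximate minimizer of the sum of Euclidean distances.
    """
    dim = len(points[0])
    # Start from the arithmetic mean of the points.
    y = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    for _ in range(iters):
        num = [0.0] * dim
        den = 0.0
        for p in points:
            # Inverse-distance weight, clamped to avoid division by zero.
            w = 1.0 / max(math.dist(p, y), eps)
            den += w
            for d in range(dim):
                num[d] += w * p[d]
        y_new = [n / den for n in num]
        if math.dist(y_new, y) < eps:
            return y_new
        y = y_new
    return y


# A tight cluster plus one far outlier: the geometric median stays near
# the cluster, while the arithmetic mean is dragged toward the outlier.
cluster_with_outlier = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (100.0, 100.0)]
gm = geometric_median(cluster_with_outlier)
```

This robustness to outliers is the usual reason to prefer a geometric median over a mean when aggregating feature statistics. Whether EXCL uses it this way is not stated here.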

Abstract

Source-Free Unsupervised Domain Adaptation (SFUDA) addresses the realistic challenge of adapting a source-trained model to a target domain without access to the source data, a constraint driven by concerns over privacy and cost. Existing SFUDA methods either exploit only the source model's predictions or fine-tune large multimodal models, yet both neglect complementary insights and the latent structure of target data. In this paper, we propose Experts Cooperative Learning (EXCL). EXCL consists of the Dual Experts framework and the Retrieval-Augmented-Interaction (RAIN) optimization pipeline. The Dual Experts framework places a frozen source-domain model (augmented with Conv-Adapter) and a pretrained vision-language model (with a trainable text prompt) on equal footing to mine consensus knowledge from unlabeled target samples. To train these plug-in modules effectively under purely unsupervised conditions, we introduce RAIN, a three-stage pipeline that (1) collaboratively retrieves pseudo-source and complex target samples, (2) separately fine-tunes each expert on its respective sample set, and (3) enforces consistency of the learning target via a shared learning result. Extensive experiments on four benchmark datasets demonstrate that our approach matches state-of-the-art performance.

Paper Structure

This paper contains 15 sections, 8 equations, 3 figures, 5 tables.

Figures (3)

  • Figure 1: Illustration of the Dual Experts framework. Source-domain data is unavailable throughout training; only the source model and unlabeled target-domain data are accessible. The two expert models must learn from each other to ensure the consistency of the learning target.
  • Figure 2: An overview of EXCL. Part A contains the Dual Experts framework and the RAIN optimization pipeline. The retrieval stage updates the pseudo-source and complex samples throughout training to support the Weiszfeld Style Loss and the Prompt Semantic Consistency Loss. In the interaction stage, the two experts exchange their softmax outputs over the whole dataset to compute the mutual information loss, ensuring the consistency of the learning target. Part B shows where the Conv-Adapter is plugged into the source model.
  • Figure 3: t-SNE visualization of ten classes from the Office-Home dataset [venkateswara2017deep] at the test stage, including all trained domain data. The three panels show, respectively, source-domain image features from the source model, target-domain image features from the source model, and target-domain image features from the source model with the adapter plugged in.
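The interaction stage in Figure 2 computes a mutual information loss from the two experts' softmax outputs. This page does not give the exact form of $L_{mi}$, so the sketch below uses one common formulation as an assumption: build a joint class distribution from paired predictions and compute its mutual information. The function name `mutual_information` and the toy inputs are hypothetical.

```python
import math


def mutual_information(probs_a, probs_b, eps=1e-12):
    """Mutual information between two experts' class predictions.

    probs_a, probs_b: lists of per-sample softmax vectors (same length,
    same number of classes). This joint-distribution formulation is an
    assumption, not necessarily the paper's L_mi.
    """
    n = len(probs_a)
    k = len(probs_a[0])
    # Joint distribution: average outer product of paired predictions.
    joint = [[0.0] * k for _ in range(k)]
    for pa, pb in zip(probs_a, probs_b):
        for i in range(k):
            for j in range(k):
                joint[i][j] += pa[i] * pb[j] / n
    marg_a = [sum(row) for row in joint]
    marg_b = [sum(joint[i][j] for i in range(k)) for j in range(k)]
    mi = 0.0
    for i in range(k):
        for j in range(k):
            pij = joint[i][j]
            if pij > eps:  # skip empty cells, where p * log(p) tends to 0
                mi += pij * math.log(pij / (marg_a[i] * marg_b[j]))
    return mi


# Two experts that agree confidently on balanced classes share maximal
# information; an uninformative second expert shares none.
agree = mutual_information([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
uninformative = mutual_information([[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [0.5, 0.5]])
```

Under this assumption, a training loss would minimize the negative of this quantity, pushing both experts toward confident, mutually consistent predictions.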