Data-Efficient CLIP-Powered Dual-Branch Networks for Source-Free Unsupervised Domain Adaptation

Yongguang Li, Yueqi Cao, Jindong Li, Qi Wang, Shengsheng Wang

TL;DR

The paper introduces a data-efficient, CLIP-powered dual-branch network (CDBN) together with an unsupervised optimization strategy driven by accurate classification and diversity. The strategy preserves the classification capability learned from the source domain while generating more confident and diverse predictions in the target domain.

Abstract

Source-free Unsupervised Domain Adaptation (SF-UDA) aims to transfer a model's performance from a labeled source domain to an unlabeled target domain without direct access to source samples, addressing critical data privacy concerns. However, most existing SF-UDA approaches assume the availability of abundant source domain samples, which is often impractical due to the high cost of data annotation. To address the dual challenges of limited source data and privacy concerns, we introduce a data-efficient, CLIP-powered dual-branch network (CDBN). This architecture consists of a cross-domain feature transfer branch and a target-specific feature learning branch, leveraging high-confidence target domain samples to transfer text features of source domain categories while learning target-specific soft prompts. By fusing the outputs of both branches, our approach not only effectively transfers source domain category semantic information to the target domain but also reduces the negative impacts of noise and domain gaps during target training. Furthermore, we propose an unsupervised optimization strategy driven by accurate classification and diversity, preserving the classification capability learned from the source domain while generating more confident and diverse predictions in the target domain. CDBN achieves near state-of-the-art performance with far fewer source domain samples than existing methods across 31 transfer tasks on seven datasets.
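
The abstract's two ideas, fusing the outputs of two branches and training toward confident yet diverse predictions, can be illustrated with a minimal sketch. It is not the authors' implementation. The convex fusion weight `alpha`, the temperature `scale=100.0`, and all function names are assumptions made for illustration, and the loss shown is the generic information-maximization objective often used in SF-UDA (mean prediction entropy minus the entropy of the batch-averaged prediction), which may differ from the objective defined in the paper.

```python
import math

EPS = 1e-8  # guards against log(0); an illustrative choice


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def branch_probs(image_feat, text_feats, scale=100.0):
    """CLIP-style classification: softmax over scaled cosine similarities
    between one image feature and one text feature per class."""
    img = normalize(image_feat)
    logits = [
        scale * sum(a * b for a, b in zip(img, normalize(t)))
        for t in text_feats
    ]
    return softmax(logits)


def fuse(p_transfer, p_target, alpha=0.5):
    """Hypothetical fusion rule: convex combination of the two branches."""
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(p_transfer, p_target)]


def entropy(p):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(x * math.log(x + EPS) for x in p)


def info_max_loss(batch_probs):
    """Generic information-maximization loss (assumed, not from the paper):
    low when each prediction is confident and the batch covers many classes."""
    n = len(batch_probs)
    mean_entropy = sum(entropy(p) for p in batch_probs) / n
    n_classes = len(batch_probs[0])
    marginal = [sum(p[k] for p in batch_probs) / n for k in range(n_classes)]
    return mean_entropy - entropy(marginal)


# Toy usage: two classes, two target images, made-up 2-D features.
transfer_text = [[1.0, 0.1], [0.1, 1.0]]  # stand-in for transferred text features
target_text = [[0.9, 0.2], [0.2, 0.9]]    # stand-in for target-specific prompts
images = [[1.0, 0.0], [0.0, 1.0]]

fused = [
    fuse(branch_probs(x, transfer_text), branch_probs(x, target_text), alpha=0.5)
    for x in images
]
loss = info_max_loss(fused)
```

Under this generic objective, a batch whose predictions are confident and spread across classes scores lower than a batch that is either uniformly uncertain or collapsed onto a single class, which is the behaviour the abstract describes as "confident and diverse".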

Paper Structure

This paper contains 21 sections, 14 equations, 5 figures, 9 tables.

Figures (5)

  • Figure 1: Scheme 1 illustrates the issue with previous SF-UDA methods, where using a fully trained model from a source domain as an information source for a target domain leads to poor performance when source domain samples are limited. To address this issue, Scheme 2 presents our proposed solution, which involves transferring category feature representations obtained from a source-domain-based prompt learning method to the target domain, effectively resolving the SF-UDA problem in scenarios with few source domain samples.
  • Figure 2: The framework of CDBN, where $E_{img}$ represents the image encoder, $E_{text}$ represents the text encoder, $T_t$ denotes the class-shared soft prompt, and $V_c^L$ indicates the learnable token for the substitute class name of class c. Solid lines represent data flow, while dashed lines indicate the flow of values without gradient back-propagation. The flame symbol represents gradient updates, and the snowflake symbol indicates frozen gradients.
  • Figure 3: Trends in category average accuracy for two different branches during the Product $\to$ Real World transfer task across training epochs.
  • Figure 4: Average accuracy (%) for different values of $\alpha$.
  • Figure 5: t-SNE visualization of the R $\to$ P transfer task in Office-Home. Different colored dots represent the sample features of the target domain, while black pentagrams and magenta inverted triangles denote categorical text features. (a) Categorical text features derived using the manually designed prompt "a photo of a" from CLIP; (b) Categorical text features obtained from the source domain after prompt learning; (c) Two different types of categorical text features obtained from our dual-branch network after unsupervised training, where the black pentagrams indicate the cross-domain feature transfer branch and the magenta inverted triangles indicate the target-specific feature learning branch.