AdaEmbed: Semi-supervised Domain Adaptation in the Embedding Space

Ali Mottaghi, Mohammad Abdullah Jamal, Serena Yeung, Omid Mohareri

TL;DR

AdaEmbed tackles domain shift in vision tasks by learning a shared embedding space and using prototype-based pseudo-labels to guide semi-supervised domain adaptation. It combines cross-entropy supervision, balanced pseudo-labeling via k-NN to class prototypes, and instance-level contrastive alignment with a memory bank, while updating class prototypes through entropy-driven signals in a minimax training setup. Extensive experiments on DomainNet-126, Office-Home, and VisDA-C demonstrate state-of-the-art performance in both SSDA and UDA settings, with ablations confirming the critical roles of pseudo-labeling, contrastive loss, and entropy. The method is model-agnostic, data-efficient, and practical for real-world deployment, with code released to foster further research.

Abstract

Semi-supervised domain adaptation (SSDA) presents a critical hurdle in computer vision, especially given the frequent scarcity of labeled data in real-world settings. This scarcity often causes foundation models, trained on extensive datasets, to underperform when applied to new domains. AdaEmbed, our newly proposed methodology for SSDA, offers a promising solution to these challenges. Leveraging the potential of unlabeled data, AdaEmbed facilitates the transfer of knowledge from a labeled source domain to an unlabeled target domain by learning a shared embedding space. By generating accurate and uniform pseudo-labels based on the established embedding space, the model overcomes the limitations of conventional SSDA, thus enhancing performance significantly. Our method's effectiveness is validated through extensive experiments on benchmark datasets such as DomainNet, Office-Home, and VisDA-C, where AdaEmbed consistently outperforms all the baselines, setting a new state of the art for SSDA. With its straightforward implementation and high data efficiency, AdaEmbed stands out as a robust and pragmatic solution for real-world scenarios, where labeled data is scarce. To foster further research and application in this area, we are sharing the codebase of our unified framework for semi-supervised domain adaptation.
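
The "accurate and uniform" pseudo-labels mentioned above can be illustrated with a small sketch. It assumes cosine similarity in the embedding space and a fixed budget of `k` unlabeled samples per class; the function names, the tie-breaking rule, and the toy vectors are hypothetical and are not taken from the released codebase.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def balanced_pseudo_labels(features, prototypes, k):
    """Give each class the k unlabeled features nearest to its prototype.

    Returns {feature_index: class_index}. Every class contributes at most k
    pseudo-labels, which keeps the pseudo-label distribution close to uniform.
    A feature selected by several classes keeps the most similar one
    (an assumption made for this sketch).
    """
    best = {}  # feature index -> (similarity, class index)
    for c, proto in enumerate(prototypes):
        sims = sorted(
            ((cosine(f, proto), i) for i, f in enumerate(features)),
            reverse=True,
        )
        for sim, i in sims[:k]:
            if i not in best or sim > best[i][0]:
                best[i] = (sim, c)
    return {i: c for i, (_, c) in best.items()}

# Toy example: two class prototypes and five unlabeled target features.
prototypes = [[1.0, 0.0], [0.0, 1.0]]
features = [[0.9, 0.1], [0.8, 0.3], [0.5, 0.5], [0.2, 0.9], [0.1, 1.0]]
labels = balanced_pseudo_labels(features, prototypes, k=2)
print(labels)
```

In this toy run the ambiguous feature at index 2, which is equally close to both prototypes, receives no pseudo-label, while each class receives exactly two.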

Paper Structure (26 sections, 10 equations, 4 figures, 4 tables, 1 algorithm)

Figures (4)

  • Figure 1: AdaEmbed method for semi-supervised domain adaptation (SSDA). The embedding space for a general SSDA problem is shown in (a), while (b) illustrates how AdaEmbed estimates prototypes based on labeled and unlabeled samples in the feature space, selects unlabeled samples in the embedding space to generate pseudo-labels given their proximity to prototypes, and trains the model on both labels and pseudo-labels. In addition, a contrastive feature loss is incorporated during training to learn a more effective shared embedding for both source and target domains.
  • Figure 2: The framework for our semi-supervised domain adaptation method (AdaEmbed). The left column shows the types of data points and their corresponding symbols in the figure. At the beginning of the adaptation process, both the encoder and the momentum encoder are initialized with the model trained on labeled data. At each iteration, we augment the data and send the augmented version ($\tilde{x}$) to the encoder, while the original version ($x$) is passed through the momentum encoder. The features generated by the encoders are then given to the classifier to generate the predictions ($p$ and $\tilde{p}$). For labeled samples, we compute the cross-entropy loss between the given label and the corresponding prediction ($\mathcal{L}_s$). For unlabeled target samples, we first generate pseudo-labels ($\hat{y}$) for a few samples based on their proximity to the class prototypes and use these pseudo-labels for the target loss ($\mathcal{L}_t$). We also utilize an instance contrastive alignment loss ($\mathcal{L}_c$) to pull the original and augmented features ($f'$ and $\tilde{f}$) together while pushing features with different labels/pseudo-labels apart. Finally, the prototypes are updated based on the entropy of unlabeled target predictions ($\mathcal{H}$). The training algorithm for AdaEmbed is outlined in Algorithm 1.
  • Figure 3: The impact of varying the number of labeled examples on the accuracy of domain adaptation methods on the Real to Clipart split of the Office-Home dataset. The plot compares three strategies: Supervised Learning, AdaContrast, and AdaEmbed. Each method's accuracy improves with more labeled data, with AdaEmbed achieving the highest accuracy across all data regimes, especially in settings with fewer labeled examples.
  • Figure 4: Comparative t-SNE visualizations of feature embeddings from different domain adaptation methods on the DomainNet Real to Clipart split under the unsupervised domain adaptation setting. Green and orange circles represent the source and target domains, respectively, while purple crosses mark the prototypes. The Supervised Learning method exhibits basic clustering with overlaps, reflecting a foundational level of domain adaptation. AdaMatch and AdaContrast display progressively tighter and more distinct clusters, signaling advancements in domain invariance. AdaEmbed produces the most compact and clearly separated clusters of the four methods, consistent with its stronger domain adaptation performance.
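
The quantities named in the Figure 2 caption can be sketched per sample as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the contrastive term is written as an InfoNCE-style loss over cosine similarities, the `temperature` value is hypothetical, and the way $\mathcal{H}$ enters the minimax prototype update is not described in the captions, so the sketch only computes the entropy itself.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Cross-entropy for one sample; used with labels (L_s) or pseudo-labels (L_t)."""
    return -math.log(probs[label])

def entropy(probs):
    """Shannon entropy H of one prediction; zero-probability terms contribute 0."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style stand-in for L_c (an assumption of this sketch).

    The loss is small when the augmented feature (anchor) is closer to the
    original feature (positive) than to features of other classes (negatives).
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    return cross_entropy(softmax(logits), 0)

# A confident prediction has lower entropy than a nearly uniform one.
confident = softmax([10.0, 0.0])
uncertain = softmax([0.1, 0.0])
print(entropy(confident) < entropy(uncertain))

# Aligned pair vs. misaligned pair for the contrastive term.
aligned = contrastive_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
misaligned = contrastive_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
print(aligned < misaligned)
```

In a full training loop these per-sample values would be averaged over a batch and combined into one objective; the weighting between the terms is left out because it is not stated in the captions.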