Probabilistic Contrastive Learning for Domain Adaptation

Junjie Li, Yixin Zhang, Zilei Wang, Saihui Hou, Keyu Tu, Man Zhang

TL;DR

This work targets domain adaptation and diagnoses why standard contrastive learning helps little there: it does not align target features with the source class weights. It introduces Probabilistic Contrastive Learning (PCL), which substitutes feature vectors with probability distributions and eliminates $\ell_{2}$ normalization, guiding the predicted probabilities toward a one-hot form so that features cluster around their class weights. Across five tasks (UDA/SSDA, SSL, detection, and segmentation), PCL yields consistent gains and often outperforms more complex methods while requiring less training time. The approach demonstrates that simple probabilistic inputs within the InfoNCE framework can significantly reduce the deviation between features and class weights, with strong potential for broader application in domain-shifted visual tasks.

Abstract

Contrastive learning has shown impressive success in enhancing feature discriminability for various visual tasks in a self-supervised manner, but the standard contrastive paradigm (features + $\ell_{2}$ normalization) has limited benefits when applied in domain adaptation. We find that this is mainly because the class weights (weights of the final fully connected layer) are ignored in the domain adaptation optimization process, which makes it difficult for features to cluster around the corresponding class weights. To solve this problem, we propose the simple but powerful Probabilistic Contrastive Learning (PCL), which moves beyond the standard paradigm by removing $\ell_{2}$ normalization and replacing the features with probabilities. PCL can guide the probability distribution towards a one-hot configuration, thus minimizing the discrepancy between features and class weights. We conduct extensive experiments to validate the effectiveness of PCL and observe consistent performance gains on five tasks, i.e., Unsupervised/Semi-Supervised Domain Adaptation (UDA/SSDA), Semi-Supervised Learning (SSL), UDA Detection and Semantic Segmentation. Notably, for UDA Semantic Segmentation on SYNTHIA, PCL surpasses the sophisticated CPSL-D by $>\!2\%$ in terms of mean IoU with a much lower training cost (PCL: 1×3090, 5 days vs. CPSL-D: 4×V100, 11 days). Code is available at https://github.com/ljjcoder/Probabilistic-Contrastive-Learning.
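
The core idea in the abstract (run the contrastive loss on softmax probabilities, with no $\ell_{2}$ normalization) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the use of in-batch negatives, and the scale value `scale=20.0` are assumptions made only for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def info_nce(queries, keys, scale):
    """InfoNCE loss: keys[i] is the positive for queries[i];
    every other key in the batch acts as a negative (an assumption)."""
    losses = []
    for i, q in enumerate(queries):
        sims = [scale * dot(q, k) for k in keys]
        m = max(sims)
        # log-sum-exp over all keys, minus the positive similarity
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims))
        losses.append(log_denom - sims[i])
    return sum(losses) / len(losses)

def pcl_loss(query_logits, key_logits, scale=20.0):
    """Contrast softmax probabilities directly, without l2 normalization.
    The scale value is a hypothetical choice, not taken from the paper."""
    q = [softmax(z) for z in query_logits]
    k = [softmax(z) for z in key_logits]
    return info_nce(q, k, scale)

# Two samples from different classes, each paired with an identical view.
sharp_logits = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0]]  # near one-hot predictions
flat_logits = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]     # less confident predictions

sharp_loss = pcl_loss(sharp_logits, sharp_logits)
flat_loss = pcl_loss(flat_logits, flat_logits)
print("near one-hot:", sharp_loss)
print("flatter:", flat_loss)
```

Because the probabilities are not rescaled to unit length, the inner product of a probability vector with its positive reaches its maximum of 1 only when both are the same one-hot vector. In this toy setup the near one-hot predictions therefore give a lower loss than the flatter ones, which is consistent with the abstract's claim that PCL pushes predictions toward a one-hot configuration.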

Paper Structure

This paper contains 35 sections, 8 equations, 4 figures, 14 tables.

Figures (4)

  • Figure 1: Feature Contrastive Learning (FCL) vs. Probabilistic Contrastive Learning (PCL). With PCL, the features on the target domain can be clustered around the corresponding class weights.
  • Figure 2: An exploratory study under the SSDA setting on DomainNet (R$\rightarrow$S) with 3-shot and ResNet34. We use MME as the baseline model.
  • Figure 3: Framework of FCL and PCL. Different from FCL, PCL uses the output of softmax to perform contrastive learning and removes the $\ell_{2}$ normalization.
  • Figure 4: The t-SNE visualization of learned features. Best viewed in color.