Revisiting CLIP for SF-OSDA: Unleashing Zero-Shot Potential with Adaptive Threshold and Training-Free Feature Filtering

Yongguang Li, Jindong Li, Qi Wang, Qianli Xing, Runliang Niu, Shengsheng Wang, Menglin Yang

TL;DR

This work addresses Source-Free Unsupervised Open-Set Domain Adaptation (SF-OSDA) with CLIP by tackling two core problems: reliance on simple fixed thresholds even though the best threshold is domain-specific, and unnecessary training that adds deployment cost and can shift CLIP features. It introduces CLIPXpert, a training-free, source-free framework that combines Box-Cox GMM-Based Adaptive Thresholding (BGAT), which derives a robust threshold $T^*$ from score distributions, with SVD-Based Unknown-Class Feature Filtering (SUFF), which suppresses unknown-class bias in the feature space. BGAT models the score distribution dynamically and takes $T^*$ at the intersection of the Gaussian PDFs fitted after a Box-Cox transformation, while SUFF reconstructs the feature space from principal components to separate known and unknown classes without additional training. Across the Office-Home, VisDA-2017, DomainNet, and VATB benchmarks, CLIPXpert achieves competitive or state-of-the-art results, with notable gains over fixed-threshold baselines and other CLIP-based methods, underscoring CLIP's strong zero-shot potential for SF-OSDA in resource-constrained settings.
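The thresholding idea summarized above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the grid search for the Box-Cox parameter, the hand-written two-component EM fit, the use of mixture-weighted densities for the intersection, and the synthetic scores are all assumptions made for illustration; `scipy.stats.boxcox` and scikit-learn's `GaussianMixture` could replace the hand-written pieces.

```python
import numpy as np

def boxcox(x, lam):
    """Box-Cox transform; x must be strictly positive."""
    return np.log(x) if abs(lam) < 1e-12 else (x ** lam - 1.0) / lam

def inv_boxcox(y, lam):
    """Inverse of boxcox for the same lambda."""
    return np.exp(y) if abs(lam) < 1e-12 else (lam * y + 1.0) ** (1.0 / lam)

def fit_lambda(x, grid=np.linspace(-2.0, 2.0, 81)):
    """Grid-search the lambda that maximises the Box-Cox profile log-likelihood."""
    log_sum = np.log(x).sum()
    def llf(lam):
        return (lam - 1.0) * log_sum - 0.5 * len(x) * np.log(boxcox(x, lam).var())
    return float(max(grid, key=llf))

def log_density(y, w, mu, sd):
    """Log of a mixture-weighted Gaussian component density."""
    return np.log(w) - np.log(sd) - 0.5 * np.log(2 * np.pi) - 0.5 * ((y - mu) / sd) ** 2

def fit_gmm2(y, iters=200):
    """Plain EM for a two-component 1-D Gaussian mixture."""
    w = np.array([0.5, 0.5])
    mu = np.percentile(y, [25.0, 75.0])   # start the means at the quartiles
    sd = np.full(2, y.std())
    for _ in range(iters):
        # E-step: responsibilities, computed in log space for stability
        logp = log_density(y[:, None], w, mu, sd)
        logp -= logp.max(axis=1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: update weights, means and standard deviations
        n = r.sum(axis=0) + 1e-12
        w = n / len(y)
        mu = (r * y[:, None]).sum(axis=0) / n
        sd = np.sqrt((r * (y[:, None] - mu) ** 2).sum(axis=0) / n) + 1e-8
    return w, mu, sd

def adaptive_threshold(scores):
    """Threshold where the two fitted components cross, in original score units."""
    x = np.asarray(scores, dtype=float)
    lam = fit_lambda(x)
    y = boxcox(x, lam)
    w, mu, sd = fit_gmm2(y)
    lo, hi = np.argsort(mu)
    # Search for the crossing point between the two component means
    grid = np.linspace(mu[lo], mu[hi], 10001)
    gap = log_density(grid, w[lo], mu[lo], sd[lo]) - log_density(grid, w[hi], mu[hi], sd[hi])
    t_y = grid[np.argmin(np.abs(gap))]
    return float(inv_boxcox(t_y, lam))

# Synthetic stand-ins for per-sample confidence scores (not real CLIP outputs)
rng = np.random.default_rng(0)
unknown = rng.normal(0.3, 0.05, 500)
known = rng.normal(0.7, 0.05, 500)
t_star = adaptive_threshold(np.concatenate([unknown, known]))
predicted_known = known > t_star   # samples above the threshold are treated as known
```

Because the threshold is recomputed from each target domain's own score distribution, no per-domain tuning is needed, which is the property the fixed-threshold baselines lack.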

Abstract

Source-Free Unsupervised Open-Set Domain Adaptation (SF-OSDA) methods using CLIP face two significant issues: (1) although they depend heavily on domain-specific threshold selection, existing methods employ simple fixed thresholds, underutilizing CLIP's zero-shot potential in SF-OSDA scenarios; and (2) they overlook intrinsic class tendencies while employing complex training to enforce feature separation, incurring deployment costs and feature shifts that compromise CLIP's generalization ability. To address these issues, we propose CLIPXpert, a novel SF-OSDA approach that integrates two key components: an adaptive thresholding strategy and an unknown-class feature filtering module. Specifically, the Box-Cox GMM-Based Adaptive Thresholding (BGAT) module dynamically determines the optimal threshold by estimating sample score distributions, balancing known-class recognition and unknown-class sample detection. Additionally, the Singular Value Decomposition (SVD)-Based Unknown-Class Feature Filtering (SUFF) module reduces the tendency of unknown-class samples towards known classes, improving the separation between known and unknown classes. Experiments show that our source-free and training-free method outperforms the state-of-the-art trained approach UOTA by 1.92% on the DomainNet dataset, achieves SOTA-comparable performance on datasets such as Office-Home, and surpasses other SF-OSDA methods. This not only validates the effectiveness of our proposed method but also highlights CLIP's strong zero-shot potential for SF-OSDA tasks.
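As a rough illustration of the linear-algebra operation that SVD-based filtering builds on, the sketch below rebuilds a feature matrix from its leading singular directions. It is a generic truncated-SVD reconstruction, not the SUFF module itself: this summary does not state which features are decomposed or which components are kept or discarded, so the function name, the choice to keep the top `k` directions, and the row re-normalization are all assumptions.

```python
import numpy as np

def truncated_svd_reconstruct(features, k):
    """Rebuild an (N, D) feature matrix from its top-k singular directions.

    Hypothetical helper: which components to keep is an assumption here.
    """
    X = np.asarray(features, dtype=float)
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    X_k = (U[:, :k] * S[:k]) @ Vt[:k]          # rank-k reconstruction
    # Re-normalise rows, since CLIP features are usually compared by cosine similarity
    norms = np.linalg.norm(X_k, axis=1, keepdims=True)
    return X_k / np.clip(norms, 1e-12, None)

# Random stand-in for L2-normalised image features (not real CLIP embeddings)
rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 16))
feats /= np.linalg.norm(feats, axis=1, keepdims=True)
filtered = truncated_svd_reconstruct(feats, k=8)
```

The operation needs only one decomposition of the feature matrix and no gradient updates, which is consistent with the training-free setting described above.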

Paper Structure

This paper contains 19 sections, 28 equations, 14 figures, 11 tables, 1 algorithm.

Figures (14)

  • Figure 1: Analysis of CLIP's performance on representative datasets: (a) CLIP's HOS under different thresholds, where the dashed lines indicate the optimal thresholds in different domains. (b) Distribution of known-class predictions for the Top-25 unknown-class samples with the strongest tendency. (c) Distribution of scores for known-class and unknown-class samples; the dashed lines show the probability density functions of the two distributions fitted by a GMM.
  • Figure 2: Overview of CLIPXpert, illustrating data flow from the unlabeled dataset and class names to the final classification result.
  • Figure 3: MCM (CLIP)
  • Figure 4: Entropy (CLIP)
  • Figure 5: VAR (CLIP)
  • ...and 9 more figures