LEAD: Learning Decomposition for Source-free Universal Domain Adaptation

Sanqing Qu, Tianpei Zou, Lianghua He, Florian Röhrbein, Alois Knoll, Guang Chen, Changjun Jiang

TL;DR

This paper proposes LEArning Decomposition (LEAD), which decouples features into source-known and -unknown components to identify target-private data and is complementary to most existing methods.

Abstract

Universal Domain Adaptation (UniDA) targets knowledge transfer in the presence of both covariate and label shifts. Recently, Source-free Universal Domain Adaptation (SF-UniDA) has emerged to achieve UniDA without access to source data, which tends to be more practical due to data protection policies. The main challenge lies in determining whether covariate-shifted samples belong to target-private unknown categories. Existing methods tackle this either through hand-crafted thresholding or through time-consuming iterative clustering strategies. In this paper, we propose a new idea of LEArning Decomposition (LEAD), which decouples features into source-known and -unknown components to identify target-private data. Technically, LEAD first leverages orthogonal decomposition analysis for feature decomposition. Then, LEAD builds instance-level decision boundaries to adaptively identify target-private data. Extensive experiments across various UniDA scenarios demonstrate the effectiveness and superiority of LEAD. Notably, in the OPDA scenario on the VisDA dataset, LEAD outperforms GLC by 3.5% in overall H-score and reduces the time needed to derive pseudo-labeling decision boundaries by 75%. LEAD is also appealing in that it is complementary to most existing methods. The code is available at https://github.com/ispc-lab/LEAD.
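
The feature decomposition step can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see the repository linked above for that). It assumes that the source-known space is the span of the rows of the source classifier weight matrix and that the source-unknown space is its orthogonal complement; the function name `decompose_features`, the tolerance `tol`, and the random toy data are hypothetical.

```python
import numpy as np

def decompose_features(features, classifier_weights, tol=1e-10):
    """Split features into source-known and source-unknown parts.

    features:           (n, d) array of target features (toy data here).
    classifier_weights: (C, d) array; its rows are ASSUMED to span the
                        source-known space.
    """
    # Orthonormal basis of the row space of the weights via SVD.
    _, s, vt = np.linalg.svd(classifier_weights, full_matrices=False)
    rank = int(np.sum(s > tol * s.max()))
    basis = vt[:rank]                      # (rank, d), orthonormal rows

    # Orthogonal projection onto the known space, residual is the unknown part.
    z_knw = features @ basis.T @ basis
    z_unk = features - z_knw
    return z_knw, z_unk

# Toy example with random weights and features (illustrative only).
rng = np.random.default_rng(0)
W = rng.normal(size=(10, 64))              # 10 source classes, 64-dim features
Z = rng.normal(size=(5, 64))               # 5 target features
z_knw, z_unk = decompose_features(Z, W)

# L2 norm of the source-unknown component, used as a private-data indicator.
unk_norm = np.linalg.norm(z_unk, axis=1)
```

By construction the two parts sum back to the original feature, and the source-unknown part is orthogonal to every classifier weight vector, so it contributes nothing to the source logits.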

Paper Structure (22 sections, 13 equations, 8 figures, 7 tables)

Figures (8)

  • Figure 1: (a) Pseudo-label accuracy curves for target-private data in the OPDA scenario on the VisDA dataset. Clustering-based GLC does not work well in deriving decision boundaries for target-private data. This may be due to the curse of dimensionality, which leaves K-means able to discern clusters only faintly. (b) Frequency distribution of the normalized feature magnitude of target data in the source-unknown space (the orthogonal complement of the space spanned by the source model weights). Features are from the task $A\rightarrow D$ in the OPDA scenario of Office-31. The results show that target-private data are expected to involve more components from the source-unknown space, even under covariate shift.
  • Figure 2: (a) Illustrations of traditional universal domain adaptation (UniDA) and source-free universal domain adaptation (SF-UniDA). Traditional UniDA methods necessitate data from both source and target domains concurrently. In SF-UniDA, source data are solely utilized for pre-training. Adaptation is performed by harnessing target data and the source model $f^s_\theta = h^s_\theta \circ g^s_\theta$. (b) An overview of our Learning Decomposition (LEAD) framework. Pseudo-labeling is an important technique for UniDA and SF-UniDA. The primary objective is to recognize target data associated with common label sets and exclude data within the target-private label space. Different from existing methods that perform private data identification by hand-crafted thresholding on predictions or iterative global clustering, we tackle this from the viewpoint of feature decomposition. The rationale is that despite potential shifts in the feature space, target-private data are expected to encompass more components from the orthogonal complement (source-unknown) space of the source model. Technically, LEAD first performs orthogonal decomposition to decompose target features into source-known and -unknown parts, i.e., $\mathbf{z}^t_{i, knw}$ and $\mathbf{z}^t_{i, unk}$. $\lVert\mathbf{z}^t_{i, unk}\rVert_2$ is considered as an indicator for private data. Next, LEAD employs a two-component Gaussian Mixture Model to estimate the distribution of $\lVert\mathbf{z}^t_{i, unk}\rVert_2$. Thereafter, LEAD devises a metric named "common score" $\epsilon_{i, c}$ that accounts for distances to both target prototypes and source anchors (derived from $h^t_{\theta}$) to facilitate deriving instance-level decision boundary $\rho_{i, c}$. LEAD provides an elegant solution to distinguish target-private data, mitigating the need for tedious hand-crafted threshold tuning or dependence on time-consuming iterative clustering. LEAD could also serve as a complementary approach to most existing SF-UniDA methods.
  • Figure 3: Robustness analysis. (a) shows the sensitivity to $\lambda$ on Office-31. (b) presents the H-score curves on VisDA. (c-d) present robustness analysis when varying the unknown private categories.
  • Figure 4: Methodology analysis. (a) compares the effectiveness of our instance-level decision strategy against the vanilla global decision strategy. (b) examines the efficacy of using Entropy as the indicator for private data.
  • Figure 5: Representative examples from the benchmark datasets used in our study, illustrating various types of domain shift. The selected samples highlight the distinct characteristics and environments of the domains within each dataset.
  • ...and 3 more figures
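
The Figure 2 caption mentions fitting a two-component Gaussian Mixture Model to the norms $\lVert\mathbf{z}^t_{i, unk}\rVert_2$. The sketch below shows one way such a fit could look, using a small hand-written EM loop on one-dimensional data. It is an illustration under stated assumptions, not the paper's procedure: the function `fit_two_component_gmm`, its initialization, and the synthetic norms are hypothetical, and the common score $\epsilon_{i, c}$ and instance-level boundary $\rho_{i, c}$ are not reproduced because their formulas are not given here.

```python
import numpy as np

def fit_two_component_gmm(x, n_iter=100):
    """Fit a 1-D two-component Gaussian mixture with plain EM."""
    x = np.asarray(x, dtype=float)
    # Hypothetical initialization: means at the lower and upper quartiles.
    mu = np.array([np.percentile(x, 25), np.percentile(x, 75)])
    var = np.full(2, x.var() + 1e-6)
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point.
        log_p = (-0.5 * (x[:, None] - mu) ** 2 / var
                 - 0.5 * np.log(2 * np.pi * var) + np.log(pi))
        log_p -= log_p.max(axis=1, keepdims=True)   # numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0) + 1e-12
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
        pi = nk / nk.sum()
    return mu, var, pi, resp

# Synthetic norms: a low-norm group and a high-norm group (illustrative only).
rng = np.random.default_rng(0)
norms = np.concatenate([rng.normal(0.2, 0.05, 200),
                        rng.normal(0.8, 0.05, 200)])
mu, var, pi, resp = fit_two_component_gmm(norms)

# Treat the component with the larger mean as the "private-like" one.
private_prob = resp[:, np.argmax(mu)]
```

Here `private_prob` is the posterior probability that a sample belongs to the high-norm component, which gives a soft, data-driven alternative to a fixed threshold on the norm.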