Incremental Pseudo-Labeling for Black-Box Unsupervised Domain Adaptation

Yawen Zou; Chunzhi Gu; Jun Yu; Shangce Gao; Chao Zhang

TL;DR

This work tackles Black-Box Unsupervised Domain Adaptation (BBUDA), where only predictions from a pre-trained source model are accessible. It introduces an incremental pseudo-labeling framework that first builds a crude target model from source predictions and then progressively enlarges a high-confidence pseudo-label set using three criteria: softmax-thresholding with $\alpha$, prototype labels derived via nearest-centroids through teacher-student distillation with a curvature-based threshold $\beta$, and intra-class similarity with threshold $\theta$. The selected high-confidence samples train a stronger target network to correct remaining low-confidence labels, guided by a composite objective $L_t = L_{kd} + L_{im} + L_{mix}$ and followed by mutual information-based finetuning. Experiments on Office, Office-Home, and VisDA-C show state-of-the-art BBUDA performance and validate the contributions through ablations and parameter studies, while highlighting potential class-imbalance issues and directions for future work on weighted losses.

Abstract

Black-box unsupervised domain adaptation (BBUDA) learns using only the source model's predictions on the target data, without access to either the source data or the source model itself, which alleviates concerns about data privacy and security. However, because of the cross-domain discrepancy, incorrect pseudo-labels are prevalent among the predictions generated by the source model, and they may substantially degrade the performance of the target model. To address this problem, we propose a novel approach that incrementally selects high-confidence pseudo-labels to improve the generalization ability of the target model. Specifically, we first generate pseudo-labels using the source model and train a crude target model with a vanilla BBUDA method. Second, we iteratively select high-confidence data from the low-confidence data pool by thresholding the softmax probabilities, prototype labels, and intra-class similarity. Then, we iteratively train a stronger target network based on the crude target model to correct the wrongly labeled samples and thereby improve pseudo-label accuracy. Experimental results demonstrate that the proposed method achieves state-of-the-art BBUDA performance on three benchmark datasets.
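The selection step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `select_high_confidence`, the probability-weighted centroids, the use of cosine similarity to a sample's own class centroid as a stand-in for intra-class similarity, and the default values of `alpha` and `theta` are all assumptions made for illustration. The paper's curvature-based threshold is not reproduced here.

```python
import numpy as np

def select_high_confidence(probs, feats, alpha=0.9, theta=0.5):
    """Hypothetical selection rule combining three checks.

    probs: (n, k) softmax predictions for n target samples.
    feats: (n, d) feature vectors for the same samples.
    Returns a boolean mask of selected samples, the hard labels,
    and the nearest-centroid (prototype) labels.
    """
    hard = probs.argmax(axis=1)   # hard pseudo-labels
    conf = probs.max(axis=1)      # softmax confidence

    # L2-normalise features so dot products are cosine similarities.
    f = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + 1e-12)

    # Assumption: class centroids are probability-weighted feature means.
    centroids = probs.T @ f
    centroids /= np.linalg.norm(centroids, axis=1, keepdims=True) + 1e-12

    sim = f @ centroids.T         # (n, k) similarity to every centroid
    proto = sim.argmax(axis=1)    # prototype label = nearest centroid
    own_sim = sim[np.arange(len(f)), hard]  # similarity to own class

    # Keep a sample only if all three checks agree (assumed combination).
    mask = (conf >= alpha) & (proto == hard) & (own_sim >= theta)
    return mask, hard, proto


probs = np.array([[0.95, 0.05], [0.92, 0.08], [0.05, 0.95],
                  [0.09, 0.91], [0.55, 0.45]])
feats = np.array([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0],
                  [0.1, 1.0], [1.0, 1.0]])
mask, hard, proto = select_high_confidence(probs, feats)
```

In this toy example the last sample sits between the two clusters and has low softmax confidence, so it stays in the low-confidence pool while the other four are selected.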

Paper Structure (14 sections, 15 equations, 4 figures, 4 tables, 1 algorithm)


Introduction
Related work
Method
Source model and crude target model generation
Selection of high-confidence data
Thresholding of softmax probabilities
Prototype labels
Intra-class similarity
Overall Objectives
Experiment
Setup
Results
Analysis
Conclusion

Figures (4)

Figure 1: Illustration of different UDA settings. Source data is required in traditional UDA, whereas WBUDA is source-data-free but requires access to the source model. BBUDA requires only the black box's predictions on the target data, without access to either the source data or the source model.
Figure 2: Overview of the proposed framework, consisting of warm-up and incremental processes. For the warm-up process, a trained source network provides the predictions of target data (soft label $\hat{y}^{src}$ and hard label $\hat{y}^{src}_{\mathds{1}}$). A student network is distilled from the source model to extract the features $X$ of the target data and obtain prototype labels $\hat{y}^{t}_{\mathds{1}p}$. Then, high-confidence samples are selected and passed to the incremental process. For the incremental process, the $r$-th cycle's prototype labels and features $X$ are updated by the target network $\phi^{r}$ instead of the student network. Then, high-confidence data are selected from the low-confidence data pool in each iteration via pseudo-label consistency and intra-class similarity until the percentage of low-confidence data is less than $\lambda$.
Figure 3: Illustration of strategies used for selecting high-confidence data.
Figure 4: Parameter Analysis
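
The stopping rule in the Figure 2 caption (iterate until the share of low-confidence data drops below $\lambda$) can be sketched as a simple loop. This is an illustrative outline only: `incremental_selection`, the `select_fn` callback, and `max_rounds` are hypothetical names, and the callback stands in for everything the paper does inside a cycle (retraining the target network, refreshing features and prototype labels, and applying the selection criteria).

```python
import numpy as np

def incremental_selection(n_samples, select_fn, lam=0.1, max_rounds=20):
    """Grow the high-confidence set until the low-confidence share < lam.

    select_fn(r, low_idx) is a hypothetical callback that returns a boolean
    mask over low_idx marking the samples promoted in cycle r.
    """
    high = np.zeros(n_samples, dtype=bool)  # True = high-confidence
    history = []                            # size of the set after each cycle
    for r in range(max_rounds):
        low_idx = np.flatnonzero(~high)     # current low-confidence pool
        if len(low_idx) / n_samples < lam:
            break                           # stopping rule from the caption
        promoted = select_fn(r, low_idx)
        if not promoted.any():
            break                           # assumed guard: no progress
        high[low_idx[promoted]] = True
        history.append(int(high.sum()))
    return high, history


# Toy callback: promote the first half of the remaining pool each cycle.
def promote_half(r, low_idx):
    return np.arange(len(low_idx)) < len(low_idx) // 2

high, history = incremental_selection(100, promote_half, lam=0.1)
```

With this toy callback the pool shrinks from 100 to 7 samples over four cycles, at which point the low-confidence share (0.07) falls below `lam` and the loop stops.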
