Inter-Domain Mixup for Semi-Supervised Domain Adaptation

Jichang Li; Guanbin Li; Yizhou Yu

TL;DR

The paper tackles semi-supervised domain adaptation (SSDA) by addressing label mismatch during cross-domain alignment. It introduces IDMNE, which combines Inter-domain Mixup, consisting of sample-level and manifold-level data mixing (SDM and MDM), to inject reliable cross-domain supervision, with Neighborhood Expansion, consisting of positive and negative self-regularization and Pairwise Approaching (PSR, NSR, and PA), to leverage high-confidence pseudo-labels from the target domain. The paper reports that the approach outperforms existing state-of-the-art methods on DomainNet, Office-Home, and Office-31 across multiple backbones and settings, supported by ablations, calibration analysis, and an ACCD-based assessment of cross-domain alignment. By fusing label-aware cross-domain mixing with pseudo-label-driven expansion, IDMNE offers a practical path to more discriminative, domain-invariant representations in SSDA.

Abstract

Semi-supervised domain adaptation (SSDA) aims to bridge the source and target domain distributions when a small number of target labels are available, achieving better classification performance than unsupervised domain adaptation (UDA). However, existing SSDA work fails to make full use of label information from both source and target domains for cross-domain feature alignment, resulting in label mismatch in the label space during model testing. This paper presents a novel SSDA approach, Inter-domain Mixup with Neighborhood Expansion (IDMNE), to tackle this issue. First, we introduce a cross-domain feature alignment strategy, Inter-domain Mixup, that incorporates label information into model adaptation. Specifically, we employ sample-level and manifold-level data mixing to generate compatible training samples. These newly generated samples, paired with reliable ground-truth label information, are diverse and compatible across domains, and the extra supervision they provide facilitates cross-domain feature alignment and mitigates label mismatch. Additionally, we utilize Neighborhood Expansion to leverage high-confidence pseudo-labeled samples in the target domain, diversifying the label information of the target domain and thereby further improving the performance of the adaptation model. Accordingly, the proposed approach outperforms existing state-of-the-art methods, achieving significant accuracy improvements on popular SSDA benchmarks, including DomainNet, Office-Home, and Office-31.
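
As a rough illustration of the sample-level and manifold-level mixing mentioned in the abstract, the sketch below interpolates one labeled source item with one labeled target item and mixes their one-hot labels with the same ratio. It is a minimal sketch, not the authors' implementation: the function names, the use of plain Python lists, the choice to weight the source item by `lam`, and passing `lam` in directly (rather than sampling it) are all assumptions made for illustration. The same helper can be applied to raw inputs (sample level) or to extracted feature vectors (manifold level).

```python
def one_hot(label, num_classes):
    """Return a one-hot list for an integer class label."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def mix(a, b, lam):
    """Elementwise convex interpolation: lam * a + (1 - lam) * b."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]


def inter_domain_mixup(v_s, y_s, v_t, y_t, num_classes, lam):
    """Mix a labeled source item with a labeled target item.

    v_s, v_t: flat vectors (raw inputs for sample-level mixing, or
              feature vectors for manifold-level mixing).
    y_s, y_t: integer class labels.
    lam:      mixup ratio; weighting the source item by lam is an
              assumption of this sketch.
    Returns the mixed vector and the mixed soft label.
    """
    v_m = mix(v_s, v_t, lam)
    y_m = mix(one_hot(y_s, num_classes), one_hot(y_t, num_classes), lam)
    return v_m, y_m


# Hypothetical toy pair: a source item of class 0 and a target item of class 1.
v_m, y_m = inter_domain_mixup([1.0, 0.0], 0, [0.0, 1.0], 1, num_classes=2, lam=0.25)
print(v_m, y_m)
```

The mixed soft label can then supervise the mixed item through a standard cross-entropy loss against the classifier's output.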

Paper Structure (18 sections, 12 equations, 9 figures, 8 tables, 1 algorithm)

Introduction
Related Work
Domain Adaptation
Semi-supervised Domain Adaptation
Data Mixup
The Proposed Method
Problem Formulation and Notation
Inter-Domain Mixup
Neighborhood Expansion
Self-Regularization
Pairwise Approaching
Overall Loss Function
Experiments
Datasets
Experimental Protocols
...and 3 more sections

Figures (9)

Figure 1: A conceptual description of our basic idea. Sample points in brown, green, and red represent source domain data, target domain data, and class prototypes, respectively. Arrows in purple indicate that sample points move towards the prototype of Class 1, while blue arrows illustrate that the prototype of Class 2 attracts samples from the corresponding class towards itself. Left: Previous label-free strategies to enforce domain-level feature alignment fail to generate discriminative target features, thereby giving rise to cross-domain label mismatch in label space. Middle: Our approach incorporates label information into adaptation, and thus, the model can align class-wise sample features from both domains with the aid of their class labels. Right: The proposed approach enables the model to produce domain-invariant and discriminative features and thus enhance the performance of the model.
Figure 2: An overview of our proposed Inter-domain Mixup with Neighborhood Expansion for semi-supervised domain adaptation. We use arrows with different line styles to represent data flow, where a black arrow denotes labeled source or target domain data, and a red arrow indicates mixed samples or mixed features. Our model includes an extractor for feature generation and a classifier for object classification. We train our model with six loss terms: $\mathcal{L}_{\bm{sup}}$ provides supervision over labeled data from both domains; $\mathcal{L}_{\bm{sdm}}$ and $\mathcal{L}_{\bm{mdm}}$ implement Inter-domain Mixup to perform cross-domain class-wise feature alignment; and the remaining $\mathcal{L}_{\bm{psr}}$, $\mathcal{L}_{\bm{nsr}}$ and $\mathcal{L}_{\bm{pa}}$ implement Neighborhood Expansion to make predictions on unlabeled target domain data more confident. To further leverage unlabeled samples in the target domain, we also employ pseudo labeling to assign pseudo-labels to unlabeled target domain samples with high probability scores and merge the selected pseudo-labeled target domain samples into the labeled target domain set.
Figure 3: A flow diagram of Inter-domain Mixup. A solid line represents a flow of sample-level data mixing (SDM), while a dashed line indicates a flow of manifold-level data mixing (MDM). $\mathcal{M}_{\lambda_{1}}(\cdot;\cdot)$ and $\mathcal{M}_{\lambda_{2}}(\cdot;\cdot)$ are two mixup functions, where $\lambda_{1}$ and $\lambda_{2}$ are two different mixup ratios. For the SDM flow, we obtain a mixup sample $(x^m, y^m)$ by mixing a source-target sample pair, consisting of a labeled source domain sample $(x^s, y^s)$ and a labeled target domain sample $(x^t, y^t)$, through a linear convex interpolation. For the MDM flow, two feature representations $f^s$ and $f^t$, along with their original labels $y^s$ and $y^t$, are mixed to generate an augmented feature $f^m$ and its associated label $\tilde{y}^m$. Afterwards, extra supervision over these two types of mixup points, i.e., $(x^m, y^m)$ and $(f^m, \tilde{y}^m)$, is performed via a standard cross-entropy loss function.
Figure 4: Illustrations of three schemes applied in Neighborhood Expansion to encourage low-entropy and high-confidence predictions for unlabeled target domain samples. (a) Positive self-regularization learning (PSR) introduces self-training to improve the model's robustness. (b) Negative self-regularization learning (NSR) raises the predicted probabilities of every class except the one with the lowest predicted probability score. (c) Pairwise Approaching (PA) drives high-confidence unlabeled target domain samples towards labeled data of the same class in the target domain. Note that PSR and PA handle samples from $\mathcal{D}_{u}$ whose predicted-class confidence scores are above the confidence threshold $\tau$, while NSR applies to unlabeled target domain samples with confidence scores lower than $\tau$.
Figure 5: Hyper-parameter sensitivity to the confidence threshold $\tau$. While varying $\tau$, we show the evolution of (a) the test accuracy in the target domain, (b) the number of pseudo-labels, i.e., samples from $\mathcal{D}_{u}$ whose maximum predicted class probability is larger than $\tau$, (c) the number of correct pseudo-labels, and (d) the accuracy of the pseudo-labels, obtained by comparing (b) with (c). Different colors denote different values of $\tau$. We carry out these experiments on DomainNet in the adaptation scenario "$R \rightarrow S$" under the 3-shot setting using ResNet-34.
...and 4 more figures
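
The confidence-threshold routing described in the captions of Figures 2 and 4 can be sketched roughly as follows: unlabeled target predictions whose maximum class probability exceeds $\tau$ receive a pseudo-label (the group handled by PSR and PA), while the rest form the low-confidence group handled by NSR. This is only an illustrative sketch under stated assumptions: the function name `route_by_confidence`, the list-based data layout, sending ties (confidence equal to the threshold) to the low-confidence group, and the threshold value 0.7 in the usage lines are all hypothetical and not taken from the paper.

```python
def route_by_confidence(probs, tau):
    """Split unlabeled target predictions by their maximum class probability.

    probs: list of per-sample class-probability lists.
    tau:   confidence threshold.
    Returns (confident, uncertain), where confident is a list of
    (sample_index, pseudo_label) pairs with confidence above tau, and
    uncertain is a list of the remaining sample indices. Treating a
    confidence exactly equal to tau as uncertain is an assumption.
    """
    confident, uncertain = [], []
    for index, p in enumerate(probs):
        confidence = max(p)
        pseudo_label = p.index(confidence)  # first class attaining the maximum
        if confidence > tau:
            confident.append((index, pseudo_label))
        else:
            uncertain.append(index)
    return confident, uncertain


# Hypothetical predictions for three unlabeled target samples over three classes.
predictions = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.10, 0.10, 0.80],
]
confident, uncertain = route_by_confidence(predictions, tau=0.7)
print(confident)  # [(0, 0), (2, 2)]
print(uncertain)  # [1]
```

Under this sketch, the pairs in `confident` are the candidates that would be merged into the labeled target set as pseudo-labeled samples.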
