CASUAL: Conditional Support Alignment for Domain Adaptation with Label Shift

Anh T Nguyen; Lam Tran; Anh Tong; Tuan-Duy H. Nguyen; Toan Tran

TL;DR

This paper addresses unsupervised domain adaptation under label distribution shift, where the source and target label distributions differ and the class-conditional feature distributions can be misaligned. The authors introduce CASUAL, a Conditional Adversarial SUpport ALignment framework that aligns the supports of class-conditional feature representations, motivated by a target risk bound based on the conditional symmetric support divergence (CSSD). The method combines source supervision, target entropy minimization, Lipschitz regularization, and pseudo-label-guided joint-support alignment via an adversarial discriminator, yielding a training objective derived from the bound. Empirical results on USPS→MNIST, STL→CIFAR, and VisDA-2017 show robust performance under varying label shift; ablations confirm the role of each loss term, and visualizations illustrate improved class separation and reduced CSSD. Overall, CASUAL offers a principled, label-shift-robust approach to learning discriminative representations without explicitly estimating the shifted label distribution.

Abstract

Unsupervised domain adaptation (UDA) refers to a domain adaptation framework in which a learning model is trained on labeled samples from the source domain and unlabeled samples from the target domain. The dominant existing methods, which rely on the classical covariate shift assumption to learn domain-invariant feature representations, yield suboptimal performance under label distribution shift. In this paper, we propose a novel method, Conditional Adversarial SUpport ALignment (CASUAL), which minimizes the conditional symmetric support divergence between the source and target domains' feature representation distributions in order to obtain a more discriminative representation for the classification task. We also introduce a novel theoretical target risk bound, which justifies the merits of aligning the supports of conditional feature distributions over the existing marginal support alignment approach in UDA settings. We then provide a complete training process in which the optimization objectives are derived directly from the proposed target risk bound. Our empirical results demonstrate that CASUAL outperforms other state-of-the-art methods on different UDA benchmark tasks under different label shift conditions.

Paper Structure (15 sections, 4 theorems, 12 equations, 3 figures, 4 tables, 1 algorithm)

Introduction
Methodology
Problem Statement
A support misalignment-based target risk bound
A novel conditional SSD-based domain adaptation bound
Training scheme for our CASUAL
Minimizing source-guided uncertainty
Enforcing Lipschitz hypothesis
Minimizing conditional symmetric support divergence
Experiments
Setup
Main results
Visualization and additional analysis
Related works
Conclusion

Key Result

Proposition 1

Assume that $P^S(Y=y) > 0$ and $P^T(Y=y) > 0$ for every $y \in \mathcal{Y}$. Then $\mathcal{D}^c_{\mathop{\mathrm{supp}}\nolimits}(P^S_{Z|Y}, P^T_{Z|Y})$, the conditional symmetric support divergence of Definition 4, is a support divergence.
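To make the quantity in Proposition 1 concrete, the following is a minimal sketch of a finite-sample CSSD estimate. It rests on assumptions not stated on this page: that the symmetric support divergence takes the form of an expected distance from each distribution to the other's support, that the conditional version sums these per-class terms weighted by the class priors, and that the distance to a support can be approximated by the distance to the nearest sample. The function names `directed_support_distance` and `empirical_cssd` are hypothetical, and in the unsupervised setting the target labels would have to be pseudo-labels.

```python
import numpy as np


def directed_support_distance(a, b):
    """Mean Euclidean distance from each point in `a` to its nearest point in `b`.

    Assumption: the nearest sample in `b` stands in for the support of b's distribution.
    """
    diffs = a[:, None, :] - b[None, :, :]      # shape (len(a), len(b), dim)
    dists = np.linalg.norm(diffs, axis=-1)     # pairwise distances
    return dists.min(axis=1).mean()


def empirical_cssd(z_src, y_src, z_tgt, y_tgt, num_classes):
    """Prior-weighted sum of per-class symmetric support distances (assumed form)."""
    total = 0.0
    for c in range(num_classes):
        src_c = z_src[y_src == c]
        tgt_c = z_tgt[y_tgt == c]
        # Proposition 1 assumes positive class priors in both domains;
        # with finite samples a class may still be empty, so it is skipped here.
        if len(src_c) == 0 or len(tgt_c) == 0:
            continue
        p_src = len(src_c) / len(z_src)        # empirical P^S(Y=c)
        p_tgt = len(tgt_c) / len(z_tgt)        # empirical P^T(Y=c)
        total += p_src * directed_support_distance(src_c, tgt_c)
        total += p_tgt * directed_support_distance(tgt_c, src_c)
    return total


# Two domains occupying the same two points in feature space.
z = np.array([[0.0, 0.0], [1.0, 0.0]])
labels = np.array([0, 1])
swapped = np.array([1, 0])

aligned = empirical_cssd(z, labels, z, labels, num_classes=2)
misaligned = empirical_cssd(z, labels, z, swapped, num_classes=2)
```

In this toy case the marginal supports of the two domains coincide in both calls, yet the estimate is zero only when the class-conditional supports also coincide; with the labels swapped it is positive. This is the kind of misalignment that a marginal support divergence cannot detect.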

Figures (3)

Figure 1: Illustration of the learned latent space of different domain-invariant frameworks under label shift for a binary classification problem. It can be seen that the support alignment (b) can mitigate the high error rate induced by distribution alignment (a), whereas the conditional support alignment (c) can achieve the best representation by explicitly aligning the supports of class-conditioned latent distributions.
Figure 2: Visualization of support of feature representations for 3 classes in the USPS $\to$ MNIST task. Each plot illustrates the 2 level sets of kernel density estimates for both the source and target features.
Figure 3: Left: accuracy of various algorithms during training. Right: computed CSSD (Definition 4) for the learned features.

Theorems & Definitions (11)

Definition 1: Source-guided uncertainty [dhouib2022connecting]
Remark 1
Definition 2: Integral measure discrepancy [dhouib2022connecting]
Definition 3: Symmetric support divergence [tong2022adversarial]
Definition 4: Conditional symmetric support divergence
Proposition 1
Lemma 1: Upper bound IMD using CSSD
Theorem 1: Domain adaptation bound via CSSD
Remark 2
Remark 3
...and 1 more
