Partial Identifiability for Domain Adaptation

Lingjing Kong, Shaoan Xie, Weiran Yao, Yujia Zheng, Guangyi Chen, Petar Stojanov, Victor Akinwande, Kun Zhang

TL;DR

This work tackles unsupervised domain adaptation (UDA) by addressing the identifiability of the cross-domain joint distribution. It proposes a latent-variable data-generating process that partitions the latent space into invariant components $\mathbf{z}_{c}$ and changing components $\mathbf{z}_{s}$. A high-level invariant variable $\tilde{\mathbf{z}}_{s}$ is related to the changing components through a monotonic function, which models domain shift under a minimal-change constraint. The authors prove partial identifiability of the changing components, the invariant subspace, and the joint distribution in the shared space. They then implement iMSDA, a VAE-based framework with a flow model that recovers the latent variables and enables prediction in the target domain. Empirical results on synthetic data and on real-world benchmarks (PACS and Office-Home) show state-of-the-art performance and support the theoretical identifiability claims, with ablations clarifying the roles of the loss terms and of the changing-part dimension. Overall, the paper provides a principled approach to domain alignment and target-domain prediction in multi-source UDA.

Abstract

Unsupervised domain adaptation is critical to many real-world applications where label information is unavailable in the target domain. In general, without further assumptions, the joint distribution of the features and the label is not identifiable in the target domain. To address this issue, we rely on the property of minimal changes of causal mechanisms across domains to minimize unnecessary influences of distribution shifts. To encode this property, we first formulate the data-generating process using a latent variable model with two partitioned latent subspaces: invariant components whose distributions stay the same across domains and sparse changing components that vary across domains. We further constrain the domain shift to have a restrictive influence on the changing components. Under mild conditions, we show that the latent variables are partially identifiable, from which it follows that the joint distribution of data and labels in the target domain is also identifiable. Given the theoretical insights, we propose a practical domain adaptation framework called iMSDA. Extensive experimental results reveal that iMSDA outperforms state-of-the-art domain adaptation algorithms on benchmark datasets, demonstrating the effectiveness of our framework.
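
The data-generating process described above can be written compactly as follows. This is a sketch that reuses the summary's notation, with $\mathbf{u}$ the domain index, $g$ the mixing function, and $f_{\mathbf{u}}$ the domain-specific monotonic map; the direction of $f_{\mathbf{u}}$ shown here is an assumption, and the paper's exact formulation may differ.

$$
\mathbf{z}_{c} \sim p_{\mathbf{z}_{c}}, \qquad
\tilde{\mathbf{z}}_{s} \sim p_{\tilde{\mathbf{z}}_{s}}, \qquad
\mathbf{z}_{s} = f_{\mathbf{u}}(\tilde{\mathbf{z}}_{s}), \qquad
\mathbf{x} = g(\mathbf{z}_{c}, \mathbf{z}_{s}).
$$

Under this reading, only $\mathbf{z}_{s}$ depends on the domain $\mathbf{u}$, while the distributions of $\mathbf{z}_{c}$ and $\tilde{\mathbf{z}}_{s}$ are shared across domains.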

Paper Structure

This paper contains 37 sections, 4 theorems, 25 equations, 4 figures, 6 tables, 1 algorithm.

Key Result

Theorem 4.1

Assume the data-generating process in Equation (eq:data_generating_process) together with the assumptions stated for this theorem. Then, by learning $(\hat{g}, \, p_{\hat{\mathbf{z}}_{c}}, \, p_{\hat{\mathbf{z}}_{s} | \mathbf{u}})$ to achieve Equation (eq:estimated_condition), $\mathbf{z}_{s}$ is component-wise identifiable.
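
Component-wise identifiability is commonly formalized as recovery up to a permutation and an invertible transformation of each coordinate. The statement below is a sketch of that standard notion, not a quotation from the paper; the symbols $\pi$, $h_{i}$, and $n_{s}$ (the dimension of $\mathbf{z}_{s}$) are introduced here for illustration.

$$
\hat{z}_{s,i} = h_{i}\!\left(z_{s,\pi(i)}\right), \qquad i = 1, \dots, n_{s},
$$

where $\pi$ is a permutation of $\{1, \dots, n_{s}\}$ and each $h_{i}$ is an invertible scalar function. In other words, each estimated changing component depends on exactly one true changing component.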

Figures (4)

  • Figure 1: The generating process: The gray shade of nodes indicates that the variable is observable.
  • Figure 2: Diagram of our proposed method, iMSDA. We first apply the VAE encoder $(f_{\mu}, f_{\Sigma})$ to encode ${\mathbf x}$ into $(\hat{{\mathbf z}}_{c}, \hat{{\mathbf z}}_{s})$, which is further fed into the decoder $\hat{g}$ for reconstruction. In parallel, the changing part $\hat{{\mathbf z}}_{s}$ is passed through the flow model $f_{{\mathbf u}}$ to recover the high-level invariant variable $\hat{\tilde{{\mathbf z}}}_{s}$. We use $(\hat{{\mathbf z}}_{c}, \hat{\tilde{{\mathbf z}}}_{s})$ for classification with the classifier $f_{\text{cls}}$ and for matching $\mathcal{N}(\mathbf{0}, \mathbf{I})$ with a KL loss.
  • Figure 3: Scatter plots of the true and estimated components from our method trained with 9 domains. The $y$ (resp. $x$) axis of each subplot corresponds to a specific estimated (resp. true) latent variable. The changing components are identified in a component-wise manner (subplots "Estimated S" and "True S"), which verifies Theorem 4.1. The true invariant components are (partially) identified within their own subspace (subplots "Estimated C" and "True C") while influencing the changing components minimally, consistent with Theorem 4.2.
  • Figure 4: The t-SNE visualizations of learned features on the $\rightarrow$ Sketch task in PACS. Red: source domains, Blue: target domain.
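
The flow and KL steps described in the Figure 2 caption can be illustrated with a toy sketch in plain Python. This is not the authors' implementation: a component-wise affine map with positive scale stands in for the flow model, and the names `DOMAIN_PARAMS`, `to_changing`, `to_invariant`, and `kl_to_standard_normal`, as well as all parameter values, are hypothetical.

```python
import math
import random

# Hypothetical per-domain parameters. A positive scale keeps each
# component-wise map strictly increasing, hence monotonic and invertible.
DOMAIN_PARAMS = {
    0: {"scale": [1.0, 0.5], "shift": [0.0, 2.0]},
    1: {"scale": [2.0, 1.5], "shift": [-1.0, 0.5]},
}


def to_changing(z_tilde, u):
    """Generative direction: map the invariant variable to domain u's changing part."""
    p = DOMAIN_PARAMS[u]
    return [a * z + b for a, b, z in zip(p["scale"], p["shift"], z_tilde)]


def to_invariant(z_s, u):
    """Inference direction: invert the map to recover the invariant variable."""
    p = DOMAIN_PARAMS[u]
    return [(z - b) / a for a, b, z in zip(p["scale"], p["shift"], z_s)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over dimensions."""
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    z_tilde = [rng.gauss(0.0, 1.0) for _ in range(2)]
    for u in DOMAIN_PARAMS:
        z_s = to_changing(z_tilde, u)       # differs across domains
        recovered = to_invariant(z_s, u)    # matches z_tilde in every domain
        print(u, z_s, recovered)
    print(kl_to_standard_normal([0.5, -0.5], [0.0, 0.0]))
```

The same `z_tilde` yields a different `z_s` in each domain, yet inverting the domain-specific map recovers the shared variable. That shared representation is what the caption describes as being passed to the classifier and matched to $\mathcal{N}(\mathbf{0}, \mathbf{I})$.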

Theorems & Definitions (6)

  • Theorem 4.1
  • Theorem 4.2
  • Theorem 1.1
  • proof
  • Theorem 1.1
  • proof