Latent Covariate Shift: Unlocking Partial Identifiability for Multi-Source Domain Adaptation

Yuhang Liu, Zhen Zhang, Dong Gong, Mingming Gong, Biwei Huang, Anton van den Hengel, Kun Zhang, Javen Qinfeng Shi

TL;DR

The latent content variable can be identified up to block identifiability, owing to its versatile yet distinct causal structure. This result is anchored in a novel MSDA method that learns the label distribution conditioned on the identifiable latent content variable, thereby accommodating more substantial distribution shifts.

Abstract

Multi-source domain adaptation (MSDA) addresses the challenge of learning a label prediction function for an unlabeled target domain by leveraging both the labeled data from multiple source domains and the unlabeled data from the target domain. Conventional MSDA approaches often rely on covariate shift or conditional shift paradigms, which assume a consistent label distribution across domains. However, this assumption proves limiting in practical scenarios where label distributions do vary across domains, diminishing its applicability in real-world settings. For example, animals from different regions exhibit diverse characteristics due to varying diets and genetics. Motivated by this, we propose a novel paradigm called latent covariate shift (LCS), which introduces significantly greater variability and adaptability across domains. Notably, it provides a theoretical assurance for recovering the latent cause of the label variable, which we refer to as the latent content variable. Within this new paradigm, we present an intricate causal generative model by introducing latent noises across domains, along with a latent content variable and a latent style variable to achieve more nuanced rendering of observational data. We demonstrate that the latent content variable can be identified up to block identifiability due to its versatile yet distinct causal structure. We anchor our theoretical insights into a novel MSDA method, which learns the label distribution conditioned on the identifiable latent content variable, thereby accommodating more substantial distribution shifts. The proposed approach showcases exceptional performance and efficacy on both simulated and real-world datasets.
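
The latent covariate shift paradigm can be illustrated with a toy simulation. The sketch below is not the authors' model or method: the Gaussian distribution of the latent content variable, the sigmoid labelling rule, the domain means, and all names (`sample_domain`, `domain_means`, `marginals`, `conditionals`) are assumptions chosen only for illustration. It shows that when the distribution of the latent content variable changes across domains while the label mechanism given that variable stays fixed, the marginal label distribution varies across domains but the conditional label rate at a fixed latent value does not.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_domain(mean_zc, n=200_000):
    """Draw (z_c, y) pairs for one domain (hypothetical toy model).

    Only the mean of the latent content variable z_c depends on the domain;
    the labelling rule p(y=1 | z_c) = sigmoid(2 * z_c) is shared by all domains.
    """
    z_c = rng.normal(loc=mean_zc, scale=1.0, size=n)
    p_y = 1.0 / (1.0 + np.exp(-2.0 * z_c))
    y = (rng.random(n) < p_y).astype(int)
    return z_c, y

# Assumed domain-specific means of z_c (illustrative values only).
domain_means = {"u1": -1.0, "u2": 0.0, "u3": 1.5}

marginals = {}     # estimate of p_u(y = 1) in each domain
conditionals = {}  # estimate of p_u(y = 1 | 0.2 <= z_c < 0.4) in each domain
for u, mean in domain_means.items():
    z_c, y = sample_domain(mean)
    marginals[u] = y.mean()
    in_bin = (z_c >= 0.2) & (z_c < 0.4)
    conditionals[u] = y[in_bin].mean()

for u in domain_means:
    print(f"{u}: p(y=1) = {marginals[u]:.3f}, "
          f"p(y=1 | z_c in [0.2, 0.4)) = {conditionals[u]:.3f}")
```

In this toy setting the printed marginal label rates differ substantially between domains, while the binned conditional rates agree up to sampling noise, which is the behaviour the paradigm assumes for the label given the latent content variable.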

Paper Structure (25 sections, 3 theorems, 18 equations, 9 figures, 5 tables)

Key Result

Proposition 4.1

Suppose that the latent causal variables $\mathbf{z}$ and the observed variable $\mathbf{x}$ follow the latent causal models defined in Eq. expfam-mixing. Given the observational data distribution $p(\mathbf{y},\mathbf{x}|\mathbf{u})$, there exists an alternative solution, which can yield exactly the same observ

Figures (9)

  • Figure 1: Illustration of three paradigms for MSDA. Covariate Shift: $p_{\mathbf{u}}(\mathbf{x})$ changes across domains, while $p_{\mathbf{u}}(\mathbf{y}|\mathbf{x})$ is invariant across domains. Conditional Shift: $p_{\mathbf{u}}(\mathbf{y})$ is invariant, while $p_{\mathbf{u}}(\mathbf{x}|\mathbf{y})$ changes across domains. Latent Covariate Shift: $p_{\mathbf{u}}(\mathbf{z}_c)$ changes across domains, while $p_{\mathbf{u}}(\mathbf{y}|\mathbf{z}_c)$ is invariant.
  • Figure 2: (a) The proposed latent causal model, which splits the latent noise variables $\mathbf{n}$ into two disjoint parts, $\mathbf{n}_c$ and $\mathbf{n}_s$. (b) An equivalent graph structure that can generate the same observed data $\mathbf{x}$ as (a), which leads to a non-identifiability result.
  • Figure 3: The proposed iLCC-LCS, which learns the invariant $p(\mathbf{y}|\mathbf{n}_c)$ for multi-source domain adaptation. C denotes concatenation, and S denotes sampling from the posterior distributions.
  • Figure 4: Results on synthetic data. Because the conditional distribution $p(\mathbf{y}|\mathbf{n}_c)$ is invariant, the learned $p(\mathbf{y}|\mathbf{n}_c)$ can generalize to the target segment in a principled way even when $p(\mathbf{n}_c)$ changes, as shown in Figure \ref{syn:graph a}.
  • Figure 5: Classification results on resampled PACS data.
  • ...and 4 more figures

Theorems & Definitions (4)

  • Proposition 4.1
  • Proposition 4.2
  • Remark 4.3
  • Lemma 1