Mitigating Negative Transfer via Reducing Environmental Disagreement

Hui Sun, Zheng Xie, Hao-Yuan He, Ming Li

TL;DR

This work tackles negative transfer in unsupervised domain adaptation by reframing transfer through causal disentanglement, identifying non-causal environmental features as the root cause of cross-domain misgeneralization. It introduces RED, a framework that decomposes samples into domain-invariant causal features and domain-specific environmental features, with adversarially trained domain-specific extractors and a learned mixing coefficient to suppress environmental shortcuts. The authors derive a new target-error bound that explicitly includes environmental disagreement via a transition matrix $M$, and they operationalize this insight by estimating $\widehat{M}$ and minimizing $(1-\lambda)(1-\mathrm{tr}(\widehat{M}))$ during training. Empirical results on Office-31, Office-Home, and DomainNet show state-of-the-art performance across diverse backbones (ResNet, DeiT, ViT), with ablations confirming the contribution of environmental-disagreement reduction to improved cross-domain transfer.
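The trace objective mentioned above can be sketched numerically. This is a minimal illustration, not the authors' implementation: it assumes $\widehat{M}$ is a $K \times K$ matrix whose entry $(i, j)$ estimates the joint probability that one environmental classifier predicts class $i$ while the other predicts class $j$, so that $\mathrm{tr}(\widehat{M})$ is the estimated agreement rate. The function names and the batch-averaged outer-product estimator are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def estimate_agreement_matrix(probs_a, probs_b):
    """Assumed estimator: batch average of outer products of two predictions.

    probs_a, probs_b: arrays of shape (batch, K) whose rows sum to 1.
    Returns a (K, K) matrix whose entries sum to 1.
    """
    return probs_a.T @ probs_b / probs_a.shape[0]

def environmental_disagreement_loss(probs_a, probs_b, lam):
    """Computes (1 - lam) * (1 - tr(M_hat)) for the assumed estimator above."""
    m_hat = estimate_agreement_matrix(probs_a, probs_b)
    return (1.0 - lam) * (1.0 - np.trace(m_hat))

# Usage: random logits stand in for the two environmental classifiers' outputs.
rng = np.random.default_rng(0)
probs_a = softmax(rng.normal(size=(8, 3)))
probs_b = softmax(rng.normal(size=(8, 3)))
print(environmental_disagreement_loss(probs_a, probs_b, lam=0.5))
```

Under this assumption the loss is zero when the two predictors agree with full confidence on every sample, and it approaches $(1-\lambda)$ as they disagree completely.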

Abstract

Unsupervised Domain Adaptation (UDA) focuses on transferring knowledge from a labeled source domain to an unlabeled target domain, addressing the challenge of domain shift. Significant domain shifts hinder effective knowledge transfer, leading to negative transfer and deteriorating model performance. Therefore, mitigating negative transfer is essential. This study revisits negative transfer through the lens of causally disentangled learning, emphasizing cross-domain discriminative disagreement on non-causal environmental features as a critical factor. Our theoretical analysis reveals that overreliance on non-causal environmental features as the environment evolves can cause discriminative disagreements (termed environmental disagreement), thereby resulting in negative transfer. To address this, we propose Reducing Environmental Disagreement (RED), which disentangles each sample into domain-invariant causal features and domain-specific non-causal environmental features via adversarially training domain-specific environmental feature extractors in the opposite domains. Subsequently, RED estimates and reduces environmental disagreement based on domain-specific non-causal environmental features. Experimental results confirm that RED effectively mitigates negative transfer and achieves state-of-the-art performance.

Paper Structure

This paper contains 21 sections, 5 theorems, 26 equations, 5 figures, 5 tables, 1 algorithm.

Key Result

Theorem 1

Let $\mathcal{H}$ be the hypothesis space with VC-dimension $d$, and let $\hat{\mathcal{D}}_S$ (resp. $\hat{\mathcal{D}}_T$) represent the empirical distribution induced by a sample of size $n$ drawn from $\mathcal{D}_S$ (resp. $\mathcal{D}_T$). Then, with probability at least $1 - \delta$, for all where $\gamma = \min_{f^* \in \mathcal{H}} \left[\epsilon_S(f^*) + \epsilon_T(f^*) \right]$ represe
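The statement above is cut off in the source before the inequality itself appears. For orientation only, a bound of this type is usually written as follows in the domain adaptation literature, following Ben-David et al. (2010). The hypothesis symbol $f$, the divergence notation $d_{\mathcal{H}\Delta\mathcal{H}}$, and the $O(\cdot)$ term are assumptions taken from that standard form; they are not recovered from this paper's text.

```latex
\epsilon_T(f) \le \hat{\epsilon}_S(f)
  + \frac{1}{2}\, d_{\mathcal{H}\Delta\mathcal{H}}\!\left(\hat{\mathcal{D}}_S, \hat{\mathcal{D}}_T\right)
  + \gamma
  + O\!\left(\sqrt{\frac{d \log n + \log(1/\delta)}{n}}\right),
  \qquad \forall f \in \mathcal{H}
```

Read this way, the target error is controlled by the empirical source error, the discrepancy between the two empirical distributions, and the joint optimal error $\gamma$. Theorem 2 is described as extending a bound of this kind with an environmental disagreement term.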

Figures (5)

  • Figure 1: Disentanglement comparison: Fig. \ref{fig:classic_dis} illustrates feature disentanglement in classic UDA, which assumes absolute independence between domain-specific and class-specific features. Fig. \ref{fig:causally_dis} presents a more general perspective on feature disentanglement through causal inference, capturing the joint interactions between domain and class.
  • Figure 2: The Framework of RED. Two datasets have different fonts, and display environmental features using colored backgrounds. Odd numbers in the source have red backgrounds, while even numbers have blue backgrounds. The target exhibits the opposite pattern.
  • Figure 3: Convergence of RED w/o $\mathcal{L}_{tr}$ on Ar $\rightarrow$ Cl.
  • Figure 4: Fig. \ref{fig:hyper_sen} shows hyper-parameter sensitivity; Fig. \ref{fig:a_distance} shows distribution discrepancy with $\mathcal{A}$-distance for tasks Ar$\rightarrow$Cl and Ar$\rightarrow$Pr on Office-Home; Fig. \ref{fig:test_error} displays a comparison of convergence in test error; and Fig. \ref{fig:test_trace} illustrates the convergence of $\lambda$ and $\mathrm{tr}(M)$ in Eq. \ref{equ:L_tr}.
  • Figure 5: Feature visualizations of CDAN (left) and RED (right): We chose the first 20 classes from Office-Home for the Ar$\rightarrow$Cl task to ensure clarity.
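Figure 4 reports distribution discrepancy as an $\mathcal{A}$-distance, but the source does not say how it is computed. A common stand-in in the UDA literature is the proxy $\mathcal{A}$-distance $2(1 - 2\epsilon)$, where $\epsilon$ is the held-out error of a binary classifier trained to tell source features from target features. The sketch below assumes that proxy; the function name is hypothetical.

```python
def proxy_a_distance(domain_classifier_error: float) -> float:
    """Proxy A-distance, 2 * (1 - 2 * err), from a domain classifier's error.

    domain_classifier_error: held-out error rate in [0, 1] of a classifier
    trained to separate source features from target features (assumed setup).
    """
    if not 0.0 <= domain_classifier_error <= 1.0:
        raise ValueError("error rate must lie in [0, 1]")
    return 2.0 * (1.0 - 2.0 * domain_classifier_error)

# Usage: a chance-level domain classifier (err = 0.5) gives distance 0,
# meaning the two feature distributions are indistinguishable to it.
print(proxy_a_distance(0.5))
```

Lower values indicate that the learned features are harder to assign to a domain, which is the sense in which such plots are typically read.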

Theorems & Definitions (7)

  • Definition 1: Hypothesis Error
  • Theorem 1: Upper Bound on Expected Error in the Target Domain (Ben-David et al., 2010)
  • Theorem 2: Upper Bound on Expected Error Considering Negative Transfer with Environmental Disagreement
  • proof
  • Lemma 1: Upper Bound on Distribution Discrepancy (Zhao et al., 2019)
  • Lemma 2: Empirical Error Bound of Labeling
  • Lemma 3: Empirical Error Bound of Distribution Discrepancy (Zhao et al., 2019)