COD: Learning Conditional Invariant Representation for Domain Adaptation Regression

Hao-Ran Yang, Chuan-Xian Ren, You-Wei Luo

TL;DR

The paper tackles domain adaptation for regression with continuous labels by establishing a sufficiency theory: the generalization error is dominated by the cross-domain discrepancy between the conditional distributions $P_{X|Y}$. It introduces COD, a kernel-based metric that captures differences between conditional distributions via mean embeddings and covariance operators, and proves that zero COD implies identical conditional distributions. Building on this, it proposes a COD-based representation learning framework with a discriminability-enhanced COD_mod term and a Kernel Gaussian Wasserstein (KGW) discrepancy, optimized jointly with a source regression loss $\mathcal{L}_\mathrm{src}$. Experiments on dSprites, Biwi Kinect, and MPI3D show that the COD-based model achieves state-of-the-art performance and consistent gains from conditional alignment, supporting both the theory and the method's practical value for real-world DAR tasks.
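The summary names a Kernel Gaussian Wasserstein (KGW) discrepancy but does not define it. As rough orientation only, the sketch below computes the standard closed-form squared 2-Wasserstein distance between two finite-dimensional Gaussians from their means and covariances. It is not the paper's kernelized KGW; the function names `psd_sqrt` and `gaussian_w2_squared` are hypothetical and introduced purely for illustration.

```python
import numpy as np

def psd_sqrt(c):
    """Symmetric square root of a positive semi-definite matrix via eigendecomposition."""
    w, v = np.linalg.eigh(c)
    # Clip tiny negative eigenvalues caused by round-off before taking the root.
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.T

def gaussian_w2_squared(m1, c1, m2, c2):
    """Squared 2-Wasserstein distance between N(m1, c1) and N(m2, c2).

    Formula: ||m1 - m2||^2 + tr(c1) + tr(c2) - 2 tr((c1^{1/2} c2 c1^{1/2})^{1/2}).
    Assumption: plain finite-dimensional features, not RKHS embeddings.
    """
    r = psd_sqrt(c1)
    inner = r @ c2 @ r
    inner = 0.5 * (inner + inner.T)  # re-symmetrize against round-off
    cross = psd_sqrt(inner)
    mean_term = float(np.sum((m1 - m2) ** 2))
    cov_term = float(np.trace(c1) + np.trace(c2) - 2.0 * np.trace(cross))
    return mean_term + cov_term
```

With feature matrices `zs` and `zt` (rows are samples), the inputs could be estimated as `zs.mean(axis=0)` and `np.cov(zs, rowvar=False)`, and likewise for `zt`. The distance combines a first-order (mean) term and a second-order (covariance) term, which is the same pair of moment statistics the summary attributes to COD.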

Abstract

Aiming to generalize the label knowledge from a source domain with continuous outputs to an unlabeled target domain, Domain Adaptation Regression (DAR) is developed for complex practical learning problems. However, due to the continuity problem in regression, existing conditional distribution alignment theory and methods with discrete prior, which are proven to be effective in classification settings, are no longer applicable. In this work, focusing on the feasibility problems in DAR, we establish the sufficiency theory for the regression model, which shows the generalization error can be sufficiently dominated by the cross-domain conditional discrepancy. Further, to characterize conditional discrepancy with continuous conditioning variable, a novel Conditional Operator Discrepancy (COD) is proposed, which admits the metric property on conditional distributions via the kernel embedding theory. Finally, to minimize the discrepancy, a COD-based conditional invariant representation learning model is proposed, and the reformulation is derived to show that reasonable modifications on moment statistics can further improve the discriminability of the adaptation model. Extensive experiments on standard DAR datasets verify the validity of theoretical results and the superiority over SOTA DAR methods.

Paper Structure

This paper contains 10 sections, 5 theorems, 17 equations, 4 figures, and 4 tables.

Key Result

Theorem

For a representation variable $Z = g(X)$, suppose the inequality $d_{JS}(P_Y^s,P_Y^t) \geq d_{JS}(P_Z^s,P_Z^t)$ holds. Then, for a predictor $h: Z \to Y$, we have [displayed inequality missing from the source], where $d_{JS}$ denotes the Jensen-Shannon divergence.
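The displayed inequality is missing from the statement above. For orientation, the lower bound of Zhao et al. (2019), which the key `zhao2019on` in the theorem list appears to point to, is commonly written as below. The risk symbols $\varepsilon_s$ and $\varepsilon_t$ (source and target errors of $h \circ g$) are notation assumed here and may differ from the paper's.

$$
\varepsilon_s(h \circ g) + \varepsilon_t(h \circ g) \;\geq\; \frac{1}{2}\Big(d_{JS}(P_Y^s, P_Y^t) - d_{JS}(P_Z^s, P_Z^t)\Big)^2
$$

Read this way, when the label marginals differ more than the representation marginals, the source and target errors cannot both be small, so aligning the marginals of $Z$ alone is not sufficient.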

Figures (4)

  • Figure 1: Illustration of the conditional shift in regression setting, where label value $y\in[0,1]$ and color gradients imply the continuity. (a) Before adaptation, distribution shift exists between source and target representations so the predictor trained on source domain cannot generalize to target domain. (b) After marginal distribution alignment, distributions of source and target domain are globally aligned. However, representations with different labels may be falsely aligned across domains, leading to significant inconsistency between cross-domain labeling rules. Thus, level sets provided by the source predictor are not suitable for target representations. (c) After conditional distribution alignment, label-wise matching is achieved, where the level sets of cross-domain labeling rules are consistent and the source predictor will be reliable.
  • Figure 2: Illustration of COD metric. (a) Classification DA methods usually correct conditional shift by class-wise computations on discrete clusters $P_{X|y}$, which is infeasible for regression due to the infinite slices $P_{X|y}$. (b) In COD, the continuous conditional distributions are embedded into the RKHS and characterized by finite statistical moments in kernel spaces, which do not rely on specific conditions $y$. Under the guarantees of distribution embedding property, the conditional alignment is equivalent to the matching on first-order and second-order statistics, i.e., mean embedding operator $\mathcal{U}_{X|Y}$ and covariance operator $\mathcal{C}_{XX|Y}$. When COD is minimized, conditional alignment is achieved, and the level sets of cross-domain labeling rules are then aligned.
  • Figure 3: t-SNE visualization of learned representations. Label values are denoted by color gradients. Eight label values are selected from the range of the variable for visualization. '$+$': source samples; '$\circ$': target samples.
  • Figure 4: MAE values under different settings of hyper-parameters.
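The caption of Figure 2(b) describes matching conditional distributions through kernel-space moments rather than per-label slices. A minimal sketch of the first-order part is given below, using the standard regularized estimator of the conditional mean embedding. It is not the paper's COD (the covariance operator term is omitted), and it assumes labels, or pseudo-labels, are available on both domains. The function names and the parameters `lam` and `gamma` are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq = (np.sum(a ** 2, axis=1)[:, None]
          + np.sum(b ** 2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

def cme_weights(y_train, y_query, lam=1e-2, gamma=1.0):
    """Weights A (n x m): the embedding of P(X | y_query[j]) is sum_i A[i, j] * phi(x_i)."""
    n = y_train.shape[0]
    k_yy = rbf_kernel(y_train, y_train, gamma)
    k_yq = rbf_kernel(y_train, y_query, gamma)
    # Regularized solve (K_Y + n * lam * I)^{-1} k_Y(y) for every query label.
    return np.linalg.solve(k_yy + n * lam * np.eye(n), k_yq)

def conditional_mean_gap(xs, ys, xt, yt, y_query, lam=1e-2, gamma=1.0):
    """Mean squared RKHS distance between source and target conditional mean
    embeddings, evaluated at the labels in y_query (first-order statistics only)."""
    a_s = cme_weights(ys, y_query, lam, gamma)
    a_t = cme_weights(yt, y_query, lam, gamma)
    k_ss = rbf_kernel(xs, xs, gamma)
    k_tt = rbf_kernel(xt, xt, gamma)
    k_st = rbf_kernel(xs, xt, gamma)
    # ||mu_s - mu_t||^2 = a_s' K_ss a_s - 2 a_s' K_st a_t + a_t' K_tt a_t, per query label.
    sq = (np.einsum("ij,ik,kj->j", a_s, k_ss, a_s)
          - 2.0 * np.einsum("ij,ik,kj->j", a_s, k_st, a_t)
          + np.einsum("ij,ik,kj->j", a_t, k_tt, a_t))
    return float(np.mean(np.maximum(sq, 0.0)))
```

Because the weights come from a single linear solve over all labeled samples, no binning of the continuous label $y$ is needed. Each query label only selects a column of the weight matrix.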

Theorems & Definitions (10)

  • Remark
  • Theorem (zhao2019on)
  • Definition
  • Theorem
  • Definition (COD)
  • Proposition
  • Proposition
  • Proposition
  • Remark
  • Remark