On a synergistic learning phenomenon in nonparametric domain adaptation
Ling Zhou, Yuhong Yang
TL;DR
This work analyzes nonparametric regression under covariate shift with potentially unbounded likelihood ratios, deriving the minimax transfer learning rates when both source and target data are used. A key contribution is the discovery of a synergistic learning phenomenon (SLP): under certain density-exponent regimes, and when the target sample is smaller in order than the source sample but not too much smaller, combining the data yields a rate faster than that achievable with either dataset alone. The results cover known and unknown density parameters, extend to density singularities with transfer-singularity points, and include adaptive procedures for unknown smoothness $\beta$, supported by simulations that validate the theoretical rates. The findings show how density inhomogeneity governs when and how source and target data can synergistically improve learning in domain adaptation for regression. The analysis assumes Hölder-smooth regression functions and Gaussian noise, measures error by density-weighted risks $R_{h_T}$, and uses optimally tuned estimators (Nadaraya-Watson and local polynomial) with a spread function $t_n(x)$ governing local bandwidths.
Abstract
Consider nonparametric domain adaptation for regression, which assumes the same conditional distribution of the response given the covariates but different marginal distributions of the covariates. An important goal is to understand how the source data may improve the minimax convergence rate of learning the regression function when the likelihood ratio of the covariate marginal distributions of the target data and the source data is unbounded. Previous work by Pathak et al. (2022) shows that the minimax transfer learning rate is simply determined by the faster rate of using either the source or the target data alone. In this paper, we present a new synergistic learning phenomenon (SLP): the minimax convergence rate based on both data sets may sometimes be faster (even much faster) than the better rate of convergence based on the source or target data only. The SLP occurs when and only when the target sample size is smaller (in order) than, but not too much smaller than, the source sample size, in relation to the smoothness of the regression function and the nature of the covariate densities of the source and target distributions. Interestingly, the SLP happens in two different ways according to the relationship between the two sample sizes. In one, the target data help alleviate the difficulty of estimating the regression function at points where the density of the source data is close to zero; in the other, the source data (with their larger sample size) help the estimation at points where the density of the source data is not small. Extensions to handle unknown source and target parameters and unknown smoothness of the regression function are also obtained.
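
The setting described above can be illustrated with a minimal simulation sketch. Everything in it is an assumption chosen for illustration and is not taken from the paper: the source density $(\alpha+1)x^{\alpha}$ on $[0,1]$ (which vanishes at 0, so the target-to-source likelihood ratio is unbounded), the uniform target density, the regression function $\sin(2\pi x)$, the box kernel, and the single global bandwidth. The paper's estimators use locally tuned bandwidths, which this sketch does not implement, so it only shows how source-only, target-only, and pooled Nadaraya-Watson fits can be compared under a target-weighted squared error; it does not reproduce the minimax rates.

```python
import numpy as np


def nw_estimate(x_train, y_train, x_eval, bandwidth):
    """Nadaraya-Watson estimate with a box kernel.

    Returns NaN at evaluation points whose window contains no training point.
    """
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_eval = np.asarray(x_eval, dtype=float)
    # weights[i, j] = 1 if training point j lies within `bandwidth` of x_eval[i]
    weights = (np.abs(x_eval[:, None] - x_train[None, :]) <= bandwidth).astype(float)
    counts = weights.sum(axis=1)
    sums = weights @ y_train
    out = np.full(x_eval.shape, np.nan)
    np.divide(sums, counts, out=out, where=counts > 0)
    return out


def simulate(n_source, n_target, alpha=2.0, noise_sd=0.5, seed=0):
    """Draw source and target samples sharing one regression function (illustrative choices)."""
    rng = np.random.default_rng(seed)
    f = lambda x: np.sin(2 * np.pi * x)  # hypothetical regression function
    # Beta(alpha + 1, 1) has density (alpha + 1) * x**alpha on [0, 1]: it vanishes at 0.
    x_s = rng.beta(alpha + 1.0, 1.0, size=n_source)
    x_t = rng.uniform(0.0, 1.0, size=n_target)  # uniform target density
    y_s = f(x_s) + noise_sd * rng.normal(size=n_source)
    y_t = f(x_t) + noise_sd * rng.normal(size=n_target)
    return (x_s, y_s), (x_t, y_t), f


def target_risk(x_train, y_train, f, bandwidth, grid):
    """Mean squared error over a uniform grid, approximating the uniform-target risk."""
    est = nw_estimate(x_train, y_train, grid, bandwidth)
    est = np.where(np.isnan(est), 0.0, est)  # predict 0 where the window is empty
    return float(np.mean((est - f(grid)) ** 2))


if __name__ == "__main__":
    (x_s, y_s), (x_t, y_t), f = simulate(n_source=5000, n_target=500)
    grid = np.linspace(0.0, 1.0, 201)
    h = 0.05  # one global bandwidth, not tuned
    x_pool = np.concatenate([x_s, x_t])
    y_pool = np.concatenate([y_s, y_t])
    print("source only:", target_risk(x_s, y_s, f, h, grid))
    print("target only:", target_risk(x_t, y_t, f, h, grid))
    print("pooled     :", target_risk(x_pool, y_pool, f, h, grid))
```

Pooling here simply concatenates the two samples, which is valid because the conditional distribution of the response given the covariates is assumed to be the same in both domains. Varying `n_source`, `n_target`, and `alpha` gives an informal feel for where each sample contributes, but any such comparison depends on the bandwidth choice and should not be read as evidence for or against the theoretical rates.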
