Model-Robust and Adaptive-Optimal Transfer Learning for Tackling Concept Shifts in Nonparametric Regression
Haotian Lin, Matthew Reimherr
TL;DR
This work tackles nonparametric regression under concept shifts by introducing a robust hypothesis transfer learning framework built on spectral algorithms with fixed-bandwidth Gaussian kernels. It establishes that such kernels attain minimax-optimal convergence rates when the target function lies in a Sobolev space, and that the excess risk converges at adaptive rates, up to logarithmic factors, even under misspecification. Building on this, the authors develop RAHTL, a transfer-learning scheme that combines pre-training on source data with fine-tuning on the target domain, achieving minimax optimality up to logarithmic factors and revealing a phase transition governed by the transfer-signal ratio $\xi$. Theoretical lower and upper bounds decompose the error into pre-training and fine-tuning contributions, and numerical experiments corroborate the theory, demonstrating transfer gains and the critical influence of the similarity between source and target functions. Overall, the paper provides robust, adaptive transfer-learning guarantees for concept shifts in nonparametric regression with RKHS-based methods, with practical implications for settings where labeled target data are scarce.
Abstract
When concept shifts and sample scarcity are present in the target domain of interest, nonparametric regression learners often struggle to generalize effectively. Transfer learning remedies these issues by leveraging data or pre-trained models from similar source domains. While existing generalization analyses of kernel-based transfer learning typically rely on correctly specified models, we present a transfer learning procedure that is robust against model misspecification while adaptively attaining optimality. To facilitate our analysis and avoid the risk of saturation found in classical misspecified results, we establish a novel result in the misspecified single-task learning setting: spectral algorithms with fixed-bandwidth Gaussian kernels can attain minimax convergence rates provided the true function lies in a Sobolev space. This result may be of independent interest. Building on it, we derive adaptive convergence rates of the excess risk when Gaussian kernels are specified in a prevalent class of hypothesis transfer learning algorithms. Our results are minimax optimal up to logarithmic factors and elucidate the key determinants of transfer efficiency.
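To make the two-stage idea concrete, here is a minimal sketch of one common form of hypothesis transfer learning, sometimes called offset transfer: fit a model on the source sample, then fit a correction to the target residuals. It is not the authors' RAHTL implementation. It assumes kernel ridge regression as the spectral algorithm (Tikhonov regularization is one member of that family), and the function names `gaussian_kernel`, `krr_fit`, and `offset_transfer`, the bandwidth, the regularization values, and the synthetic source and target functions are all illustrative choices rather than anything taken from the paper.

```python
import numpy as np


def gaussian_kernel(A, B, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))


def krr_fit(X, y, lam, bandwidth=1.0):
    """Kernel ridge regression with a fixed-bandwidth Gaussian kernel.

    Returns a function that predicts at new inputs.
    """
    n = X.shape[0]
    K = gaussian_kernel(X, X, bandwidth)
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y)
    return lambda Z: gaussian_kernel(Z, X, bandwidth) @ alpha


def offset_transfer(Xs, ys, Xt, yt, lam_source, lam_target, bandwidth=1.0):
    """Two-stage transfer: pre-train on source, fine-tune on target residuals."""
    f_source = krr_fit(Xs, ys, lam_source, bandwidth)      # pre-training
    residual = yt - f_source(Xt)                           # what the source model misses
    f_offset = krr_fit(Xt, residual, lam_target, bandwidth)  # fine-tuning
    return lambda Z: f_source(Z) + f_offset(Z)


# Synthetic illustration (assumed setup): a large source sample, a small
# target sample, and a target function that differs from the source
# function by a smooth offset.
rng = np.random.default_rng(0)
f_src = lambda x: np.sin(2.0 * np.pi * x[:, 0])
f_tgt = lambda x: f_src(x) + 0.3 * x[:, 0]

Xs = rng.uniform(0.0, 1.0, size=(400, 1))
ys = f_src(Xs) + 0.1 * rng.standard_normal(400)
Xt = rng.uniform(0.0, 1.0, size=(20, 1))
yt = f_tgt(Xt) + 0.1 * rng.standard_normal(20)
X_test = np.linspace(0.0, 1.0, 200)[:, None]

transfer = offset_transfer(Xs, ys, Xt, yt, 1e-3, 1e-2, bandwidth=0.2)
target_only = krr_fit(Xt, yt, 1e-2, bandwidth=0.2)

mse_transfer = np.mean((transfer(X_test) - f_tgt(X_test)) ** 2)
mse_target_only = np.mean((target_only(X_test) - f_tgt(X_test)) ** 2)
print("transfer MSE:   ", mse_transfer)
print("target-only MSE:", mse_target_only)
```

In this sketch the fine-tuning stage only has to learn the difference between the source and target functions. When that difference is simpler than the target function itself, a small target sample can suffice, which is the intuition behind the dependence of transfer efficiency on source-target similarity.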
