Smoothness Adaptive Hypothesis Transfer Learning
Haotian Lin, Matthew Reimherr
TL;DR
The paper tackles the challenge of adapting to unknown smoothness in a two-phase transfer learning setting for nonparametric regression. It introduces Smoothness Adaptive Transfer Learning (SATL), which employs Gaussian kernels in both TL phases to adapt to the Sobolev smoothness of the target, source, and their offset. Theoretical results establish minimax lower bounds and show that SATL achieves matching upper bounds up to logarithmic factors, with the excess risk decomposed into a source term and an offset term governed by a similarity metric $\xi(h,f_S)$. Empirical experiments corroborate the theory, demonstrating adaptive performance and superiority over non-transfer learning and finite-basis TL methods. Overall, SATL provides a principled, adaptive approach to transfer learning in infinite-dimensional settings, with clear implications for how domain similarity and sample sizes influence transfer dynamics.
Abstract
Many existing two-phase kernel-based hypothesis transfer learning algorithms employ the same kernel regularization across phases and rely on known smoothness of the functions to obtain optimality. Therefore, they fail in practice to adapt to the varying and unknown smoothness of the target/source and their offset. In this paper, we address these problems by proposing Smoothness Adaptive Transfer Learning (SATL), a two-phase kernel ridge regression (KRR)-based algorithm. We first prove that employing a misspecified fixed-bandwidth Gaussian kernel in target-only KRR learning can achieve minimax optimality, and we derive a procedure that adapts to the unknown Sobolev smoothness. Leveraging these results, SATL employs Gaussian kernels in both phases so that the estimators can adapt to the unknown smoothness of the target/source and their offset function. We derive the minimax lower bound of the learning problem in excess risk and show that SATL enjoys a matching upper bound up to a logarithmic factor. The minimax convergence rate sheds light on the factors influencing transfer dynamics and demonstrates the superiority of SATL compared to non-transfer learning settings. While our main objective is a theoretical analysis, we also conduct several experiments to confirm our results.
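
To make the two-phase structure concrete, the following is a minimal sketch, assuming the usual offset form of hypothesis transfer: a Gaussian-kernel KRR fit on the source data, followed by a second Gaussian-kernel KRR fit of the target residuals, with the final estimate being the sum of the two. The function names, the shared `bandwidth`, and the regularization values `lam_s` and `lam_t` are hypothetical placeholders chosen for illustration. The adaptive selection of the regularization parameters described in the abstract is not implemented here.

```python
import numpy as np

def gaussian_kernel(A, B, bandwidth=1.0):
    # Gaussian kernel matrix with entries exp(-||a - b||^2 / (2 * bandwidth^2)).
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def krr_fit(X, y, lam, bandwidth=1.0):
    # Kernel ridge regression: solve (K + n * lam * I) alpha = y.
    n = X.shape[0]
    K = gaussian_kernel(X, X, bandwidth)
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y)
    return lambda Xnew: gaussian_kernel(Xnew, X, bandwidth) @ alpha

def two_phase_transfer(Xs, ys, Xt, yt, lam_s, lam_t, bandwidth=1.0):
    # Phase 1: estimate the source function from the source sample.
    f_source = krr_fit(Xs, ys, lam_s, bandwidth)
    # Phase 2: estimate the offset from target residuals (assumed offset form).
    residual = yt - f_source(Xt)
    f_offset = krr_fit(Xt, residual, lam_t, bandwidth)
    # Final target estimate: source fit plus estimated offset.
    return lambda Xnew: f_source(Xnew) + f_offset(Xnew)
```

In this sketch the bandwidth is held fixed and only the regularization parameters vary, which mirrors the fixed-bandwidth setting mentioned in the abstract; how those parameters should scale with the sample sizes is a matter for the paper's analysis and is left unspecified here.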
