Transfer Learning for Nonparametric Regression: Non-asymptotic Minimax Analysis and Adaptive Procedure

T. Tony Cai, Hongming Pu

TL;DR

The paper addresses transfer learning for nonparametric regression under a posterior-drift model, deriving non-asymptotic minimax risk and revealing auto-smoothing and super-acceleration phenomena. It introduces the confidence-thresholding (CT) estimator and a data-driven adaptive procedure (ACT) that achieve minimax rates up to polylog factors across a wide range of smoothness and bias settings. Theoretical results provide explicit upper and lower bounds on the risk, along with phase transitions in the bias strength that govern when transfer learning helps. Empirical results from simulations and a wine-quality application validate the approach and illustrate practical gains in leveraging source-domain data for target-domain regression.

Abstract

Transfer learning for nonparametric regression is considered. We first study the non-asymptotic minimax risk for this problem and develop a novel estimator called the confidence thresholding estimator, which is shown to achieve the minimax optimal risk up to a logarithmic factor. Our results demonstrate two unique phenomena in transfer learning: auto-smoothing and super-acceleration, which differentiate it from nonparametric regression in a traditional setting. We then propose a data-driven algorithm that adaptively achieves the minimax risk up to a logarithmic factor across a wide range of parameter spaces. Simulation studies are conducted to evaluate the numerical performance of the adaptive transfer learning algorithm, and a real-world example is provided to demonstrate the benefits of the proposed method.

Paper Structure

This paper contains 17 sections, 6 theorems, 37 equations, 3 figures, 2 algorithms.

Key Result

Lemma 1

Suppose for a function $h:[0,1]^d\rightarrow \mathbb{R}$, we have two estimates $\hat{h}_1$ and $\hat{h}_2$. Suppose for some $e_1,e_2,e_2'>0$, $||h-\hat{h}_1||_\infty \leq e_1$ and $||h+\tilde{h}-\hat{h}_2||_\infty \leq e_2\leq e_1$ where function $\tilde{h}:[0,1]^d\rightarrow \mathbb{R}$ satisfies

Figures (3)

  • Figure 1: An illustration of the confidence thresholding estimator. On the left panel, the blue dashed line is $\hat{h}_1$, the green line is $\hat{h}_2$ and two red lines are the confidence upper bound $\hat{h}_1+e_1$ and lower bound $\hat{h}_1-e_1$. On the right panel, the black line is the confidence thresholding estimator $\hat{\mu}_{\rm ct}(\hat{h}_1(x),\hat{h}_2(x), e_1)$.
  • Figure 2: MSEs of different regression methods. Blue: MSE of local polynomial regression on the target domain. Red: MSE of ACT on the target domain. Green: MSE of local polynomial regression on the source domain.
  • Figure 3: MSEs of different regression methods. Red: MSE of local polynomial regression on the target domain. Green: MSE of ACT.
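The Figure 1 caption describes the confidence thresholding estimator $\hat{\mu}_{\rm ct}(\hat{h}_1(x),\hat{h}_2(x), e_1)$ only pictorially, and this summary does not state its formula. The sketch below is one reading of that caption, offered as an assumption rather than the authors' definition: keep $\hat{h}_2(x)$ wherever it lies inside the confidence band $[\hat{h}_1(x)-e_1,\ \hat{h}_1(x)+e_1]$, and otherwise fall back to the nearest band edge. The function names `confidence_threshold` and `ct_curve` are hypothetical.

```python
def confidence_threshold(h1, h2, e1):
    """Pointwise rule, ASSUMED from the Figure 1 caption (not given in the text):
    clip the second estimate h2 to the band [h1 - e1, h1 + e1] around h1."""
    if e1 < 0:
        raise ValueError("e1 must be non-negative")
    lower, upper = h1 - e1, h1 + e1
    # Inside the band h2 is returned unchanged; outside, the nearest edge is used.
    return min(max(h2, lower), upper)


def ct_curve(h1_vals, h2_vals, e1):
    """Apply the pointwise rule along a grid of evaluation points."""
    return [confidence_threshold(a, b, e1) for a, b in zip(h1_vals, h2_vals)]


if __name__ == "__main__":
    h1_vals = [0.0, 0.5, 1.0]   # stand-in values for the first estimate
    h2_vals = [0.25, 1.5, 0.0]  # stand-in values for the second estimate
    print(ct_curve(h1_vals, h2_vals, e1=0.5))
```

Under this reading the output never strays more than $e_1$ from $\hat{h}_1$, so an $\hat{h}_2$ shifted by a large $\tilde{h}$ cannot do worse than the band around $\hat{h}_1$ allows, while an $\hat{h}_2$ that falls inside the band is used as is.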

Theorems & Definitions (11)

  • Definition 1
  • Definition 2
  • Definition 3
  • Lemma 1
  • Remark 1
  • Theorem 1: Minimax upper bound
  • Theorem 2: Minimax lower bound
  • Theorem 3: Adaptive upper bound
  • Remark 2
  • Theorem 4: Lower bound
  • ...and 1 more