Transfer Learning for Moderate-Dimensional Ridge-Regularized Robust Linear Regression

Lingfeng Lyu, Xiao Guo, Zongqi Liu

Abstract

This paper studies transfer learning for ridge-regularized robust linear regression in the moderate-dimensional regime, where the number of predictors is of the same order as the sample size and the regression coefficients are not assumed to be sparse. We propose Trans-RR, which combines a robust ridge estimator from a source study with a robust ridge correction based on the target study. Under mild assumptions, we characterize the asymptotic estimation error of the proposed estimator and show that leveraging source data can substantially improve estimation accuracy relative to the traditional single-study ridge-regularized robust estimator. Simulation results and a real-data analysis support the theory and illustrate both positive and negative transfer as the discrepancy between the source and target studies varies.
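The two-stage idea in the abstract can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a Huber loss, assumes each stage minimizes the average loss plus $(\tau/2)\|\cdot\|^2$, and solves each stage by iteratively reweighted least squares. The names `huber_ridge`, `trans_rr`, `tau_src`, `tau_tgt`, and all simulation settings are hypothetical.

```python
import numpy as np


def huber_ridge(X, y, tau, c=1.345, n_iter=500, tol=1e-10):
    """Minimise (1/n) * sum_i huber_c(y_i - x_i @ b) + (tau / 2) * ||b||^2 via IRLS."""
    n, p = X.shape
    b = np.zeros(p)
    for _ in range(n_iter):
        r = y - X @ b
        # IRLS weights psi(r) / r for the Huber loss: 1 if |r| <= c, else c / |r|.
        w = np.minimum(1.0, c / np.maximum(np.abs(r), 1e-12))
        Xw = X.T * w  # p x n matrix whose columns are w_i * x_i
        b_new = np.linalg.solve(Xw @ X / n + tau * np.eye(p), Xw @ y / n)
        converged = np.linalg.norm(b_new - b) < tol
        b = b_new
        if converged:
            break
    return b


def trans_rr(X_src, y_src, X_tgt, y_tgt, tau_src, tau_tgt, c=1.345):
    """Two-stage sketch: source robust ridge fit, then a robust ridge correction on the target."""
    w_hat = huber_ridge(X_src, y_src, tau_src, c)
    # Stage 2: fit the target residuals, shrinking the correction towards zero.
    delta_hat = huber_ridge(X_tgt, y_tgt - X_tgt @ w_hat, tau_tgt, c)
    return w_hat + delta_hat, w_hat


# Assumed toy setting: dense coefficients, Gaussian design, heavy-tailed noise.
rng = np.random.default_rng(0)
p, n_src, n_tgt = 50, 400, 100
beta0 = rng.standard_normal(p) / np.sqrt(p)              # dense target coefficients
w0 = beta0 + 0.1 * rng.standard_normal(p) / np.sqrt(p)   # source coefficients, small shift
X_src = rng.standard_normal((n_src, p))
X_tgt = rng.standard_normal((n_tgt, p))
y_src = X_src @ w0 + rng.standard_t(df=2, size=n_src)
y_tgt = X_tgt @ beta0 + rng.standard_t(df=2, size=n_tgt)

beta_trans, _ = trans_rr(X_src, y_src, X_tgt, y_tgt, tau_src=0.1, tau_tgt=0.5)
beta_single = huber_ridge(X_tgt, y_tgt, tau=0.5)
print("transfer error:    ", np.linalg.norm(beta_trans - beta0))
print("single-study error:", np.linalg.norm(beta_single - beta0))
```

The ridge penalty in the second stage shrinks the correction towards zero, so the final estimate stays close to the source fit unless the target data pull it away. Whether this helps depends on how far the source coefficients are from the target ones.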

Paper Structure

This paper contains 13 sections, 3 theorems, 19 equations, 3 figures, 2 tables, 1 algorithm.

Key Result

Theorem 1

Under the target-study and source-study assumptions (labels ass:target and ass:source), and conditional on the source-stage estimator $\widehat{\boldsymbol{w}}$, which is independent of the target sample, we have $\|\widehat{\boldsymbol{\beta}} - \boldsymbol{\beta}_0\| \to r_\rho(\kappa)$ in probability, where $r_\rho(\kappa)$ is deter…

Figures (3)

  • Figure 1: Boxplots of $\| \widehat{\boldsymbol{\beta}} - \boldsymbol{\beta}_0\|^2$ over $1000$ simulations. The red point in each boxplot marks the theoretical value $r_\rho^2$ from Theorem 1. Panels from top to bottom correspond to $\kappa = 1, 4$, and panels from left to right correspond to cases $\mathrm{I}, \mathrm{II}, \mathrm{III}$.
  • Figure 2: Theoretical curves of $r_\rho$ as a function of $\|\boldsymbol{\beta}_0-\widehat{\boldsymbol{w}}\|$ for five values of $\tau$ under cases $\mathrm{I}$--$\mathrm{III}$, obtained by numerically solving the system in Corollary 1. The three panels correspond to cases $\mathrm{I}$, $\mathrm{II}$, and $\mathrm{III}$, respectively.
  • Figure 3: Boxplots of relative estimation errors (log scale) across $1000$ replications for varying $\|\boldsymbol{\beta}_0 - \boldsymbol{w}_0\|$ under cases $\mathrm{I}$--$\mathrm{III}$, with $p=400$ and $n=400$. Case $\mathrm{I}$ includes all five methods, while cases $\mathrm{II}$ and $\mathrm{III}$ include only the three ridge-type procedures. Panels from top to bottom are for cases $\mathrm{I}, \mathrm{II}, \mathrm{III}$, respectively.
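A small Monte Carlo sweep in the spirit of the Figure 3 experiment can be sketched as follows. This is only an illustration under assumptions: the Huber loss, the IRLS solver, the $t$-distributed noise, the tuning value `tau`, the reduced problem size, and the definition of relative error as the ratio of the transfer error to the single-study error are all choices made here, not details taken from the paper. The helper names `huber_ridge` and `relative_errors` are hypothetical.

```python
import numpy as np


def huber_ridge(X, y, tau, c=1.345, n_iter=100):
    """IRLS for (1/n) * sum_i huber_c(y_i - x_i @ b) + (tau / 2) * ||b||^2."""
    n, p = X.shape
    b = np.zeros(p)
    for _ in range(n_iter):
        r = y - X @ b
        w = np.minimum(1.0, c / np.maximum(np.abs(r), 1e-12))  # Huber weights psi(r)/r
        Xw = X.T * w
        b = np.linalg.solve(Xw @ X / n + tau * np.eye(p), Xw @ y / n)
    return b


def relative_errors(gaps, p=40, n=40, n_src=200, tau=0.5, reps=20, seed=0):
    """Median of ||beta_transfer - beta0|| / ||beta_single - beta0|| for each gap ||beta0 - w0||."""
    rng = np.random.default_rng(seed)
    out = {}
    for gap in gaps:
        ratios = []
        for _ in range(reps):
            beta0 = rng.standard_normal(p)
            beta0 /= np.linalg.norm(beta0)
            shift = rng.standard_normal(p)
            w0 = beta0 + gap * shift / np.linalg.norm(shift)  # so ||beta0 - w0|| == gap
            Xs = rng.standard_normal((n_src, p))
            Xt = rng.standard_normal((n, p))
            ys = Xs @ w0 + rng.standard_t(df=3, size=n_src)
            yt = Xt @ beta0 + rng.standard_t(df=3, size=n)
            w_hat = huber_ridge(Xs, ys, tau)                         # source stage
            beta_tr = w_hat + huber_ridge(Xt, yt - Xt @ w_hat, tau)  # target correction
            beta_single = huber_ridge(Xt, yt, tau)                   # single-study baseline
            ratios.append(np.linalg.norm(beta_tr - beta0) / np.linalg.norm(beta_single - beta0))
        out[gap] = float(np.median(ratios))
    return out


for gap, ratio in relative_errors([0.0, 1.0, 4.0]).items():
    print(f"||beta0 - w0|| = {gap:.1f}: median relative error = {ratio:.3f}")
```

Under this definition, a ratio below 1 indicates positive transfer and a ratio above 1 indicates negative transfer. The sketch uses $p = n = 40$ to keep the run short, whereas the figure reports $p = 400$ and $n = 400$.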

Theorems & Definitions (4)

  • Remark 1
  • Theorem 1
  • Corollary 1
  • Corollary 2 (El Karoui, 2018)