Unified Transfer Learning Models in High-Dimensional Linear Regression

Shuo Shuo Liu

TL;DR

This work introduces a unified transfer learning framework (UTrans) for high-dimensional linear regression that leverages multiple source datasets while explicitly identifying transferable variables and sources. When the transferring set $\mathcal{A}$ is known, the $\mathcal{A}$-UTrans model constructs a penalized loss with a flexible penalty P_ppa and exposes contrasts eta_k - eta_0, yielding tighter nonasymptotic estimation and prediction error bounds than target-only methods. The source-detection extension (UTrans) uses a high-dimensional U-test to decide which sources and variables are transferable, so that nontransferable data are excluded automatically with asymptotic control of error. Empirical results on simulations and a US intergenerational mobility dataset show substantially lower estimation and prediction errors than existing methods, along with improved interpretability of the learned coefficients and transfer structure.
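To make the idea of a single penalized fit with per-source contrasts concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the paper's implementation: the stacked parameterization (target coefficients followed by one contrast block per source), the plain Lasso penalty, the coordinate-descent solver, and every name (`stack_design`, `lasso_cd`, `simulate`, the toy coefficients) are hypothetical choices made for this example.

```python
import random

def stack_design(target, sources):
    """Stack all rows into one design with columns [beta_0 | delta_1 | ... | delta_K].

    Assumed parameterization: source k has coefficients beta_0 + delta_k,
    so a source-k row repeats its covariates in block 0 and in block k.
    """
    X0, y0 = target
    p, K = len(X0[0]), len(sources)
    rows, ys = [], []
    for k, (Xk, yk) in enumerate(sources):
        for x, y in zip(Xk, yk):
            row = list(x) + [0.0] * (K * p)
            row[(k + 1) * p:(k + 2) * p] = list(x)
            rows.append(row)
            ys.append(y)
    for x, y in zip(X0, y0):
        rows.append(list(x) + [0.0] * (K * p))  # target rows touch only beta_0
        ys.append(y)
    return rows, ys

def soft_threshold(z, t):
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=100):
    """Minimise (1/2n)||y - Xb||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, d = len(X), len(X[0])
    b = [0.0] * d
    resid = list(y)  # residual y - Xb, kept up to date after every update
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    for _ in range(n_iter):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * b[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            if new != b[j]:
                diff = new - b[j]
                for i in range(n):
                    resid[i] -= X[i][j] * diff
                b[j] = new
    return b

def simulate(n, beta, rng, noise=0.1):
    """Draw n Gaussian rows and responses from a linear model (toy data)."""
    X = [[rng.gauss(0, 1) for _ in beta] for _ in range(n)]
    y = [sum(a * c for a, c in zip(x, beta)) + rng.gauss(0, noise) for x in X]
    return X, y

rng = random.Random(0)
p = 8
beta0 = [2.0, -1.5, 1.0] + [0.0] * (p - 3)  # hypothetical target coefficients
beta1 = list(beta0)
beta1[3] = 0.5                              # source 1 differs in coordinate 3 only
beta2 = list(beta0)                         # source 2 matches the target exactly
target = simulate(15, beta0, rng)           # scarce target sample
sources = [simulate(100, beta1, rng), simulate(100, beta2, rng)]

X, y = stack_design(target, sources)
theta = lasso_cd(X, y, lam=0.02)
beta0_hat = theta[:p]
delta_hat = [theta[(k + 1) * p:(k + 2) * p] for k in range(len(sources))]
```

In this toy setup a nonzero entry in `delta_hat[k]` flags a variable on which source `k` departs from the target, which is the sense in which a stacked fit of this kind can be read as identifying transferable variables. The paper's penalty and its treatment of the contrasts may differ from this Lasso-only sketch.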

Abstract

Transfer learning plays a key role in modern data analysis when (1) the target data are scarce but the source data are sufficient, and (2) the distributions of the source and target data are heterogeneous. This paper develops an interpretable unified transfer learning model, termed UTrans, which can detect both transferable variables and transferable source data. More specifically, we establish estimation error bounds and prove that they are lower than the bounds obtained with target data alone. In addition, we propose a source detection algorithm based on hypothesis testing to exclude nontransferable data. We evaluate UTrans against existing algorithms in multiple experiments, which show that it attains much lower estimation and prediction errors than existing methods while preserving interpretability. Finally, we apply it to US intergenerational mobility data and compare the proposed algorithms with classical machine learning algorithms.

Paper Structure (13 sections, 6 theorems, 61 equations, 6 figures, 1 table, 1 algorithm)

Introduction
High-dimensional transfer learning models
Unified Transfer Learning Models
$\mathcal{A}$-UTrans: transfer learning with known $\mathcal{A}$
Theoretical properties of $\mathcal{A}$-UTrans
UTrans: Transfer Learning with Source Detection
Experiments
Simulation with known $\mathcal{A}$
Simulation with source detection
Intergenerational Mobility Data
Data description
Predictive analysis
Broader Impact

Key Result

Theorem 1

Let $\widehat{\boldsymbol{\Sigma}}=\mathbf{X}^\top\mathbf{X}/n$ be the sample covariance matrix of $\mathbf{X}$. With the RSC conditions on each $\widehat{\boldsymbol{\Sigma}}_k$, we have for $\widehat{\boldsymbol{\Delta}}=\widehat{\boldsymbol{\beta}}-\boldsymbol{\beta} \in \mathbb{R}^{p^*}$, where $v'=\min_k v_k$
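As a small illustration of the object the theorem starts from, the sketch below computes the uncentered sample covariance $\mathbf{X}^\top\mathbf{X}/n$ and checks that its quadratic form in a direction $\boldsymbol{\Delta}$ equals $\|\mathbf{X}\boldsymbol{\Delta}\|_2^2/n$, the quantity that restricted-strong-convexity-type conditions are generally used to bound from below. The dimensions, the random data, and the direction `delta` are hypothetical, and nothing here reproduces the theorem's bound.

```python
import random

def sample_covariance(X):
    """Return X^T X / n for an n-by-p matrix given as a list of rows."""
    n, p = len(X), len(X[0])
    return [[sum(X[i][a] * X[i][b] for i in range(n)) / n for b in range(p)]
            for a in range(p)]

def quadratic_form(S, d):
    """Return d^T S d for a square matrix S and a vector d."""
    m = len(d)
    return sum(d[a] * S[a][b] * d[b] for a in range(m) for b in range(m))

rng = random.Random(1)
n, p = 5, 8  # toy high-dimensional case: more columns than rows
X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
Sigma_hat = sample_covariance(X)

delta = [1.0, -1.0] + [0.0] * (p - 2)  # hypothetical sparse error direction
lhs = quadratic_form(Sigma_hat, delta)                                       # Delta^T Sigma_hat Delta
rhs = sum(sum(x * d for x, d in zip(row, delta)) ** 2 for row in X) / n     # ||X Delta||^2 / n
```

Because $p > n$ here, the matrix `Sigma_hat` is rank deficient, so no lower bound on the quadratic form can hold over all directions; this is why conditions of this kind restrict attention to a structured set of directions such as sparse ones.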

Figures (6)

Figure 1: Mean squared prediction errors (MSPEs) of the proposed unified model and the existing transfer learning models with different settings of $p$ (row A), $n_0$ (row B), and $h$ (row C) for each $k=1,\ldots,K$. Shaded areas show $\text{MSPE}\pm 0.1\times \text{standard deviation (SD)}$.
Figure 2: The averaged $\ell_2$ estimation errors of naive-Lasso, $\mathcal{A}$-Trans-GLM, $\mathcal{A}$-UTrans-Lasso, and $\mathcal{A}$-UTrans-SCAD with different settings. Shaded areas show $\text{estimate}\pm \text{SD}$.
Figure 3: The averaged $\ell_2$ estimation errors of naive-Lasso, Trans-GLM, and UTrans with different settings. Shaded areas show $\text{estimate}\pm \text{SD}$.
Figure 4: Mean squared prediction errors of the proposed unified model and the existing transfer learning models with different settings of $n_0$ and $h$. Shaded areas show $\text{MSPE}\pm 0.1\times \text{SD}$.
Figure 5: Mean squared prediction errors (MSPEs) of the proposed unified model and the existing transfer learning models with the compound symmetry (row A), $t$-distribution (row B), and Gaussian mixture model (row C) for each $k=1,\ldots,K$. Shaded areas show $\text{MSPE}\pm 0.1\times \text{SD}$.
...and 1 more figure

Theorems & Definitions (6)

Theorem 1
Theorem 2: $\ell_1/\ell_2$ estimation error bounds of $\mathcal{A}$-UTrans
Theorem 3: Prediction error bound of $\mathcal{A}$-UTrans
Theorem 4
Lemma 5: Proposition 5.16 of Vershynin (2010)
Lemma 6: Lemmas 4(b) and 5 of Loh and Wainwright (2015)
