Transfer learning via Regularized Linear Discriminant Analysis

Hongzhe Zhang; Arnab Auddy; Hongzhe Lee

TL;DR

This work addresses high-dimensional two-class discrimination with scarce target data by introducing transfer-learning regularized discriminant analysis (TL-RDA). It combines ridge discriminants from target and source populations via learned weights, with asymptotic guarantees derived under random-matrix theory in the proportional regime $p/n\to\gamma$. The authors characterize limiting classification errors, provide closed-form optimal weights for estimation and prediction risks, and offer geometric intuition and robustness results. Empirical assessment, including proteomics-based cardiovascular risk prediction, shows TL-RDA and its pooled variant often outperform target-only and pool-only approaches, demonstrating practical value for leveraging related datasets in high-dimensional biomedical problems.

Abstract

Linear discriminant analysis is a widely used method for classification. However, the high dimensionality of predictors combined with small sample sizes often results in large classification errors. To address this challenge, it is crucial to leverage data from related source models to enhance the classification performance of a target model. We propose to address this problem in the framework of transfer learning. In this paper, we present novel transfer learning methods via regularized random-effects linear discriminant analysis, where the discriminant direction is estimated as a weighted combination of ridge estimates obtained from both the target and source models. Multiple strategies for determining these weights are introduced and evaluated, including one that minimizes the estimation risk of the discriminant vector and another that minimizes the classification error. Utilizing results from random matrix theory, we explicitly derive the asymptotic values of these weights and the associated classification error rates in the high-dimensional setting, where $p/n \rightarrow \gamma$, with $p$ representing the predictor dimension and $n$ the sample size. We also provide geometric interpretations of the various weights and guidance on which weights to choose. Extensive numerical studies, including simulations and analysis of proteomics-based 10-year cardiovascular disease risk classification, demonstrate the effectiveness of the proposed approach.
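The weighted combination of ridge estimates described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`ridge_discriminant`, `combine_directions`, `classify`), the synthetic data, the ridge penalty `lam`, and the fixed weights `[0.5, 0.5]` are all assumptions made for illustration. In particular, the paper derives its weights by minimizing estimation or prediction risk, whereas here they are simply hard-coded.

```python
import numpy as np

def ridge_discriminant(X0, X1, lam):
    """Ridge-regularized LDA direction (S + lam*I)^{-1} (mean1 - mean0).

    X0, X1: arrays of shape (n0, p) and (n1, p) holding the two classes.
    lam: ridge penalty (an illustrative tuning parameter, must be > 0).
    """
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    centered = np.vstack([X0 - mu0, X1 - mu1])
    # Within-class sample covariance pooled over both classes.
    S = centered.T @ centered / (centered.shape[0] - 2)
    return np.linalg.solve(S + lam * np.eye(S.shape[0]), mu1 - mu0)

def combine_directions(directions, weights):
    """Weighted sum of per-population discriminant directions."""
    D = np.stack(directions, axis=1)       # shape (p, K)
    return D @ np.asarray(weights, float)  # shape (p,)

def classify(X, direction, midpoint):
    """Assign class 1 when the centered projection is positive, else class 0."""
    return ((X - midpoint) @ direction > 0).astype(int)

# Synthetic example: a small target sample and a larger, related source sample.
rng = np.random.default_rng(0)
p = 50
delta = rng.normal(scale=0.3, size=p)                    # target mean difference
delta_src = delta + rng.normal(scale=0.1, size=p)        # perturbed source difference

Xt0, Xt1 = rng.normal(size=(20, p)), rng.normal(size=(20, p)) + delta
Xs0, Xs1 = rng.normal(size=(200, p)), rng.normal(size=(200, p)) + delta_src

lam = 1.0
d_target = ridge_discriminant(Xt0, Xt1, lam)
d_source = ridge_discriminant(Xs0, Xs1, lam)

# Hypothetical fixed weights; the paper instead selects them by risk minimization.
d_combined = combine_directions([d_target, d_source], [0.5, 0.5])

# Threshold at the midpoint of the target class means.
midpoint = (Xt0.mean(axis=0) + Xt1.mean(axis=0)) / 2
X_new = rng.normal(size=(10, p)) + delta
labels = classify(X_new, d_combined, midpoint)
```

Setting the weights to `[1.0, 0.0]` recovers the target-only ridge discriminant, so the combined rule nests the no-transfer baseline as a special case.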

Paper Structure (26 sections, 27 theorems, 207 equations, 7 figures, 1 table)

Introduction
Transfer Learning via Regularized Discriminant Analysis
Random-effects LDA and the Assumptions on Population Parameters
Classification Weights
Random Matrix Assumption and Related Results
Asymptotic Analysis of Weights and Classification Errors
Classification Error
Minimum Estimation Risk Weight
Minimum Prediction Risk Weight
Geometric Interpretation
Robustness and Weight Selection
Robustness of Optimal Estimation Weight
Pooled Sample Covariance and Individual Sample Covariance Matrix
Heterogeneous Population Covariance Matrix
Proteomics-based Prediction of 10-year Cardiovascular Disease Risk
...and 11 more sections

Key Result

Theorem 4.1

Suppose that assumptions as:TCG-as:CCW as well as as:moment and as:aniso hold. Then for a fixed $K\ge 2$, as $n_k, p \rightarrow \infty$ with $p / n_k \rightarrow \gamma_k\in (0, \infty]$ for $1\le k\le K$, the limiting form of $Err(\mathbf{w})$ for a given weight vector $\mathbf{w}\in \mathbb{R}^K$ is ... where the elements of $\mathbf{u}\in \mathbb{R}^K$ and ${\cal A}\in \mathbb{R}^{K\times K}$ are as ...

Figures (7)

Figure 1: Geometric interpretations of optimal weights
Figure 2: TL-RDA with the optimal estimation weight outperforms regular RDA and TL-RDA with the optimal prediction weight when the target distribution changes.
Figure 3: TLP-RDA outperforms TL-RDA when $\gamma$ is large under general setups.
Figure 4: Individual sample covariance matrices
Figure 5: Pooled sample covariance matrices
...and 2 more figures

Theorems & Definitions (48)

Theorem 4.1: Asymptotic Classification Error for TL-RDA
Remark 4.1
Corollary 4.2: Asymptotic Classification Error for TLP-RDA
Theorem 4.3: Asymptotic Estimation Error Minimization for TL-RDA
Corollary 4.4: Estimation Error for Homogeneous Sources
Remark 4.2: Estimating optimal weights
Proposition 4.5
Corollary 4.6: Asymptotic Estimation Error Minimization for TLP-RDA
Theorem 4.7: Asymptotic Prediction Error Minimization for TL-RDA
Proposition 4.8
...and 38 more
