Transfer Learning with Large-Scale Quantile Regression

Jun Jin; Jun Yan; Robert H. Aseltine; Kun Chen

TL;DR

This work develops transfer learning methods for high-dimensional quantile regression: it detects informative sources whose models are similar to the target and uses them to improve the target model. An application to hard-landing risk detection shows the benefits and insights gained from transfer learning across three airplane types.

Abstract

Quantile regression is increasingly encountered in modern big data applications due to its robustness and flexibility. We consider the scenario of learning the conditional quantiles of a specific target population when the available data may go beyond the target and be supplemented from other sources that possibly share similarities with the target. A crucial question is how to properly distinguish and utilize useful information from other sources to improve the quantile estimation and inference at the target. We develop transfer learning methods for high-dimensional quantile regression by detecting informative sources whose models are similar to the target and utilizing them to improve the target model. We show that under reasonable conditions, the detection of the informative sources based on sample splitting is consistent. Compared to the naive estimator with only the target data, the transfer learning estimator achieves a much lower error rate as a function of the sample sizes, the signal-to-noise ratios, and the similarity measures among the target and the source models. Extensive simulation studies demonstrate the superiority of our proposed approach. We apply our methods to tackle the problem of detecting hard-landing risk for flight safety and show the benefits and insights gained from transfer learning of three different types of airplanes: Boeing 737, Airbus A320, and Airbus A380.
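For readers new to quantile regression, the objective behind the abstract can be sketched with the standard check (pinball) loss $\rho_\tau(u) = u(\tau - 1\{u < 0\})$. The plain-Python snippet below is only an illustration with hypothetical function names and toy data, not the authors' implementation: it fits a constant $\tau$-quantile by minimizing the average check loss, and it shows the robustness mentioned above, since a single large outlier does not move the fitted median.

```python
def check_loss(u, tau):
    """Quantile (pinball) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def average_check_loss(y, q, tau):
    """Average check loss of the constant prediction q on the sample y."""
    return sum(check_loss(yi - q, tau) for yi in y) / len(y)


def best_constant(y, tau):
    """Constant minimizing the average check loss.

    A minimizer is always attained at an observed value, so searching
    over the distinct sample points is enough for this toy example.
    """
    return min(sorted(set(y)), key=lambda q: average_check_loss(y, q, tau))


# Toy data (made up for illustration) with one extreme outlier.
y = [1.0, 2.0, 3.0, 4.0, 100.0]

median_fit = best_constant(y, tau=0.5)  # unaffected by the outlier
upper_fit = best_constant(y, tau=0.9)   # the upper quantile tracks the tail
print(median_fit, upper_fit)
```

In the regression setting of the paper the constant is replaced by a linear predictor in high-dimensional covariates and a penalty is added; that step is omitted here.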

Paper Structure (17 sections, 8 theorems, 105 equations, 15 figures)

Introduction
Transfer Learning Setup for Quantile Regression
Estimation Procedures via Transfer Learning
Oracle Procedure with Known Informative Set
Identification of Informative Set
Theoretical Analysis
Simulation Studies
Homogeneous Design
Heterogeneous Designs
Application on Airplane Hard Landing
Discussion
Proofs
Proofs of Theorem 4.1
Proofs of Theorem 4.2
Additional Results for Simulations
...and 2 more sections

Key Result

Theorem 4.1

Suppose Assumptions A1--A6 hold. Take $\lambda_1 \asymp \sqrt{\log(p \vee \overline{n}_{\cal A})/(n_{\cal A} + n_0)}$ and $\lambda_0 \asymp \lambda_2 \asymp \sqrt{\log(p \vee n_0)/n_0}$. Then, with … we have … for some positive constants $C_1, \ldots, C_5$, where $\underline{s}_{\cal A} = \min_{k \in {\cal A} \cup \{0\}} s_k$.
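To get a feel for the tuning-parameter orders in Theorem 4.1, the sketch below evaluates them numerically. It rests on several assumptions: the unspecified proportionality constant is set to a hypothetical `c = 1`, the quantities $n_{\cal A}$ and $\overline{n}_{\cal A}$ are passed in directly because this summary does not define them, and the example sizes are invented. It is meant only to show how the pooled-step penalty shrinks as source samples are added, not to reproduce the authors' tuning.

```python
import math


def lambda_rates(p, n0, n_A, n_bar_A, c=1.0):
    """Evaluate the orders of lambda_1, lambda_0, lambda_2 from Theorem 4.1.

    p       : number of covariates
    n0      : target sample size
    n_A     : the quantity n_A in the theorem (supplied by the caller)
    n_bar_A : the quantity n-bar_A in the theorem (supplied by the caller)
    c       : hypothetical constant; the theorem only fixes the order
    """
    lam1 = c * math.sqrt(math.log(max(p, n_bar_A)) / (n_A + n0))
    lam0 = c * math.sqrt(math.log(max(p, n0)) / n0)
    lam2 = lam0  # the theorem takes lambda_0 and lambda_2 of the same order
    return lam1, lam0, lam2


# Invented sizes for illustration only.
lam1, lam0, lam2 = lambda_rates(p=500, n0=200, n_A=1800, n_bar_A=600)
print(f"lambda_1 ~ {lam1:.4f}, lambda_0 ~ lambda_2 ~ {lam0:.4f}")
```

With these made-up sizes the order of $\lambda_1$ is noticeably smaller than that of $\lambda_0$, reflecting the larger denominator $n_{\cal A} + n_0$.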

Figures (15)

Figure 1: Schematic of the procedures in transfer learning quantile regression.
Figure 2: Schematic of the procedures in informative set detection.
Figure 3: Average quantile loss among different $d$ with $\eta=20$ in homogeneous setting for normal and Cauchy error.
Figure 4: Average MSE on $\boldsymbol{\beta}$ among different $d$ with $\eta=20$ in homogeneous setting for normal and Cauchy error.
Figure 5: Average running time comparison under $n = 10000$ with different $p$.
...and 10 more figures

Theorems & Definitions (12)

Theorem 4.1: Convergence rate of the oracle estimator
Theorem 4.2: Consistency of $\widehat{\cal A}$
Lemma A.1: Proposition 11 in chen2020distributed
Remark
Lemma A.2: Theorem 1 in raskutti2010restricted
Lemma A.3: Bernstein's inequality (Theorem 1.13 of rigollet2015high)
Lemma A.4
Proof of Lemma A.4
Lemma A.5
Proof of Lemma A.5
...and 2 more
