Robust Transfer Learning with Unreliable Source Data

Jianqing Fan; Cheng Gao; Jason M. Klusowski

TL;DR

A novel quantity called the "ambiguity level" is introduced that measures the discrepancy between the target and source regression functions, and a general theorem is established that shows how this new quantity is related to the transferability of learning in terms of risk improvements.

Abstract

This paper addresses challenges in robust transfer learning stemming from ambiguity in Bayes classifiers and weak transferable signals between the target and source distributions. We introduce a novel quantity called the "ambiguity level" that measures the discrepancy between the target and source regression functions, propose a simple transfer learning procedure, and establish a general theorem that shows how this new quantity is related to the transferability of learning in terms of risk improvements. Our proposed "Transfer Around Boundary" (TAB) model, with a threshold balancing the performance of target and source data, is shown to be both efficient and robust, improving classification while avoiding negative transfer. Moreover, we demonstrate the effectiveness of the TAB model on non-parametric classification and logistic regression tasks, achieving upper bounds which are optimal up to logarithmic factors. Simulation studies lend further support to the effectiveness of TAB. We also provide simple approaches to bound the excess misclassification error without the need for specialized knowledge in transfer learning.
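The thresholding idea behind TAB can be illustrated with a minimal sketch. It assumes the rule takes the following form: trust the target-side estimate $\hat{\eta}^Q$ wherever it is at least $\tau$ away from the decision boundary $1/2$, and defer to the source-trained classifier $\hat{f}^P$ otherwise. The function name `tab_predict` and the toy inputs are hypothetical and chosen only for illustration; the paper's own definition should be consulted for the exact rule.

```python
import numpy as np

def tab_predict(eta_hat_q, f_hat_p, tau):
    """Sketch of a 'transfer around boundary' rule (assumed form).

    eta_hat_q : estimated P(Y=1 | X=x) from target data, values in [0, 1]
    f_hat_p   : 0/1 labels predicted by a classifier fit on source data
    tau       : non-negative threshold around the boundary 1/2
    """
    eta_hat_q = np.asarray(eta_hat_q, dtype=float)
    f_hat_p = np.asarray(f_hat_p, dtype=int)

    # Plug-in label from the target estimate alone.
    target_label = (eta_hat_q >= 0.5).astype(int)

    # Points where the target estimate is far enough from 1/2 to be trusted.
    confident = np.abs(eta_hat_q - 0.5) >= tau

    # Keep the target label when confident, otherwise use the source label.
    return np.where(confident, target_label, f_hat_p)

# Toy inputs (made up): two clear points and two points close to 1/2.
eta = [0.95, 0.52, 0.48, 0.10]
src = [0, 0, 1, 1]
print(tab_predict(eta, src, tau=0.1))  # [1 0 1 0]
```

Under this sketch, `tau = 0` reduces to the target-only plug-in classifier, while any `tau > 0.5` defers to the source classifier everywhere, so the threshold interpolates between the two.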

Paper Structure (45 sections, 26 theorems, 463 equations, 3 figures)

Introduction
Related Literature
Main Contribution
Notation and Organization
Model
Problem Formulation
Source Data Ambiguity
On the Ambiguity Level
General Convergence Results
Performance of the TAB model
Simple Approach to Bounding Signal Transfer Risk
Applications in Non-parametric Classification
K-Nearest Neighbor TAB Classifier
Non-parametric Classification Setting
Optimal Rate of Excess Risk
...and 30 more sections

Key Result

Theorem 1

Let $\hat{\eta}^Q$ be an estimate of the regression function $\eta^Q$ and $\hat{f}^P$ be a classifier obtained from $\mathcal{D}_P$. Suppose there exist two sequences $\delta_Q, \delta_f$ such that $\delta_Q^{1+\alpha}\gtrsim n_Q^{-c}$ for some constant $c>0$ and $\alpha$ defined by the margin Assumpt for some constant $C_1>0$. Given the choice of $\tau\gtrsim \log(n_Q\lor n_P)\delta_Q$, the TAB cla...
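The threshold condition $\tau\gtrsim \log(n_Q\lor n_P)\delta_Q$ fixes $\tau$ only up to a constant. As a rough sketch, one could compute a candidate value as follows; the helper name `tab_threshold`, the constant `c`, and the example rate used for `delta_q` are all assumptions made for illustration and are not taken from the paper.

```python
import math

def tab_threshold(n_q, n_p, delta_q, c=1.0):
    """Candidate threshold tau = c * log(max(n_q, n_p)) * delta_q.

    n_q, n_p : target and source sample sizes (positive integers)
    delta_q  : rate of the target regression estimate
    c        : hypothetical constant; the condition only holds up to constants
    """
    if n_q < 1 or n_p < 1:
        raise ValueError("sample sizes must be positive")
    if delta_q < 0:
        raise ValueError("delta_q must be non-negative")
    return c * math.log(max(n_q, n_p)) * delta_q

# Illustrative values only: delta_q set to a parametric-style rate n_q ** -0.5.
n_q, n_p = 200, 5000
tau = tab_threshold(n_q, n_p, delta_q=n_q ** -0.5, c=0.1)
print(tau)
```

In practice the constant matters at finite sample sizes: a value of $\tau$ above $1/2$ would hand every prediction to the source classifier, so `c` would typically be tuned, for example by validation on target data.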

Figures (3)

Figure 2: Accuracy of the TAB $K$-NN classifiers under the band-like ambiguity scenario. We experiment with different values of $\Delta$ for $\gamma=0.5$ and $\gamma=1$. Blue: TAB $K$-NN classifier; Red: $K$-NN classifier on only $Q$-data; Green: $K$-NN classifier on only $P$-data; Brown: $K$-NN classifier on pooled data.
Figure 3: Accuracy of the TAB $K$-NN classifiers under the scenario with partially flipped sine functions. We experiment with different values of the ratio parameter $r$ for $\gamma=0.5$ and $\gamma=1$. Blue: TAB $K$-NN classifier; Red: $K$-NN classifier on only $Q$-data; Green: $K$-NN classifier on only $P$-data; Brown: $K$-NN classifier on pooled data.
Figure 4: Accuracy of the TAB logistic classifier with lasso penalty. We conduct experiments with different choices of the angle $\Delta\in[0,\pi/2]$. Blue: TAB logistic classifier with lasso penalty; Red: Logistic classifier with lasso penalty on only $Q$-data; Green: Logistic classifier with lasso penalty on only $P$-data; Brown: Logistic classifier with lasso penalty on pooled data.

Theorems & Definitions (61)

Definition 1: Signal Strength
Example 1: Perfect Source
Example 2: Strong Signal over $\Omega_P$
Example 3: Strong Signal with Imperfect Transfer
Example 4: Band-like Ambiguity
Definition 2: Signal Transfer Risk
Theorem 1
Theorem 2
Theorem 3
Theorem 4: Non-parametric Classification Upper Bound
...and 51 more
