Transfer Learning under Covariate Shift: Local $k$-Nearest Neighbours Regression with Heavy-Tailed Design

Petr Zamolodtchikov, Hanyuan Hang

TL;DR

A new concept called density ratio exponent is introduced to quantify the relative decay rates of marginal distributions' tails under covariate shift, and the local k-nearest neighbour regressor is proposed, which adapts the number of nearest neighbours based on the marginal likelihood of each test sample.

Abstract

Covariate shift is a common transfer learning scenario where the marginal distributions of input variables vary between source and target data while the conditional distribution of the output variable remains consistent. The existing notions describing differences between marginal distributions face limitations in handling scenarios with unbounded support, particularly when the target distribution has a heavier tail. To overcome these challenges, we introduce a new concept called density ratio exponent to quantify the relative decay rates of marginal distributions' tails under covariate shift. Furthermore, we propose the local k-nearest neighbour regressor for transfer learning, which adapts the number of nearest neighbours based on the marginal likelihood of each test sample. From a theoretical perspective, convergence rates with and without supervision information on the target domain are established. Those rates indicate that our estimator achieves faster convergence rates when the density ratio exponent satisfies certain conditions, highlighting the benefits of using density estimation for determining different numbers of nearest neighbours for each test sample. Our contributions enhance the understanding and applicability of transfer learning under covariate shift, especially in scenarios with unbounded support and heavy-tailed distributions.
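The idea of choosing a different number of neighbours for each test point can be illustrated with a minimal sketch. It is not the authors' estimator: the $k$-NN density estimate, the linear rule mapping estimated density to `k_local`, and the names `knn_density`, `local_knn_predict`, `k0` and `k_max` are all assumptions made for illustration, since the abstract does not specify how the number of neighbours is selected.

```python
import numpy as np


def knn_density(x_train, x_query, k0):
    """k-NN density estimate (up to a constant): k0 / (n * r_k0(x)^d)."""
    n, d = x_train.shape
    # Pairwise Euclidean distances, shape (n_query, n_train).
    dists = np.linalg.norm(x_query[:, None, :] - x_train[None, :, :], axis=2)
    # Distance from each query point to its k0-th nearest training point.
    r_k0 = np.sort(dists, axis=1)[:, k0 - 1]
    return k0 / (n * np.maximum(r_k0, 1e-12) ** d), dists


def local_knn_predict(x_train, y_train, x_query, k0=10, k_max=50):
    """Average the k(x) nearest responses, where k(x) depends on the estimated density."""
    dens, dists = knn_density(x_train, x_query, k0)
    # Hypothetical rule (an assumption, not from the paper): scale k linearly
    # with the estimated density relative to the densest query point.
    k_local = np.clip(np.ceil(k_max * dens / dens.max()).astype(int), 1, k_max)
    order = np.argsort(dists, axis=1)
    preds = np.array([y_train[order[i, :k]].mean() for i, k in enumerate(k_local)])
    return preds, k_local
```

Under this rule, a query point in the bulk of a heavy-tailed design averages over many neighbours, while a point far in the tail, where training samples are sparse, falls back to only a few.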

Paper Structure

This paper contains 10 sections, 24 theorems, 200 equations, 1 figure.

Key Result

Theorem 9

Let $r_T := \gamma/(\gamma + 1) \wedge 2\beta/(2\beta + d)$ and $r_M := \rho/(2\rho + d)\wedge 2\beta/(2\beta + d).$ Consider arbitrary positive constants $\kappa_{\operatorname{P}}$ and $\kappa_{\operatorname{Q}}.$
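As a small worked illustration of these two definitions, the sketch below evaluates $r_T$ and $r_M$ numerically. It assumes that $\wedge$ denotes the minimum and binds more loosely than division, so that $r_T = \min\{\gamma/(\gamma+1),\, 2\beta/(2\beta+d)\}$. The function name `rate_exponents` is hypothetical, and the meaning of $\gamma$, $\rho$, $\beta$ and $d$ is not defined in this excerpt, so the inputs are arbitrary positive numbers.

```python
def rate_exponents(gamma, rho, beta, d):
    """Evaluate r_T and r_M as written in the theorem, reading the wedge as min."""
    smooth = 2 * beta / (2 * beta + d)        # term shared by both exponents
    r_t = min(gamma / (gamma + 1), smooth)    # r_T
    r_m = min(rho / (2 * rho + d), smooth)    # r_M
    return r_t, r_m
```

For example, with $\gamma = \rho = \beta = 1$ and $d = 2$, both terms of $r_T$ equal $1/2$, so $r_T = 1/2$, while $r_M = \min\{1/4, 1/2\} = 1/4$.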

Figures (1)

  • Figure 1: Comparison of the rates of the one-sample standard (red) and local (black) regressors. $r$ represents the source rate as a function of $\gamma$ in the theorems for the standard and local $k$-nearest neighbour regressors.

Theorems & Definitions (62)

  • Definition 2: Hölder Class of Functions
  • Definition 4: Density Ratio Exponent
  • Definition 5
  • Definition 6
  • Example 7: Exponential distributions
  • Example 8: Pareto distributions
  • Theorem 9
  • Definition 10
  • Theorem 11
  • Example 12: Rates for exponential source-target pairs
  • ...and 52 more