Adapting to Continuous Covariate Shift via Online Density Ratio Estimation

Yu-Jie Zhang, Zhen-Yu Zhang, Peng Zhao, Masashi Sugiyama

TL;DR

This work tackles continuous covariate shift, where test distributions vary sequentially while the conditional label distribution remains fixed. The authors extend importance-weighted ERM (IWERM) by introducing online density-ratio estimation (DRE) with dynamic regret guarantees and an online ensemble to balance historical data reuse. A concrete instantiation using logistic-regression-based DRE yields a high-probability dynamic regret bound of $\widetilde{O}(T^{1/3}V_T^{2/3})$ and an averaged excess risk bound of $\widetilde{O}(T^{-1/3}V_T^{1/3})$, and experiments demonstrate superior performance over standard covariate-shift baselines, especially under rapidly changing environments. The approach decouples density-ratio estimation from predictor training, enabling versatile deployment across models and suggesting broad applicability to non-stationary distribution shifts in real-world settings.
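
The two rates above can be read together through a short calculation. This is only a sketch, and it rests on an assumption not stated in this summary: that the averaged excess risk is controlled by the square root of the time-averaged regret of the density-ratio estimator (for instance through Jensen's inequality). Here $V_T$ is taken to be a measure of how much the environment varies over $T$ rounds. Under that assumption the exponents line up:

$$
\sqrt{\frac{1}{T}\cdot T^{1/3}V_T^{2/3}} \;=\; \sqrt{T^{-2/3}V_T^{2/3}} \;=\; T^{-1/3}V_T^{1/3}.
$$

As an illustration with constants and logarithmic factors dropped, $T=10^6$ and $V_T=10^3$ would give $T^{1/3}V_T^{2/3}=10^2\cdot 10^2=10^4$ and $T^{-1/3}V_T^{1/3}=10^{-2}\cdot 10=10^{-1}$.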

Abstract

Dealing with distribution shifts is one of the central challenges for modern machine learning. One fundamental situation is the covariate shift, where the input distributions of data change from training to testing stages while the input-conditional output distribution remains unchanged. In this paper, we initiate the study of a more challenging scenario -- continuous covariate shift -- in which the test data appear sequentially, and their distributions can shift continuously. Our goal is to adaptively train the predictor such that its prediction risk accumulated over time can be minimized. Starting with the importance-weighted learning, we show the method works effectively if the time-varying density ratios of test and train inputs can be accurately estimated. However, existing density ratio estimation methods would fail due to data scarcity at each time step. To this end, we propose an online method that can appropriately reuse historical information. Our density ratio estimation method is proven to perform well by enjoying a dynamic regret bound, which finally leads to an excess risk guarantee for the predictor. Empirical results also validate the effectiveness.
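
The importance-weighted idea in the abstract can be illustrated with a minimal sketch. It is a one-shot, offline version, not the online method proposed in the paper: a logistic classifier is trained to tell train inputs from test inputs, its odds are converted into density-ratio estimates, and those estimates weight the training loss. All names here (`fit_logistic`, `estimate_density_ratio`, the synthetic Gaussian data, the step size and iteration count) are hypothetical choices made for the example.

```python
import numpy as np

def fit_logistic(X, y, sample_weight=None, lr=0.5, n_iter=2000, l2=1e-3):
    """Weighted, L2-regularised logistic regression fitted by gradient descent."""
    n, d = X.shape
    Xb = np.hstack([X, np.ones((n, 1))])          # append a bias column
    w = np.zeros(d + 1)
    sw = np.ones(n) if sample_weight is None else np.asarray(sample_weight, float)
    sw = sw / sw.sum()                            # normalise the weights
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))         # predicted P(label = 1 | x)
        grad = Xb.T @ (sw * (p - y)) + l2 * w     # weighted log-loss gradient
        w -= lr * grad
    return w

def predict_proba(w, X):
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return 1.0 / (1.0 + np.exp(-Xb @ w))

def estimate_density_ratio(X_train, X_test):
    """Estimate r(x) = p_test(x) / p_train(x) at the training points."""
    X = np.vstack([X_train, X_test])
    z = np.concatenate([np.zeros(len(X_train)), np.ones(len(X_test))])
    w = fit_logistic(X, z)                        # classify train (0) vs test (1)
    p = np.clip(predict_proba(w, X_train), 1e-6, 1 - 1e-6)
    # Bayes' rule: p_test / p_train = (n_train / n_test) * P(test | x) / P(train | x)
    return (len(X_train) / len(X_test)) * p / (1.0 - p)

# Synthetic covariate shift (assumed data): inputs move from N(0, 1) to N(1, 1),
# while the rule generating labels from inputs stays the same.
rng = np.random.default_rng(0)
X_train = rng.normal(0.0, 1.0, size=(2000, 1))
y_train = (X_train[:, 0] + 0.3 * rng.normal(size=2000) > 0.5).astype(float)
X_test = rng.normal(1.0, 1.0, size=(500, 1))

ratios = estimate_density_ratio(X_train, X_test)
w_iw = fit_logistic(X_train, y_train, sample_weight=ratios)  # importance-weighted ERM
```

In the continuous setting described above, only a few test inputs would be available at each time step, which is why a one-shot estimate like this would be unreliable and historical data has to be reused.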

Paper Structure

This paper contains 38 sections, 24 theorems, 131 equations, 5 figures, 3 tables, 2 algorithms.

Key Result

Proposition 1

For any $\delta\in(0,1]$, with probability at least $1-\delta$, IWERM (eq:IWERM-continuous) with the estimator $\widehat{r}_t(\mathbf{x})$ ensures $\mathfrak{R}_T(\{\widehat{\mathbf{w}}_t\}_{t=1}^T)\leq 2\sum_{t=1}^T\mathbb{E}_{\mathbf{x}\sim S_0}[\vert \widehat{r}_t(\mathbf{x}) - r^*_t(\mathbf{x})\vert] \dots$

Figures (5)

  • Figure 1: An illustration of our online ensemble method, where we employ a meta-learner to aggregate the predictions from base-learners running over different intervals of the time horizon.
  • Figure 2: Performance comparison on the four kinds of shifts.
  • Figure 3: Weight(%) heatmap of base-learners in squared shift.
  • Figure 4: Average error and estimator loss in squared shift.
  • Figure 5: Average error on Yearbook dataset with real-life covariate shift.
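
The aggregation step described for Figure 1 can be sketched with a plain exponential-weights (Hedge) meta-learner. This is a swapped-in, simplified technique and not necessarily the meta-learner used in the paper. The base-learners here are moving averages over windows of different lengths, standing in for learners that run over different intervals of the time horizon. The function name `hedge_aggregate`, the learning rate `eta`, the window lengths, and the synthetic signal are all assumptions made for the example.

```python
import numpy as np

def hedge_aggregate(base_preds, targets, eta=2.0):
    """Combine base-learner predictions with exponential weights.

    base_preds: array of shape (T, K), one column per base-learner.
    targets:    array of shape (T,), revealed after each prediction.
    Returns the combined predictions (T,) and the final weights (K,).
    """
    T, K = base_preds.shape
    log_w = np.zeros(K)                        # log-weights, uniform at the start
    combined = np.empty(T)
    for t in range(T):
        w = np.exp(log_w - log_w.max())        # subtract the max for stability
        w /= w.sum()
        combined[t] = w @ base_preds[t]        # weighted prediction
        losses = (base_preds[t] - targets[t]) ** 2
        log_w -= eta * losses                  # penalise learners that erred
    w = np.exp(log_w - log_w.max())
    return combined, w / w.sum()

# Synthetic signal (assumed): an abrupt shift from 0 to 1 at t = 200, plus noise.
rng = np.random.default_rng(0)
T = 400
signal = np.where(np.arange(T) < 200, 0.0, 1.0) + 0.1 * rng.normal(size=T)

# Base-learners: moving averages of past observations over different windows.
windows = (2, 16, 128)
base_preds = np.zeros((T, len(windows)))
for t in range(1, T):
    for k, L in enumerate(windows):
        base_preds[t, k] = signal[max(0, t - L):t].mean()

combined, final_w = hedge_aggregate(base_preds, signal)
base_mse = ((base_preds - signal[:, None]) ** 2).mean(axis=0)
combined_mse = ((combined - signal) ** 2).mean()
```

Short windows react quickly after the shift, while long windows average out noise when the signal is stable; the meta-learner shifts weight between them according to their recent losses.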

Theorems & Definitions (47)

  • Proposition 1
  • Proposition 2
  • Theorem 1
  • Remark 1: assumptions on $\psi$
  • Remark 2: comparison with previous work
  • Example 1: Logistic regression model
  • Theorem 2
  • Theorem 3
  • Proof of Proposition (ref. lem:IWERM-continuous)
  • Proof of Proposition (ref. lemma:BD-conversion)
  • ...and 37 more