An adaptive transfer learning perspective on classification in non-stationary environments
Henry W J Reeve
TL;DR
This work develops an adaptive transfer-learning framework for binary classification under non-stationary label-shift by decoupling estimation of the transformed density ratio $\eta$ from estimation of the current label probability $\pi_n$. It introduces a Legendre-polynomial-based estimator for $\pi_n$ and a local-confidence-bound approach for $\eta$, enabling a fully adaptive plug-in classifier $\hat{\varphi}_n$ without prior knowledge of the dynamics. The paper proves high-probability bounds on the single-time-step excess risk and derives average dynamic-regret bounds that accommodate smooth evolution and occasional jumps in the label marginals, expressed in terms of $\mathbb{V}(\pi)$ or related variation metrics. The results unify adaptive statistical estimation with sequential decision guarantees, highlighting the merit of an estimation-centered perspective in non-stationary transfer learning. Overall, the paper provides concrete, distribution-dependent error controls and shows that adaptive estimation can match or approach online-dynamic strategies in non-stationary environments.
Abstract
We consider a semi-supervised classification problem with non-stationary label-shift, in which we observe a labelled data set followed by a sequence of unlabelled covariate vectors whose marginal class-label probabilities may change over time. Our objective is to predict the corresponding class label for each covariate vector, without ever observing the ground-truth labels beyond the initial labelled data set. Previous work has demonstrated the potential of sophisticated variants of online gradient descent to perform competitively with the optimal dynamic strategy (Bai et al. 2022). In this work we explore an alternative approach grounded in statistical methods for adaptive transfer learning. We demonstrate the merits of this alternative methodology by establishing a high-probability regret bound on the test error at any given individual test-time, which adapts automatically to the unknown dynamics of the marginal label probabilities. Furthermore, we give bounds on the average dynamic regret which match the average guarantees of the online learning perspective for any given time interval.
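To illustrate why separating a time-invariant quantity from the current label probability is natural under label shift, the following minimal sketch applies the standard Bayes prior correction to a source posterior. It is not the estimator of this paper: the function names `prior_shift_posterior` and `plug_in_classify` are hypothetical, the source posterior `eta_src` is assumed to be given rather than estimated, and the current class-1 probability `pi_tgt` is assumed known, whereas the paper estimates it from unlabelled data.

```python
import numpy as np


def prior_shift_posterior(eta_src, pi_src, pi_tgt):
    """Re-weight a source posterior P(Y=1 | x) for a new class-1 prior.

    Under label shift the class-conditional distributions of X are fixed,
    so Bayes' rule gives the target posterior from the source posterior
    and the two class-1 priors (assumed to lie strictly between 0 and 1).
    """
    eta_src = np.asarray(eta_src, dtype=float)
    # Un-normalised weight of class 1 after replacing the prior.
    num = (pi_tgt / pi_src) * eta_src
    # Add the corresponding un-normalised weight of class 0.
    den = num + ((1.0 - pi_tgt) / (1.0 - pi_src)) * (1.0 - eta_src)
    return num / den


def plug_in_classify(eta_src, pi_src, pi_tgt):
    """Hypothetical plug-in rule: predict 1 when the corrected posterior is at least 1/2."""
    return (prior_shift_posterior(eta_src, pi_src, pi_tgt) >= 0.5).astype(int)


# Illustrative values only: the same source posteriors classified under two priors.
eta_values = np.array([0.3, 0.5, 0.6])
labels_balanced = plug_in_classify(eta_values, pi_src=0.5, pi_tgt=0.5)
labels_rare = plug_in_classify(eta_values, pi_src=0.5, pi_tgt=0.2)
```

In this sketch only `pi_tgt` varies with time, so the covariate-dependent part can be learned once from the labelled data while the prior is tracked separately on the unlabelled stream.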
