An adaptive transfer learning perspective on classification in non-stationary environments

Henry W J Reeve


TL;DR

This work develops an adaptive transfer-learning framework for binary classification under non-stationary label shift by decoupling estimation of the transformed density ratio $\eta$ from estimation of the current label probability $\pi_n$. It introduces a Legendre-polynomial-based estimator for $\pi_n$ and a local-confidence-bound approach for $\eta$, which together yield a fully adaptive plug-in classifier $\hat{\varphi}_n$ that requires no prior knowledge of the dynamics. The paper proves high-probability bounds on the excess risk at a single time step and derives average dynamic-regret bounds that accommodate both smooth evolution and occasional jumps in the label marginals, expressed in terms of $\mathbb{V}(\pi)$ or related variation metrics. The results connect adaptive statistical estimation with sequential decision guarantees, highlighting the merit of an estimation-centered perspective in non-stationary transfer learning. Overall, the paper provides concrete, distribution-dependent error controls and shows that adaptive estimation can match or approach online dynamic strategies in non-stationary environments.
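As a rough illustration of the plug-in idea, the sketch below applies the thresholding rule from Lemma 1, $\mathbbm{1}\{\eta(x) > 1 - \pi_n\}$, to a toy problem. Everything specific here is an assumption made for illustration and is not taken from the paper: the two Gaussian class-conditional densities, the form $\eta = f_1/(f_0+f_1)$ for the transformed density ratio, and all function names. The paper's actual estimators of $\eta$ and $\pi_n$ are not reproduced; the true $\eta$ and a given $\pi$ stand in for them.

```python
import math


def gaussian_pdf(x, mean, sd=1.0):
    """Density of a normal distribution (toy class-conditional density)."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def eta(x, mean0=-1.0, mean1=1.0):
    """ASSUMPTION: transformed density ratio eta(x) = f1(x) / (f0(x) + f1(x)).

    It depends only on the class-conditional densities, not on the
    (time-varying) label probability.
    """
    f0 = gaussian_pdf(x, mean0)
    f1 = gaussian_pdf(x, mean1)
    return f1 / (f0 + f1)


def plug_in_classifier(eta_hat_x, pi_hat):
    """Thresholding rule 1{eta(x) > 1 - pi}, fed with plug-in estimates."""
    return 1 if eta_hat_x > 1.0 - pi_hat else 0


def posterior_classifier(x, pi, mean0=-1.0, mean1=1.0):
    """Reference rule: predict 1 when P(Y = 1 | X = x) exceeds 1/2."""
    f0 = gaussian_pdf(x, mean0)
    f1 = gaussian_pdf(x, mean1)
    posterior = pi * f1 / (pi * f1 + (1.0 - pi) * f0)
    return 1 if posterior > 0.5 else 0


# Check on a grid that the two rules agree for several label probabilities.
xs = [i / 10.0 for i in range(-30, 31)]
pis = [0.1, 0.3, 0.5, 0.7, 0.9]
agree = all(
    plug_in_classifier(eta(x), p) == posterior_classifier(x, p)
    for p in pis
    for x in xs
)
print("rules agree on the grid:", agree)
```

Under these assumptions, only the threshold $1 - \pi$ moves when the label probability changes, which is why the two quantities can be estimated separately.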

Abstract

We consider a semi-supervised classification problem with non-stationary label-shift in which we observe a labelled data set followed by a sequence of unlabelled covariate vectors in which the marginal probabilities of the class labels may change over time. Our objective is to predict the corresponding class-label for each covariate vector, without ever observing the ground-truth labels, beyond the initial labelled data set. Previous work has demonstrated the potential of sophisticated variants of online gradient descent to perform competitively with the optimal dynamic strategy (Bai et al. 2022). In this work we explore an alternative approach grounded in statistical methods for adaptive transfer learning. We demonstrate the merits of this alternative methodology by establishing a high-probability regret bound on the test error at any given individual test-time, which adapts automatically to the unknown dynamics of the marginal label probabilities. Furthermore, we give bounds on the average dynamic regret which match the average guarantees of the online learning perspective for any given time interval.


Paper Structure (18 sections, 26 theorems, 166 equations)


Introduction
Statistical setting
Methodology
Estimating the transformed density ratio
Estimating the label probability
A bound on the simple regret
Bounds on the average dynamic regret
The proof structure for the regret bounds
The transformed density ratio estimator
The label probability estimator
A modular regret bound
Completing the error bounds
Confidence intervals for the class-conditional distributions
Estimating the transformed density ratio
Estimating the label probabilities
...and 3 more sections

Key Result

Lemma 1

Suppose $\varphi:\mathcal{X}^{2n_0+2n_1+n}\rightarrow \{0,1\}$ is a classifier. Then [...]. Hence, letting $\varphi_n^\star:\mathcal{X} \rightarrow \{0,1\}$ be the classifier defined by $\varphi_n^\star(x):=\mathbbm{1}\{\eta(x)> 1 -\pi_n\}$ for $x \in \mathcal{X}$, we have [...].
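The threshold $1-\pi_n$ in the definition of $\varphi_n^\star$ can be motivated by a short calculation. The following is only a sketch under assumptions that are not stated in this excerpt: that the transformed density ratio has the form $\eta = f_1/(f_0+f_1)$, where $f_0$ and $f_1$ (hypothetical notation) are the class-conditional densities with respect to a common measure, that $f_0(x)+f_1(x)>0$, and that $0<\pi_n<1$. It is not a reconstruction of the lemma's omitted displays.

```latex
% ASSUMPTION: \eta(x) = f_1(x) / (f_0(x) + f_1(x)), with f_0, f_1 hypothetical notation.
\begin{align*}
\mathbb{P}(Y = 1 \mid X = x)
  &= \frac{\pi_n f_1(x)}{\pi_n f_1(x) + (1-\pi_n) f_0(x)} > \tfrac{1}{2} \\
\iff\quad \pi_n f_1(x) &> (1-\pi_n)\, f_0(x) \\
\iff\quad \pi_n\, \eta(x) &> (1-\pi_n)\,\bigl(1-\eta(x)\bigr)
  && \text{(divide both sides by } f_0(x)+f_1(x)\text{)} \\
\iff\quad \eta(x) &> 1 - \pi_n .
\end{align*}
```

Under these assumptions, the rule $\mathbbm{1}\{\eta(x) > 1-\pi_n\}$ coincides with predicting the more probable label at time $n$, and only the threshold depends on the changing label probability.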

Theorems & Definitions (52)

Lemma 1
Theorem 1
Theorem 2
Corollary 3
Proposition 4
Proposition 5
Corollary 6
Proposition 7
Lemma 2
proof
...and 42 more
