Distribution-Free Sequential Prediction with Abstentions

Jialin Yu; Moïse Blanchard

TL;DR

The paper proposes AbstainBoost, an algorithm based on boosting weak learners, which guarantees sublinear error for general VC classes in distribution-free abstention learning against oblivious adversaries. Similar guarantees hold against adaptive adversaries for structured function classes, including linear classifiers.

Abstract

We study a sequential prediction problem in which an adversary is allowed to inject arbitrarily many adversarial instances into a stream of i.i.d.\ instances, but at each round, the learner may also \emph{abstain} from making a prediction without incurring any penalty if the instance was indeed corrupted. This semi-adversarial setting naturally sits between the classical stochastic case with i.i.d.\ instances, for which function classes with finite VC dimension are learnable, and the adversarial case with arbitrary instances, known to be significantly more restrictive. For this problem, Goel et al. (2023) showed that, if the learner knows the distribution $\mu$ of clean samples in advance, learning can be achieved for all VC classes without restrictions on the adversary's corruptions. This is, however, a strong assumption in both theory and practice: a natural question is whether similar learning guarantees can be achieved without prior distributional knowledge, as is standard in classical learning frameworks (e.g., PAC learning or asymptotic consistency) and other non-i.i.d.\ models (e.g., smoothed online learning). We therefore focus on the distribution-free setting where $\mu$ is \emph{unknown} and propose an algorithm, \textsc{AbstainBoost}, based on a boosting procedure of weak learners, which guarantees sublinear error for general VC classes in \emph{distribution-free} abstention learning for oblivious adversaries. Similar guarantees hold for adaptive adversaries, for structured function classes including linear classifiers. These results are complemented with corresponding lower bounds, which reveal an interesting polynomial trade-off between misclassification error and the number of erroneous abstentions.
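To make the two error notions in the abstract concrete, the following is a minimal Python sketch of how one round-by-round run could be scored. It is an illustration only, not the paper's AbstainBoost. The names `run_protocol`, `cautious_threshold_learner`, `Errors`, and `ABSTAIN` are hypothetical, and the sketch assumes that labels come from a fixed target function, that the true label is revealed after every round, that the learner never sees which instances are corrupted, and that a wrong prediction counts as a misclassification on any round.

```python
from dataclasses import dataclass

ABSTAIN = None  # hypothetical sentinel meaning "the learner makes no prediction"


@dataclass
class Errors:
    misclassification: int = 0  # wrong predictions (assumed counted on every round)
    abstention: int = 0         # abstentions on instances that were NOT corrupted


def run_protocol(learner, stream, target):
    """Score a learner on a stream of (x, corrupted) pairs.

    Assumption: labels are target(x) and are revealed after each round.
    The 'corrupted' flag is used only for scoring, never shown to the learner.
    """
    errors = Errors()
    history = []
    for x, corrupted in stream:
        y = target(x)
        pred = learner(x, history)
        if pred is ABSTAIN:
            if not corrupted:
                errors.abstention += 1  # erroneous abstention on a clean instance
        elif pred != y:
            errors.misclassification += 1
        history.append((x, y))
    return errors


def cautious_threshold_learner(x, history):
    """Toy learner for 1-D thresholds: predict only when past labels force the answer."""
    lo = max((u for u, y in history if y == 0), default=float("-inf"))
    hi = min((u for u, y in history if y == 1), default=float("inf"))
    if x <= lo:
        return 0
    if x >= hi:
        return 1
    return ABSTAIN  # x lies in the region where the label is still undetermined


def target(x):
    # Hypothetical ground-truth threshold classifier.
    return int(x >= 0.5)


# Second field marks instances injected by the adversary.
stream = [(0.2, False), (0.9, False), (0.1, False), (0.6, True), (0.95, False)]
errors = run_protocol(cautious_threshold_learner, stream, target)
print(errors)
```

In this toy run the learner abstains on the first two clean instances (two erroneous abstentions), abstains at no cost on the corrupted instance, and never misclassifies. The trade-off studied in the paper concerns how small both counts can be made simultaneously.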

Paper Structure (45 sections, 19 theorems, 117 equations, 2 figures, 7 algorithms)

Introduction
Distribution-free sequential learning with abstentions.
Our contributions.
Organization.
Preliminaries
Abstention learning setup.
Oblivious and adaptive adversaries.
Complexity notion for the function class.
Further notation.
Main Results
Open question.
Overview of our algorithmic approach
Construction of weak learners.
Boosting strategy to combine weak learners.
Abstention learning with few uncorrupted samples
...and 30 more sections

Key Result

Theorem 2

Let $\mathcal{F}$ be a function class with VC dimension $d$, and let $T\geq 1$ be the horizon. Then, for any $\alpha\in[0,1/4]$, there is a choice of parameters for AbstainBoost that achieves the following learning guarantee against any oblivious adversary:

Figures (2)

Figure 1: Tradeoffs between misclassification error and abstention error for either (1) oblivious adversaries or (2) adaptive adversaries and function classes with finite reduction dimension (see Definition \ref{def:reduction_dimension}), including linear classifiers. The green region 1 is achievable (Theorems \ref{thm:main_oblivious_upper_bound} and \ref{thm:main_adaptive_upper_bound}), while the red region 2 is not (Theorem \ref{thm:lowerbound}). The plot is displayed in log-scale.
Figure 2: Tradeoffs between abstention error and misclassification error for VC-1 function classes and axis-aligned rectangles. The upper bound (region 1) from Goel et al. (2023) applies to general adaptive adversaries, and the lower bound (region 2) is the same result (Theorem \ref{thm:lowerbound}) as in Figure \ref{fig:tradeoff results}. The plot is displayed in log-scale.

Theorems & Definitions (26)

Remark 1
Definition 1: Oblivious and adaptive adversaries
Definition 2: VC dimension
Theorem 2
Theorem 3
Remark 4
Definition 3
Theorem 5
Theorem 6
Theorem 7
...and 16 more
