Leave-One-Out Prediction for General Hypothesis Classes

Jian Qian; Jiachen Xu

Abstract

Leave-one-out (LOO) prediction provides a principled, data-dependent measure of generalization, yet guarantees in fully transductive settings remain poorly understood beyond specialized models. We introduce Median of Level-Set Aggregation (MLSA), a general aggregation procedure based on empirical-risk level sets around the ERM. For arbitrary fixed datasets and losses satisfying a mild monotonicity condition, we establish a multiplicative oracle inequality for the LOO error of the form
\[ \mathrm{LOO}_S(\hat{h}) \;\le\; C \cdot \frac{1}{n} \min_{h\in \mathcal{H}} L_S(h) \;+\; \frac{\mathrm{Comp}(S,\mathcal{H},\ell)}{n}, \qquad C>1. \]
The analysis is based on a local level-set growth condition controlling how the set of near-minimizers of the empirical risk expands as the tolerance increases. We verify this condition in several canonical settings. For classification with VC classes under the 0-1 loss, the resulting complexity scales as $O(d \log n)$, where $d$ is the VC dimension. For finite hypothesis and density classes under bounded or log loss, it scales as $O(\log |\mathcal{H}|)$ and $O(\log |\mathcal{P}|)$, respectively. For logistic regression with bounded covariates and parameters, a volumetric argument based on the empirical covariance matrix yields complexity scaling as $O(d \log n)$ up to problem-dependent factors.
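To make the objects in the abstract concrete, the following is a minimal Python sketch for a finite hypothesis class. It is an illustration under stated assumptions, not the paper's MLSA procedure: the abstract does not specify how predictions are combined inside a level set or which tolerance grid is used, so the in-level averaging, the `tolerances` argument, and all function names here are hypothetical. It also assumes that $L_S(h)$ denotes the cumulative (unnormalized) loss on the sample, which is how the $1/n$ factor in the displayed bound reads.

```python
from statistics import median


def empirical_risk(h, data, loss):
    """Cumulative loss of hypothesis h on the sample (assumed meaning of L_S(h))."""
    return sum(loss(h(x), y) for x, y in data)


def level_set(hypotheses, data, loss, tol):
    """Hypotheses whose cumulative loss is within `tol` of the ERM's loss."""
    risks = [empirical_risk(h, data, loss) for h in hypotheses]
    best = min(risks)
    return [h for h, r in zip(hypotheses, risks) if r <= best + tol]


def level_set_median_predict(hypotheses, data, loss, x, tolerances):
    """Average predictions inside each level set, then take the median
    across tolerance levels. The averaging rule is an assumption made
    for illustration only."""
    per_level = []
    for tol in tolerances:
        members = level_set(hypotheses, data, loss, tol)
        per_level.append(sum(h(x) for h in members) / len(members))
    return median(per_level)


def loo_error(predict, data, loss):
    """Average loss when each point is predicted from the other n - 1 points."""
    total = 0.0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        total += loss(predict(rest, x), y)
    return total / len(data)


# Toy usage: three threshold classifiers on four labeled points.
absolute_loss = lambda p, y: abs(p - y)
thresholds = [lambda x, t=t: int(x > t) for t in (0.5, 1.5, 2.5)]
sample = [(0.0, 0), (1.0, 0), (2.0, 1), (3.0, 1)]
predictor = lambda rest, x: level_set_median_predict(
    thresholds, rest, absolute_loss, x, tolerances=[0, 1, 2]
)
print(loo_error(predictor, sample, absolute_loss))
```

On binary predictions the absolute loss coincides with the 0-1 loss, so the level sets in this toy example are the ones the 0-1 loss would induce, while the fractional aggregated prediction is still scored meaningfully.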

Paper Structure (25 sections, 18 theorems, 199 equations, 1 algorithm)


Introduction
Contributions.
Organization.
Problem Setup
Notation
Transductive Leave-One-Out Prediction
Oracle Inequalities for LOO Prediction
Median of Level-Set Aggregation
Local Level-Set Growth
Median Aggregation over Tolerance Levels
Application to Classification with $0$-$1$ Loss
Application to Regression with Convex Loss
Application to Density Estimation with Log Loss
Removing Boundedness Assumptions by Smoothing
Application to Logistic Regression
...and 10 more sections

Key Result

Proposition 3.1

Suppose $\mathrm{Agg}$ satisfies Assumption (ass:agg) and that $\mathcal{H}$ and $\ell$ satisfy Assumption (ass:key) with parameters $(\mu,t,\Delta,C_g)$. Then

Theorems & Definitions (39)

Proposition 3.1
proof : Proof of Proposition 3.1
Theorem 3.1
proof : Proof sketch of Theorem 3.1
Lemma 4.1
proof : Proof sketch of Lemma 4.1
Corollary 4.1
Lemma 5.1
Corollary 5.1
Lemma 6.1
...and 29 more
