
Leave-One-Out Prediction for General Hypothesis Classes

Jian Qian, Jiachen Xu

Abstract

Leave-one-out (LOO) prediction provides a principled, data-dependent measure of generalization, yet guarantees in fully transductive settings remain poorly understood beyond specialized models. We introduce Median of Level-Set Aggregation (MLSA), a general aggregation procedure based on empirical-risk level sets around the ERM. For arbitrary fixed datasets and losses satisfying a mild monotonicity condition, we establish a multiplicative oracle inequality for the LOO error of the form \[ LOO_S(\hat{h}) \;\le\; C \cdot \frac{1}{n} \min_{h\in H} L_S(h) \;+\; \frac{Comp(S,H,\ell)}{n}, \qquad C>1. \] The analysis is based on a local level-set growth condition controlling how the set of near-optimal empirical-risk minimizers expands as the tolerance increases. We verify this condition in several canonical settings. For classification with VC classes under the 0-1 loss, the resulting complexity scales as $O(d \log n)$, where $d$ is the VC dimension. For finite hypothesis and density classes under bounded or log loss, it scales as $O(\log |H|)$ and $O(\log |P|)$, respectively. For logistic regression with bounded covariates and parameters, a volumetric argument based on the empirical covariance matrix yields complexity scaling as $O(d \log n)$ up to problem-dependent factors.
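The two objects the abstract refers to, empirical-risk level sets around the ERM and the leave-one-out error, can be illustrated on a toy problem. The sketch below is not the paper's MLSA procedure, whose details are not given here. It assumes a hypothetical finite class of 1-D threshold classifiers under the 0-1 loss, uses plain ERM (first minimizer) as the predictor, and all names (`make_threshold`, `level_set`, `loo_error_erm`) and the dataset are invented for illustration.

```python
from typing import Callable, List, Tuple

Example = Tuple[float, int]          # (feature x, label y in {0, 1})
Hypothesis = Callable[[float], int]  # maps a feature to a predicted label


def make_threshold(theta: float) -> Hypothesis:
    """Hypothetical 1-D threshold classifier: predict 1 iff x >= theta."""
    return lambda x: 1 if x >= theta else 0


def cumulative_loss(h: Hypothesis, data: List[Example]) -> int:
    """Cumulative 0-1 loss L_S(h): number of mistakes of h on the dataset."""
    return sum(int(h(x) != y) for x, y in data)


def level_set(hyps: List[Hypothesis], data: List[Example], tol: int) -> List[int]:
    """Indices of hypotheses whose cumulative loss is within tol of the minimum."""
    losses = [cumulative_loss(h, data) for h in hyps]
    best = min(losses)
    return [i for i, loss in enumerate(losses) if loss <= best + tol]


def loo_error_erm(hyps: List[Hypothesis], data: List[Example]) -> float:
    """Average leave-one-out 0-1 error of plain ERM (ties broken by first index)."""
    mistakes = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]          # dataset with point i held out
        losses = [cumulative_loss(h, rest) for h in hyps]
        erm = hyps[losses.index(min(losses))]   # first empirical-risk minimizer
        mistakes += int(erm(x) != y)            # test on the held-out point
    return mistakes / len(data)


# Toy dataset and hypothesis class (assumptions made for this example only).
thetas = [0.0, 1.0, 2.0, 3.0, 4.0]
hyps = [make_threshold(t) for t in thetas]
data = [(0.5, 0), (1.5, 0), (2.5, 1), (3.5, 1), (1.2, 1)]

# Level sets grow as the tolerance increases.
sizes = [len(level_set(hyps, data, tol)) for tol in range(3)]
loo = loo_error_erm(hyps, data)
print("level-set sizes for tol = 0, 1, 2:", sizes)
print("LOO error of plain ERM:", loo)
```

Printing the level-set sizes for increasing tolerance gives a rough picture of the kind of growth that the local level-set growth condition is meant to control. The LOO error of plain ERM serves only as a baseline quantity here.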

Paper Structure

This paper contains 25 sections, 18 theorems, 199 equations, 1 algorithm.

Key Result

Proposition 3.1

Suppose $\mathrm{Agg}$ satisfies Assumption (ass:agg) and that $\mathcal{H}$ and $\ell$ satisfy Assumption (ass:key) with parameters $(\mu,t,\Delta,C_g)$. Then

Theorems & Definitions (39)

  • Proposition 3.1
  • Proof of Proposition 3.1
  • Theorem 3.1
  • Proof sketch of Theorem 3.1
  • Lemma 4.1
  • Proof sketch of Lemma 4.1
  • Corollary 4.1
  • Lemma 5.1
  • Corollary 5.1
  • Lemma 6.1
  • ...and 29 more