Simultaneous analysis of approximate leave-one-out cross-validation and mean-field inference

Pierre C Bellec

TL;DR

This work addresses estimating the generalization error of high-dimensional regularized estimators in the proportional regime by unifying Approximate Leave-One-Out Cross-Validation (ALO-CV) and mean-field inference through a conditioning-based argument that accommodates non-differentiable penalties under Gaussian design. It shows that ALO-CV is consistent with the leave-one-out quantity and that its data-dependent weights $W_i$ concentrate around mean-field corrections such as $\mathrm{tr}[\Sigma \hat{A}]$, thereby reconciling the ALO-CV view with mean-field predictions. The paper provides explicit forms for the key matrix $\hat{A}$ in common penalties, probabilistic concentration results beyond rotational invariance, and two conditioning examples—robust linear regression and single-index models—where polylogarithmic rates yield practical LOO-consistency results. Overall, the work offers a rigorous bridge between a computationally efficient ALO-CV approach and mean-field theoretical descriptions, enhancing the reliability of generalization-error estimates in high-dimensional settings.

Abstract

Approximate Leave-One-Out Cross-Validation (ALO-CV) is a method that has been proposed to estimate the generalization error of a regularized estimator in the high-dimensional regime where dimension and sample size are of the same order, the so-called "proportional regime". A new analysis is developed to derive the consistency of ALO-CV for non-differentiable regularizers under Gaussian covariates and strong convexity of the regularizer. Using a conditioning argument, the difference between the ALO-CV weights and their counterparts in mean-field inference is shown to be small. Combined with upper bounds on the difference between the mean-field inference estimate and the leave-one-out quantity, this provides a proof that ALO-CV also approximates the leave-one-out quantity up to negligible error terms. Linear models with square loss, robust linear regression and single-index models are explicitly treated.
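
To make the leave-one-out quantity concrete, the sketch below covers only the simplest special case: square loss with a ridge penalty, which is strongly convex and differentiable. In that case a classical identity (Sherman-Morrison) gives the leave-one-out residual exactly from a single fit, as $(y_i - x_i^T\hat{b})/(1 - h_i)$ with $h_i = x_i^T(X^TX + \lambda I_p)^{-1}x_i$. This is not the paper's general ALO-CV estimator for non-differentiable penalties. The function names, the penalty scaling $\lambda\|b\|^2/2$ and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Ridge estimator: argmin_b ||y - X b||^2 / 2 + lam * ||b||^2 / 2."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def loo_residuals_bruteforce(X, y, lam):
    """Leave-one-out residuals y_i - x_i^T b_{-i}, refitting n times."""
    n = X.shape[0]
    res = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i  # drop observation i
        b_minus_i = ridge_fit(X[mask], y[mask], lam)
        res[i] = y[i] - X[i] @ b_minus_i
    return res


def loo_residuals_shortcut(X, y, lam):
    """Same residuals from a single fit, via (y_i - x_i^T b) / (1 - h_i)."""
    p = X.shape[1]
    G_inv = np.linalg.inv(X.T @ X + lam * np.eye(p))
    b_hat = G_inv @ (X.T @ y)
    # h_i = x_i^T G_inv x_i, the diagonal of the ridge "hat" matrix
    h = np.einsum("ij,jk,ik->i", X, G_inv, X)
    return (y - X @ b_hat) / (1.0 - h)


def demo(n=60, p=30, lam=1.0, seed=0):
    """Simulated Gaussian design with n and p of the same order (illustrative)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    beta = rng.standard_normal(p) / np.sqrt(p)
    y = X @ beta + rng.standard_normal(n)
    return loo_residuals_bruteforce(X, y, lam), loo_residuals_shortcut(X, y, lam)
```

Calling `demo()` returns the brute-force and shortcut residual vectors, which agree up to floating-point error in this ridge case. For general losses and penalties no such exact identity is available, which is where approximate formulas and the consistency analysis described in the abstract come in.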

Paper Structure

This paper contains 13 sections, 7 theorems, 106 equations.

Key Result

Proposition 1.1

Consider iid $(x_i,y_i)_{i\in[n]}$ with $x_i\sim N(0, \Sigma)$, and $w\in\mathbb{R}^p$ with $\mathbb{E}[(w^Tx_i)^2]\in\{0, 1\}$ such that $(I_p - \Sigma ww^T)x_i$ is independent of $(x_i^Tw, y_i)$. Assume that for all values of $y\in\mathcal{Y}$, the loss $L_y(\cdot)$ is differentiable and 1-Lipschitz […] is almost everywhere differentiable with $\frac{\partial}{\partial x_{ij}}\hat{b} = \hat{A}(-e_j L_ […]

Theorems & Definitions (15)

  • Proposition 1.1
  • Proof of Proposition 1.1
  • Example 1.1
  • Example 1.2
  • Example 1.3: group-lasso
  • Proposition 2.1
  • Proof of Proposition 2.1
  • Theorem 3.1
  • Corollary 3.2
  • Corollary 4.1
  • ...and 5 more