Weighted Leave-One-Out Cross Validation

Luc Pronzato, Maria-João Rendas

TL;DR

The paper addresses the estimation of the Integrated Squared Error $\mathsf{ISE}(\eta_n)=\int_{\mathscr X} \varepsilon_n^2(\mathbf{x})\,\mu(d\mathbf{x})$ of a predictor $\eta_n$ that is linear in the observations of a Gaussian Process (GP), and introduces a weighted LOOCV approach based on the Best Linear Predictor (BLP) of the squared prediction error. Using GP moments, the authors derive a BLP estimator $\widehat{\mathsf{ISE}}_{BLP}(\eta_n)$ that weights the squared LOO residuals and yields substantially more accurate ISE estimates than standard, unweighted LOOCV, while also addressing covariate shift. The framework includes a BLUP-specific variant, a bias-corrected version, and a nugget-augmented extension for noisy observations. The authors also analyze the independent-kernel and flat-kernel limits, and demonstrate robustness to kernel misspecification through numerical experiments on environmental and piston models. The result is a practical tool for assessing predictive performance and selecting models in GP-based computer experiments, applicable to space-filling designs and GP-based predictors.

Abstract

We present a weighted version of Leave-One-Out (LOO) cross-validation for estimating the Integrated Squared Error (ISE) when approximating an unknown function by a predictor that depends linearly on evaluations of the function over a finite collection of sites. The method relies on the construction of the best linear estimator of the squared prediction error at an arbitrary unsampled site based on squared LOO residuals, assuming that the function is a realization of a Gaussian Process (GP). A theoretical analysis of performance of the ISE estimator is presented, and robustness with respect to the choice of the GP kernel is investigated first analytically, then through numerical examples. Overall, the estimation of ISE is significantly more precise than with classical, unweighted, LOO cross validation. Application to model selection is briefly considered through examples.
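For orientation, the classical unweighted baseline that the abstract compares against can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes a zero-mean simple-kriging predictor on a one-dimensional design, a Matérn 3/2 kernel parametrized as $(1+\sqrt{3}\,\theta r)\exp(-\sqrt{3}\,\theta r)$, and a test function, design and $\theta$ chosen arbitrarily. The helper names (`matern32`, `loo_residuals`, `ise_loo`) are hypothetical. The LOO residuals are computed with the well-known shortcut $e_i=[\mathbf{K}^{-1}\mathbf{y}]_i/[\mathbf{K}^{-1}]_{ii}$ and checked against explicit refitting; the unweighted estimate is taken to be the mean of their squares. The weighted estimator of the paper replaces these equal weights and is not reproduced here.

```python
import numpy as np

def matern32(a, b, theta):
    # Matérn 3/2 kernel on 1-D inputs (assumed parametrization):
    # K(r) = (1 + sqrt(3)*theta*r) * exp(-sqrt(3)*theta*r)
    r = np.abs(a[:, None] - b[None, :])
    s = np.sqrt(3.0) * theta * r
    return (1.0 + s) * np.exp(-s)

def loo_residuals(K, y):
    # Shortcut for zero-mean simple kriging:
    # e_i = y_i - prediction of y_i from the other points
    #     = [K^{-1} y]_i / [K^{-1}]_{ii}
    K_inv = np.linalg.inv(K)
    return (K_inv @ y) / np.diag(K_inv)

def loo_residuals_brute(x, y, theta):
    # Reference implementation: refit the predictor n times,
    # each time leaving one design point out.
    n = len(x)
    res = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        K_sub = matern32(x[keep], x[keep], theta)
        k_i = matern32(x[i:i + 1], x[keep], theta)  # shape (1, n-1)
        pred = k_i @ np.linalg.solve(K_sub, y[keep])
        res[i] = y[i] - pred[0]
    return res

def ise_loo(K, y):
    # Classical unweighted LOO estimate: mean of squared LOO residuals.
    return float(np.mean(loo_residuals(K, y) ** 2))

# Illustrative data (all values are assumptions made for this sketch).
x = np.linspace(0.0, 1.0, 8)
y = np.sin(2.0 * np.pi * x)
theta = 5.0
K = matern32(x, x, theta)

residuals = loo_residuals(K, y)
estimate = ise_loo(K, y)
print("LOO residuals:", residuals)
print("Unweighted LOO estimate of ISE:", estimate)
```

The shortcut needs a single matrix inversion instead of $n$ refits, which is why squared LOO residuals are cheap inputs for any reweighting scheme built on top of them.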

Paper Structure

This paper contains 39 sections, 61 equations, 21 figures, 3 tables.

Figures (21)

  • Figure 1: Left: $f(x)$ ( ---) and $\eta_n(x)$ for the designs $\mathbf{X}_n(0.015)$ ($\cdots$ with $\star$ and $\triangledown$) and $\mathbf{X}_n(0.1)$ ($\cdots$ with $\star$ and $\circ$). Right: $\log_{10}[\widehat{\mathsf{ISE}}_{LOO}(\eta_n)/\mathsf{ISE}(\eta_n)]$ (- - - with $\circ$) and $\log_{10}[\widehat{\mathsf{ISE}}_{BLP}(\eta_n)/\mathsf{ISE}(\eta_n)]$ ( - - - with $+$); $\log_{10}[\mathsf{E}\{\widehat{\mathsf{ISE}}_{LOO}(\eta_n)\}/\mathsf{IMSE}(\eta_n)]$ (--- with $\triangledown$) and $\log_{10}[\mathsf{E}\{\widehat{\mathsf{ISE}}_{BLP}(\eta_n)\}/\mathsf{IMSE}(\eta_n)]$ ( --- with $\star$) as functions of $\delta$.
  • Figure 2: $\log_{10}[\widehat{\mathsf{ISE}}_{LOO}(\eta_n)]$, $\log_{10}[\widehat{\mathsf{ISE}}_{BLP}(\eta_n)]$ and $\log_{10}[\mathsf{ISE}(\eta_n)]$ for the particular realization on the left panel of Figure 1, and $\log_{10}[\mathsf{E}\{\widehat{\mathsf{ISE}}_{LOO}(\eta_n)\}]$, $\log_{10}[\mathsf{E}\{\widehat{\mathsf{ISE}}_{BLP}(\eta_n)\}]$ and $\log_{10}[\mathsf{E}\{\mathsf{ISE}(\eta_n)\}]$, as functions of $\theta_p\in[1,10]$.
  • Figure 3: Estimation of $\mathsf{ISE}(\eta_n)$ by $\widehat{\mathsf{ISE}}_{BLP}(\eta_n)$ when $\eta_n$ is a (non-interpolating) polynomial of total degree 9 with 100 design points forming a regular grid in $[0,1]^2$; $Y_\mathbf{x}\sim\mathsf{GP}(0,K_{3/2,10})$, $K^{(e)}=K_{3/2,\theta_{\rm BLP}}$, $\theta_{\rm BLP}\in[0.001,30]$.
  • Figure 4: Estimation of $\mathsf{ISE}(\eta_n)$ by $\widehat{\mathsf{ISE}}_{BLP}(\eta_n)$ when $\eta_n$ is the BLUP (simple-kriging predictor) for the model $\mathsf{GP}(0,K_{5/2,5})$ on $[0,1]^2$; $Y_\mathbf{x}\sim\mathsf{GP}(0,K_{3/2,10})$, $K^{(e)}=K_{3/2,\theta_{\rm BLP}}$, $\theta_{\rm BLP}\in[0.05,20]$ ($\mathbf{X}_n$ is a regular grid of 100 design points).
  • Figure 5: Same as Figure 4 but for the unbiased estimator $\widehat{\mathsf{ISE}}_{BLUP}(\eta_n)$ of Section \ref{S:ISE_BLUP}.
  • ...and 16 more figures