Conformal Approach To Gaussian Process Surrogate Evaluation With Coverage Guarantees

Edgar Jaber, Vincent Blot, Nicolas Brunel, Vincent Chabridon, Emmanuel Remy, Bertrand Iooss, Didier Lucor, Mathilde Mougeot, Alessandro Leite

TL;DR

Gaussian process surrogates for expensive simulations rely on Gaussian-based credibility intervals that may misrepresent uncertainty when the model is misspecified. The paper introduces cross-conformal predictors for GPs (J+GP and J-minmax-GP) that weight the non-conformity score by the GP posterior standard deviation to yield adaptive, distribution-free prediction intervals with marginal coverage guarantees. The authors prove theoretical coverage properties, demonstrate adaptivity through the correlation between interval width and local surrogate error, and provide a public implementation evaluated on ML benchmarks and an industrial nuclear-engineering use case. The approach offers a practical tool for GP model evaluation and kernel selection in high-cost UQ settings, reducing dependence on Gaussian assumptions while preserving rigorous guarantees.

Abstract

Gaussian processes (GPs) are a Bayesian machine learning approach widely used to construct surrogate models for the uncertainty quantification of computer simulation codes in industrial applications. A GP provides both a mean predictor and an estimate of the posterior prediction variance, the latter being used to produce Bayesian credibility intervals. Interpreting these intervals relies on the Gaussianity of the simulation model as well as on well-specified priors, assumptions that are not always appropriate. We propose to address this issue with the help of conformal prediction. In the present work, a method for building adaptive cross-conformal prediction intervals is proposed by weighting the non-conformity score with the posterior standard deviation of the GP. The resulting conformal prediction intervals exhibit a level of adaptivity akin to Bayesian credibility sets and display a significant correlation with the local approximation error of the surrogate model, while being free from the underlying model assumptions and having frequentist coverage guarantees. These estimators can thus be used for evaluating the quality of a GP surrogate model and can assist a decision-maker in the choice of the best prior for the specific application of the GP. The performance of the method is illustrated through a panel of numerical examples based on various reference databases. Moreover, the potential applicability of the method is demonstrated in the context of surrogate modeling of an expensive-to-evaluate simulator of the clogging phenomenon in steam generators of nuclear reactors.
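
To make the idea of a standard-deviation-weighted non-conformity score concrete, here is a minimal sketch of a jackknife+ style interval built on a GP. It is not the authors' implementation: the RBF kernel, its fixed hyperparameters, the noise level, the synthetic data, and the function names `rbf_kernel`, `gp_posterior`, and `jackknife_plus_gp` are all assumptions made for illustration. Each leave-one-out residual is divided by the leave-one-out posterior standard deviation, then rescaled by the standard deviation at the new point.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.3, variance=1.0):
    """Squared-exponential kernel between two 1-D input arrays (assumed kernel)."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_train, y_train, x_test, noise=1e-2):
    """Posterior mean and predictive std of a zero-mean GP with fixed hyperparameters."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_star = rbf_kernel(x_test, x_train)
    chol = np.linalg.cholesky(K)
    weights = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = K_star @ weights
    v = np.linalg.solve(chol, K_star.T)
    prior_var = rbf_kernel(x_test, x_test).diagonal()
    var = prior_var - np.sum(v**2, axis=0) + noise  # includes observation noise
    return mean, np.sqrt(np.maximum(var, 1e-12))

def jackknife_plus_gp(x, y, x_new, alpha=0.1):
    """Jackknife+ interval with residuals weighted by the leave-one-out GP std."""
    n, m = len(x), len(x_new)
    lower_cands = np.empty((n, m))
    upper_cands = np.empty((n, m))
    for i in range(n):
        keep = np.arange(n) != i
        # Evaluate the leave-one-out GP at the held-out point and at the new points.
        x_eval = np.concatenate([x[i:i + 1], x_new])
        mean, std = gp_posterior(x[keep], y[keep], x_eval)
        score = abs(y[i] - mean[0]) / std[0]  # std-weighted non-conformity score
        lower_cands[i] = mean[1:] - score * std[1:]
        upper_cands[i] = mean[1:] + score * std[1:]
    # Jackknife+ order statistics of the candidate endpoints.
    k_lo = int(np.floor(alpha * (n + 1)))
    k_hi = int(np.ceil((1 - alpha) * (n + 1)))
    lower = np.sort(lower_cands, axis=0)[k_lo - 1] if k_lo >= 1 else np.full(m, -np.inf)
    upper = np.sort(upper_cands, axis=0)[k_hi - 1] if k_hi <= n else np.full(m, np.inf)
    return lower, upper

# Synthetic example with heavy-tailed (non-Gaussian) noise; the data are made up.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=40)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_t(df=3, size=40)
x_new = np.linspace(0.0, 1.0, 25)
lower, upper = jackknife_plus_gp(x, y, x_new, alpha=0.1)
print(np.round(upper - lower, 3))  # interval widths vary with the local GP std
```

Because the score is rescaled by the posterior standard deviation at each new point, the interval width changes across the input space instead of being constant, which is the adaptivity the abstract refers to. Refitting the GP for every left-out point is the costly part of this sketch; it is kept explicit here for readability.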

Paper Structure

This paper contains 29 sections, 2 theorems, 60 equations, 8 figures, 13 tables.

Key Result

Theorem 1

Assume $\mathcal{D}_{n}$ is exchangeable. For a new point $X_{n+1}\in\mathcal{X}$ and a coverage level $1 - \alpha\in(0,1)$:

Figures (8)

  • Figure 1: GP regression metamodeling illustration. The data obtained from the numerical code is modeled with a prior GP and is then conditioned by the data. In the absence of noise, the posterior process interpolates the data and produces credibility intervals.
  • Figure 2: Illustration of the different cross-CP methods.
  • Figure 3: Boxplots of the bootstrapped Spearman correlations obtained for the different methods used to regress the CPU dataset.
  • Figure 4: Boxplots of the bootstrapped Spearman correlations obtained for the different methods used to regress the noisy Morokoff & Caflisch function.
  • Figure 5: Boxplots of the Spearman correlations obtained for the different methods used to regress the THYC-Puffer-DEPOTHYC industrial testcase.
  • ...and 3 more figures
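
Figures 3 to 5 report boxplots of bootstrapped Spearman correlations. As a rough illustration of how such a statistic could be computed, the sketch below resamples pairs of interval widths and absolute surrogate errors and computes a rank correlation on each resample. The arrays `widths` and `abs_err` are synthetic placeholders, the helper names are hypothetical, and the exact bootstrap protocol used in the paper is not given in this summary, so this should be read as one plausible setup only.

```python
import numpy as np

def average_ranks(a):
    """Ranks starting at 1, with tied values sharing their average rank."""
    _, inverse, counts = np.unique(a, return_inverse=True, return_counts=True)
    last = np.cumsum(counts)      # last rank position of each group of ties
    first = last - counts + 1     # first rank position of each group of ties
    return ((first + last) / 2.0)[inverse]

def spearman(a, b):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return np.corrcoef(average_ranks(a), average_ranks(b))[0, 1]

def bootstrap_spearman(widths, abs_err, n_boot=200, seed=0):
    """Spearman correlation recomputed on resamples drawn with replacement."""
    rng = np.random.default_rng(seed)
    n = len(widths)
    corrs = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample (width, error) pairs jointly
        corrs[b] = spearman(widths[idx], abs_err[idx])
    return corrs

# Synthetic placeholder data: errors whose scale varies across test points,
# and interval widths proportional to that scale.
rng = np.random.default_rng(1)
scale = rng.uniform(0.1, 1.0, size=200)
abs_err = np.abs(rng.normal(0.0, scale))
widths = 3.3 * scale
corrs = bootstrap_spearman(widths, abs_err)
print(np.round(np.percentile(corrs, [25, 50, 75]), 3))  # quartiles for a boxplot
```

Average ranks are used because resampling with replacement duplicates points, which creates ties. A correlation near zero would indicate intervals whose width carries little information about where the surrogate is locally inaccurate.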

Theorems & Definitions (4)

  • Theorem 1
  • Theorem 2
  • proof
  • proof