Relaxed Gaussian process interpolation: a goal-oriented approach to Bayesian optimization

Sébastien Petit, Julien Bect, Emmanuel Vazquez

TL;DR

The paper introduces relaxed Gaussian process interpolation (reGP), a goal-oriented approach that relaxes interpolation constraints outside a region of interest to yield sharper predictive distributions where low function values matter, notably in Bayesian optimization. It formalizes reGP as a Gaussian predictive model conditioned on the mode of relaxed constraints, and couples hyperparameter estimation with relaxation via joint likelihood optimization, including extensions to noisy data. A truncated CRPS-based method (tCRPS) is proposed to automatically select the relaxation range, and theoretical convergence guarantees are provided for EI-based optimization when the target function lies in the RKHS of the underlying covariance. Empirical benchmarks show that reGP can substantially improve optimization performance on challenging functions, at the cost of additional computation, and the framework is extended to noisy settings and UCB-based strategies with open-source implementations available. The work highlights a practical, theory-backed pathway to goal-oriented probabilistic modeling that prioritizes predictive quality in regions of interest for improved sequential decision making.

Abstract

This work presents a new procedure for obtaining predictive distributions in the context of Gaussian process (GP) modeling, with a relaxation of the interpolation constraints outside ranges of interest: the mean of the predictive distributions no longer necessarily interpolates the observed values when they are outside ranges of interest, but is simply constrained to remain outside these ranges. This method, called relaxed Gaussian process (reGP) interpolation, provides better predictive distributions in ranges of interest, especially in cases where a stationarity assumption for the GP model is not appropriate. It can be viewed as a goal-oriented method and becomes particularly interesting in Bayesian optimization, for example, for the minimization of an objective function, where good predictive distributions for low function values are important. When the expected improvement criterion and reGP are used for sequentially choosing evaluation points, the convergence of the resulting optimization algorithm is theoretically guaranteed (provided that the function to be optimized lies in the reproducing kernel Hilbert space attached to the known covariance of the underlying Gaussian process). Experiments indicate that using reGP instead of stationary GP models in Bayesian optimization is beneficial.

Paper Structure

This paper contains 43 sections, 28 theorems, 115 equations, 22 figures, 2 tables, 1 algorithm.

Key Result

Proposition 1

(Jones et al., 1998; Vazquez and Bect, 2010) The EI criterion may be written as $\rho_n(x) = \gamma \left( m_n - \mu_n(x), \, \sigma_n^2(x) \right)$ with
$$\gamma(z, \, s) = \begin{cases} \sqrt{s} \, \phi\left( z / \sqrt{s} \right) + z \, \Phi\left( z / \sqrt{s} \right) & \text{if } s > 0, \\ \max(z, \, 0) & \text{if } s = 0, \end{cases}$$
where $\phi$ and $\Phi$ stand for the probability density and cumulative distribution functions of the standard Gaussian distribution. Moreover, the function $\gamma$ is continuous, satisfies $\gamma(z, \, s) > 0$ if $s > 0$, ...
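
As an illustration, here is a minimal sketch of how a criterion of this form could be evaluated numerically. It assumes the standard closed form of expected improvement for $\gamma$ (Gaussian density and distribution function terms when $s > 0$, and $\max(z, 0)$ when $s = 0$), and it assumes that $m_n$ denotes the smallest value observed so far, with $\mu_n(x)$ and $\sigma_n^2(x)$ the posterior mean and variance at $x$. The function names `ei_gamma` and `expected_improvement` are hypothetical and are not taken from the authors' implementation.

```python
import math


def std_normal_pdf(u):
    """Density of the standard Gaussian distribution (phi)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(u):
    """Distribution function of the standard Gaussian distribution (Phi).

    Written with erfc to limit cancellation for very negative u.
    """
    return 0.5 * math.erfc(-u / math.sqrt(2.0))


def ei_gamma(z, s):
    """Assumed closed form of gamma(z, s); s is a variance, so s >= 0."""
    if s < 0.0:
        raise ValueError("s must be non-negative")
    if s == 0.0:
        # No posterior uncertainty: the improvement is deterministic.
        return max(z, 0.0)
    sd = math.sqrt(s)
    u = z / sd
    return sd * std_normal_pdf(u) + z * std_normal_cdf(u)


def expected_improvement(m_n, mu_n_x, sigma2_n_x):
    """rho_n(x) = gamma(m_n - mu_n(x), sigma_n^2(x)) for a minimization problem."""
    return ei_gamma(m_n - mu_n_x, sigma2_n_x)
```

For instance, `expected_improvement(0.0, 0.0, 1.0)` reduces to $\phi(0) = 1/\sqrt{2\pi}$, and the criterion stays strictly positive at a point whose posterior mean is well above the current minimum as long as its posterior variance is positive.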

Figures (22)

  • Figure 1: Left: the Steep function. Right: same illustration with a restricted range on the $y$-axis. The variations on the left tend to overshadow the global minimum on the right.
  • Figure 2: Left: GP fit on the Steep function. Right: same illustration with a restricted range on the $y$-axis. The squares represent the data. The red line represents the posterior mean $\mu_n$ given by the model and the gray envelopes represent the associated uncertainties.
  • Figure 3: Left: prediction of the Steep function with the proposed methodology (black line: relaxation threshold $t$; blue points: relaxed observations). Right: $\mu_n$ versus $f$ (with more observations for illustration purposes). The model interpolates the data below $t$.
  • Figure 4: An example of reGP predictive distribution with $R = \left( - \infty, -1 \right] \cup \left[1, + \infty \right)$ on a function $f$ represented by dashed black lines. The solid black lines represent the relaxation thresholds. The problem \ref{eq:rgpi_problem} was solved only with respect to $\underline{z}$, as the parameters of the (constant) mean and ($\nu=5/2$ Matérn) covariance functions were held fixed for illustration purposes.
  • Figure 5: Illustration of the choice of a relaxation range. The range of interest $Q$ is determined by the threshold $t^{(0)}$. The relaxation range $R$ corresponding to the region above $t$ has been obtained by the procedure described in \ref{sec:choos-relax-set}.
  • ...and 17 more figures

Theorems & Definitions (40)

  • Proposition 1
  • Proposition 2
  • Proposition 3
  • Proposition 4
  • Definition 5: Relaxed-GP predictive distribution; fixed $\mu$ and $k$
  • Definition 6: Relaxed-GP predictive distribution; estimated parameters
  • Remark 7: On minimizing \ref{eq:rgpi_problem} jointly
  • Remark 8: Numerical details
  • Proposition 10
  • Corollary 11
  • ...and 30 more