Global Optimization of Gaussian processes

Artur M. Schweidtmann, Dominik Bongartz, Daniel Grothe, Tim Kerkenhoff, Xiaopeng Lin, Jaromil Najman, Alexander Mitsos

TL;DR

The paper tackles the challenge of globally optimizing problems with Gaussian process surrogates embedded, a task hampered by large model sizes in full-space formulations. It introduces a reduced-space (RS) formulation that substitutes GP predictions directly into the optimization, dramatically reducing variables and removing most equality constraints, while propagating McCormick relaxations through explicit GP models. A suite of tight relaxations is derived: envelopes for covariance functions (including Matérn with $\nu \in \{1/2, 3/2, 5/2\}$ and SE) and for acquisition functions such as lower confidence bound, probability of improvement, and expected improvement, enabling fast convergence in MAiNGO; these are complemented by modular relaxations for PDFs/CDFs. The method is demonstrated on scaling problems, chance-constrained optimization, and Bayesian optimization, achieving orders-of-magnitude speedups and enabling GP-based global optimization at data sizes that were previously prohibitive; the authors provide an open-source MeLOn toolbox integrated with MAiNGO. Overall, the RS approach substantially expands the practical applicability of deterministic global optimization with GP surrogates, supporting robust design under uncertainty and more efficient Bayesian optimization workflows.
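To make the three acquisition functions named above concrete, here is a minimal Python sketch of their textbook forms for a minimization problem, written as functions of the GP posterior mean and standard deviation. The function names, the exploration weight `kappa`, and the handling of the zero-variance case are illustrative assumptions, not the MeLOn or MAiNGO API, and the paper's exact notation may differ.

```python
import math


def _norm_pdf(z):
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lower_confidence_bound(mean, std, kappa=2.0):
    """LCB = mean - kappa * std; kappa is an assumed exploration weight."""
    return mean - kappa * std


def probability_of_improvement(mean, std, f_min):
    """Probability that the prediction falls below the incumbent f_min."""
    if std == 0.0:
        # Degenerate (deterministic) prediction: improvement is certain or impossible.
        return 1.0 if mean < f_min else 0.0
    return _norm_cdf((f_min - mean) / std)


def expected_improvement(mean, std, f_min):
    """Expected amount by which the prediction improves on f_min."""
    if std == 0.0:
        return max(f_min - mean, 0.0)
    z = (f_min - mean) / std
    return (f_min - mean) * _norm_cdf(z) + std * _norm_pdf(z)
```

Each function depends only on the pair (mean, std), which is why relaxations can be derived once on that two-dimensional domain and then composed with the GP prediction.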

Abstract

Gaussian processes (Kriging) are interpolating data-driven models that are frequently applied in various disciplines. Often, Gaussian processes are trained on datasets and are subsequently embedded as surrogate models in optimization problems. These optimization problems are nonconvex and global optimization is desired. However, previous literature observed computational burdens limiting deterministic global optimization to Gaussian processes trained on few data points. We propose a reduced-space formulation for deterministic global optimization with trained Gaussian processes embedded. For optimization, the branch-and-bound solver branches only on the degrees of freedom and McCormick relaxations are propagated through explicit Gaussian process models. The approach also leads to significantly smaller and computationally cheaper subproblems for lower and upper bounding. To further accelerate convergence, we derive envelopes of common covariance functions for GPs and tight relaxations of acquisition functions used in Bayesian optimization including expected improvement, probability of improvement, and lower confidence bound. In total, we reduce computational time by orders of magnitude compared to state-of-the-art methods, thus overcoming previous computational burdens. We demonstrate the performance and scaling of the proposed method and apply it to Bayesian optimization with global optimization of the acquisition function and chance-constrained programming. The Gaussian process models, acquisition functions, and training scripts are available open-source within the "MeLOn - Machine Learning Models for Optimization" toolbox (https://git.rwth-aachen.de/avt.svt/public/MeLOn).
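The reduced-space idea can be illustrated with a short NumPy sketch: once a GP is trained, its posterior mean and variance are explicit functions of the query point alone, so only that point needs to be an optimization variable. The helper names (`matern52`, `fit_gp`, `predict`), the fixed hyperparameters, and the small `noise_var` jitter are assumptions made for illustration. This is not the MeLOn implementation, and it evaluates the model at points rather than propagating relaxations.

```python
import numpy as np


def matern52(a, b, length_scale=1.0, signal_var=1.0):
    """Matern covariance with nu = 5/2 between the rows of a and the rows of b."""
    d = np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)) / length_scale
    s = np.sqrt(5.0) * d
    return signal_var * (1.0 + s + s**2 / 3.0) * np.exp(-s)


def fit_gp(X, y, noise_var=1e-8):
    """Precompute every quantity that does not depend on the query point."""
    K = matern52(X, X) + noise_var * np.eye(len(X))
    L = np.linalg.cholesky(K)  # K = L @ L.T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # alpha = K^{-1} y
    return {"X": X, "L": L, "alpha": alpha}


def predict(gp, x):
    """Posterior mean and variance written as explicit functions of x only."""
    x2 = np.atleast_2d(x)
    k = matern52(x2, gp["X"])[0]  # covariances between x and the training points
    mean = k @ gp["alpha"]
    v = np.linalg.solve(gp["L"], k)
    prior_var = matern52(x2, x2)[0, 0]
    var = max(prior_var - v @ v, 0.0)  # clip tiny negative round-off
    return mean, var


# Usage: three training points of a 1-D test function.
X_train = np.array([[0.0], [1.0], [2.0]])
y_train = np.sin(X_train[:, 0])
gp = fit_gp(X_train, y_train)
mean, var = predict(gp, np.array([0.5]))
```

In a full-space formulation, intermediate quantities such as `k`, `mean`, and `var` would each become optimization variables tied together by equality constraints. Here they are plain intermediate expressions of `x`, which is the structure that lets a branch-and-bound solver branch only on the degrees of freedom.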

Paper Structure

This paper contains 27 sections, 31 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Illustration of the envelope of the Gaussian PDF
  • Figure 2: Graph of the probability of improvement acquisition function (PI) as in Equation \ref{eq:ProbabilityOfImprovement} for $f_\textup{min}=0$ along with the developed convex and concave relaxations. (a) On the interval $[-2,2]\times[0,10]$, the relaxations are constructed on the basis of monotonicity properties of PI. (b) On the interval $[1,2]\times[0,1]$, the relaxations are constructed on the basis of componentwise convexity properties via the methods of Meyer and Floudas (2005) and Najman et al. (2019)
  • Figure 3: Comparison of the total CPU time for optimization, i.e., the sum of preprocessing time and B&B time, of GPs with $k_{\nu=5/2}$ covariance function. The plots show the median of 50 repetitions of data generation, GP training, and optimization. Note that the number of data points is incremented in steps of 10 and the lines interpolate between these values
  • Figure 4: Comparison of the number of B&B iterations for optimization problems with embedded GPs with $k_{\nu=5/2}$ covariance function. The plots show the median of 50 repetitions of data generation, GP training, and optimization. Note that the number of data points is incremented in steps of 10 and the lines interpolate between these values