Practical Efficient Global Optimization is No-regret

Jingyi Wang, Haowei Wang, Nai-Yuan Chiang, Juliane Mueller, Tucker Hartland, Cosmin G. Petra

Abstract

Efficient global optimization (EGO) is one of the most widely used noise-free Bayesian optimization algorithms. It comprises a Gaussian process (GP) surrogate model and the expected improvement (EI) acquisition function. In practice, when EGO is applied, a scalar matrix with a small positive value (also called a nugget or jitter) is usually added to the covariance matrix of the deterministic GP to improve numerical stability. We refer to EGO with a positive nugget as practical EGO. Despite its wide adoption and empirical success, cumulative regret bounds for practical EGO have not yet been established. In this paper, we present the first cumulative regret upper bound for practical EGO. In particular, we show that practical EGO has sublinear cumulative regret bounds and is thus a no-regret algorithm for commonly used kernels, including the squared exponential (SE) and Matérn kernels ($\nu>\frac{1}{2}$). Moreover, we analyze the effect of the nugget on the regret bound and discuss the theoretical implications for its choice. Numerical experiments are conducted to support and validate our findings.
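
As a rough illustration of the setup the abstract describes, the following sketch builds a zero-mean GP with an SE kernel, adds a nugget $\epsilon$ to the diagonal of the covariance matrix, and evaluates EI on a grid. It is a minimal sketch and not the authors' implementation: the function names, the toy objective, the unit lengthscale, and the minimization convention for EI are all assumptions made for this example.

```python
import math
import numpy as np

def se_kernel(A, B, lengthscale=1.0):
    """Squared exponential kernel k(x, x') = exp(-||x - x'||^2 / (2 l^2))."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def gp_posterior(X, y, Xs, eps=1e-6, lengthscale=1.0):
    """Posterior mean and std of a zero-mean GP with nugget eps (assumed setup)."""
    K = se_kernel(X, X, lengthscale) + eps * np.eye(len(X))  # K + eps * I
    L = np.linalg.cholesky(K)
    Ks = se_kernel(X, Xs, lengthscale)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = 1.0 - np.sum(v**2, axis=0)  # k(x, x) = 1 for the SE kernel
    return mu, np.sqrt(np.maximum(var, 0.0))

# Standard normal CDF via erfc, which stays accurate in the lower tail.
_norm_cdf = np.vectorize(lambda z: 0.5 * math.erfc(-z / math.sqrt(2.0)))

def expected_improvement(mu, sigma, y_best):
    """EI for minimization; falls back to max(y_best - mu, 0) where sigma == 0."""
    diff = y_best - mu
    safe = np.where(sigma > 0, sigma, 1.0)
    z = diff / safe
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    ei = diff * _norm_cdf(z) + sigma * pdf
    return np.where(sigma > 0, ei, np.maximum(diff, 0.0))

# Hypothetical 1-D toy objective and a handful of noise-free samples.
f = lambda x: np.sin(3.0 * x[:, 0]) + x[:, 0] ** 2
X = np.array([[-1.0], [0.0], [0.5], [1.0]])
y = f(X)
Xs = np.linspace(-1.5, 1.5, 301)[:, None]

mu, sigma = gp_posterior(X, y, Xs, eps=1e-6)
ei = expected_improvement(mu, sigma, y.min())
x_next = Xs[np.argmax(ei)]  # next sample chosen by maximizing EI on the grid
```

With a positive nugget the posterior standard deviation at already sampled points is small but not exactly zero, which is the practical difference from the exact noise-free GP. Re-running `gp_posterior` with different `eps` values is one simple way to see that effect.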

Paper Structure

This paper contains 20 sections, 22 theorems, 114 equations, 6 figures, 1 table, 1 algorithm.

Key Result

Lemma 3.2

The practical EGO generates the instantaneous regret bound where $c_{B\epsilon}(\epsilon,t)= \log^{1/2}\left(\frac{t+\epsilon}{2\pi \epsilon \tau^2(-B)}\right)$, $c_B=\frac{\tau(B)}{\tau(-B)}$ and $c_{B1}=\max\left\{\frac{\tau(B)}{\tau(-B)}-1,0\right\}$.
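
The three constants named in Lemma 3.2 can be evaluated numerically. The sketch below assumes that $\tau(z) = z\Phi(z) + \phi(z)$, with $\Phi$ and $\phi$ the standard normal CDF and PDF, which is the usual definition in EI analyses but is not stated in this excerpt. It also assumes $B \ge 0$; the function names and the sample values of $B$, $\epsilon$, and $t$ are hypothetical.

```python
import math

def tau(z):
    """tau(z) = z * Phi(z) + phi(z); an assumed definition, not given in the excerpt."""
    cdf = 0.5 * math.erfc(-z / math.sqrt(2.0))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return z * cdf + pdf

def lemma_constants(B, eps, t):
    """Return (c_{B eps}(eps, t), c_B, c_{B1}) as written in Lemma 3.2."""
    c_B_eps = math.sqrt(math.log((t + eps) / (2.0 * math.pi * eps * tau(-B) ** 2)))
    c_B = tau(B) / tau(-B)
    c_B1 = max(c_B - 1.0, 0.0)
    return c_B_eps, c_B, c_B1

# Hypothetical values: B = 1, t = 50, and two nugget sizes.
for eps in (1e-2, 1e-6):
    print(eps, lemma_constants(B=1.0, eps=eps, t=50))
```

Under the assumed definition, $\tau(z) - \tau(-z) = z$, so $c_B \ge 1$ whenever $B \ge 0$ and $c_{B1} = c_B - 1$. The constant $c_{B\epsilon}(\epsilon,t)$ grows as $\epsilon$ shrinks, which is one way the nugget enters the bound.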

Figures (6)

  • Figure 1: Illustrative example of EI contour of the Branin function with 50 samples. From left to right: contour plots for $\epsilon=10^{-2}$, $\epsilon=10^{-6}$, $\epsilon=10^{-10}$, and no nugget. The maximum EI$_{50}$ value from left to right: $2.53\times 10^{-2}$, $1.90\times 10^{-4}$, $4.72\times 10^{-5}$, and $4.72\times 10^{-5}$.
  • Figure 2: Cumulative regret upper bound with nugget $\epsilon$ at different $T$ and selected constants for SE kernel. The case "other" means neither the conditions for case $1$ nor those for case $2$ are satisfied.
  • Figure 3: Median average cumulative regret bound for practical EGO with $\epsilon$ values $10^{-2}$, $10^{-4}$, and $10^{-6}$ for five examples. From top to bottom: Rosenbrock, Six-hump camel, Hartmann6, Branin, Michalewicz functions.
  • Figure 4: Cumulative regret upper bound with nugget $\epsilon$ at different $T$ and selected constants for Matérn kernel ($\nu=\frac{1}{2}$). The case "other" means neither the conditions for case $1$ nor those for case $2$ are satisfied.
  • Figure 5: Illustrative example of EI contour of the Branin function with 50 random samples. From left to right: contour plots for $\epsilon=10^{-2}$, $\epsilon=10^{-6}$, $\epsilon=10^{-10}$, and no nugget. The maximum EI$_{50}$ value from left to right: $0.40515$, $0.40085$, $0.40085$, and $0.40085$. The maximum level of the colorbar corresponds to the maximum of EI$_{50}$ of each plot.
  • ...and 1 more figure

Theorems & Definitions (44)

  • Lemma 3.2
  • Remark 3.3: Use of $\epsilon$ in Lemma 3.2
  • Remark 3.4: Exploitation term in instantaneous regret bound
  • Lemma 3.5
  • Theorem 3.6
  • Remark 3.7
  • Remark 4.1
  • Theorem 4.2
  • Theorem 4.3
  • Remark 4.4
  • ...and 34 more