Optimal Initialization of Batch Bayesian Optimization

Jiuge Ren, David Sweet

TL;DR

This work addresses the efficiency of batch Bayesian optimization in low-budget settings by introducing Minimal Terminal Variance (MTV), an acquisition function that designs both the initial batch and all subsequent batches by optimization. MTV applies the I-Optimality criterion, weighted by the probability that a point is optimal, $p_*(x)$: it minimizes a Monte Carlo approximation of the integrated terminal GP variance, $\sigma^2(x|x_a)$, with respect to the batch arms $x_a$, using fantasized GPs for fast computation. The method has three components: sampling from $p_*(x)$ with a problem-specific MCMC, minimizing the Monte Carlo approximation of the MTV integral, and careful initialization of the acquisition function optimizer. Empirical results on standard test functions and reinforcement learning simulators show that MTV outperforms common initialization and batch-design baselines across dimensions and problem types. MTV also remains compatible with ensemble approaches, offering a practical route to more informative experiments in both field and simulation contexts.

Abstract

Field experiments and computer simulations are effective but time-consuming methods of measuring the quality of engineered systems at different settings. To reduce the total time required, experimenters may employ Bayesian optimization, which is parsimonious with measurements, and take measurements of multiple settings simultaneously, in a batch. In practice, experimenters use very few batches, so it is imperative that each batch be as informative as possible. Typically, the initial batch in Batch Bayesian Optimization (BBO) is constructed from a quasi-random sample of setting values. We propose a batch-design acquisition function, Minimal Terminal Variance (MTV), that designs a batch by optimization rather than random sampling. MTV adapts a design criterion from Design of Experiments, called I-Optimality, which minimizes the variance of the post-evaluation estimates of quality, integrated over the entire space of settings. MTV weights the integral by the probability that a setting is optimal, which lets it design not only the initial batch but also all subsequent batches. Applicability to both initialization and subsequent batches is novel among acquisition functions. Numerical experiments on test functions and simulators show that MTV compares favorably to other BBO methods.
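
The weighted I-Optimality criterion described above can be written compactly. The following is a sketch assembled from the summary's own symbols ($p_*(x)$, $\sigma^2(x|x_a)$, arms $x_a$); the name $\mathrm{MTV}(x_a)$ for the objective and the sample count $N$ are notation assumed here, not taken from the paper.

```latex
\mathrm{MTV}(x_a)
  \;=\; \int p_*(x)\,\sigma^2(x \mid x_a)\,dx
  \;\approx\; \frac{1}{N}\sum_{i=1}^{N} \sigma^2(x_i \mid x_a),
  \qquad x_i \sim p_*(x),
\qquad
x_a^{\text{batch}} \;=\; \arg\min_{x_a}\,\mathrm{MTV}(x_a).
```

Setting $p_*(x)$ to a constant recovers the unweighted integrated variance that I-Optimality minimizes.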

Paper Structure (15 sections, 8 equations, 8 figures, 2 algorithms)

Figures (8)

  • Figure 1: The top row shows the construction of a single MTV batch: (a) Find the maximizer (marked X) of $\mu(x)$, the GP mean. Contour lines show $\mu(x)$. (b) Draw $10B=40$ points, $x_i \sim p_*(x)$, via the paper's MCMC sampling algorithm. (c) Jointly determine the $B=4$ arms, $x_a$, that minimize (over $x_a$) the mean (over $x_i$) terminal GP variance, $\sigma^2(x|x_a)$, via the paper's MTV algorithm. The bottom row shows the batches of arms chosen at: (d) round zero, initialization, and improvement rounds (e) one and (f) two (the round in the top row). Contour lines show values of the true function, $f(x)$.
  • Figure 2: Comparison of MTV to other batch Bayesian optimization methods on the 3D Ackley function. (a) Results are averaged over 30 runs. Each run randomly distorts the Ackley function to mitigate center bias (see the paper's performance-comparison section). (b) Results are range-normalized across optimization methods before averaging. (c) The final-round results are summarized.
  • Figure 3: Comparison of MTV to several other acquisition functions and two baselines (Sobol and random).
  • Figure 4: Optimizing a 3-dimensional controller for the mountain car RL simulator with 5 arms/batch. We run 100 replications to calculate error bars. (a) The y axis shows the return (total episode reward). (b) The y axis is range-normalized across optimization methods. (c) The range-normalized value of the final round.
  • Figure 5: Optimizing a 34-dimensional controller for a one-legged hopping robot RL simulator with 30 arms/batch. (a) Return of episode. (b) Range-normalized return. (c) The range-normalized value of the final round.
  • ...and 3 more figures
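
The batch construction outlined in the Figure 1 caption (draw points $x_i \sim p_*(x)$, then jointly choose arms $x_a$ that minimize the mean terminal variance) can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes a unit-variance RBF kernel with a fixed lengthscale on the unit cube, swaps in Thompson-style sampling over random candidates for the paper's MCMC, skips the GP-mean maximizer step, and uses a plain random-perturbation search as the optimizer. All function names are hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=0.3):
    """Unit-variance RBF kernel (hyperparameters are assumed, not fitted)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)


def terminal_variance(x_eval, x_cond, noise=1e-6):
    """GP posterior variance at x_eval after conditioning on inputs x_cond.

    The variance depends only on input locations, never on measured values,
    so it can be evaluated for arms that have not been run yet.
    """
    if len(x_cond) == 0:
        return np.ones(len(x_eval))
    K = rbf_kernel(x_cond, x_cond) + noise * np.eye(len(x_cond))
    k = rbf_kernel(x_cond, x_eval)
    var = 1.0 - np.einsum("ij,ij->j", k, np.linalg.solve(K, k))
    return np.clip(var, 0.0, None)


def mtv(x_arms, x_samples, x_obs):
    """Monte Carlo MTV estimate: mean terminal variance over x_i ~ p_*."""
    return terminal_variance(x_samples, np.vstack([x_obs, x_arms])).mean()


def sample_p_star(x_obs, y_obs, n_samples, rng, n_cand=200):
    """Approximate draws from p_* as argmaxes of GP posterior samples.

    Thompson-style stand-in for the paper's MCMC; samples are restricted
    to a random candidate set, so repeats are possible.
    """
    cand = rng.uniform(size=(n_cand, x_obs.shape[1]))
    mean, cov = np.zeros(n_cand), rbf_kernel(cand, cand)
    if len(x_obs) > 0:
        K = rbf_kernel(x_obs, x_obs) + 1e-6 * np.eye(len(x_obs))
        k = rbf_kernel(x_obs, cand)
        sol = np.linalg.solve(K, k)
        mean, cov = sol.T @ y_obs, cov - k.T @ sol
    draws = rng.multivariate_normal(mean, cov, size=n_samples, check_valid="ignore")
    return cand[draws.argmax(axis=1)]


def design_batch(x_samples, x_obs, n_arms, rng, n_iter=300, step=0.1):
    """Jointly perturb all arms and keep proposals that lower the MTV estimate."""
    # Assumed heuristic: start the arms at a random subset of the p_* samples.
    idx = rng.choice(len(x_samples), size=n_arms, replace=False)
    arms = x_samples[idx].copy()
    best = mtv(arms, x_samples, x_obs)
    for _ in range(n_iter):
        prop = np.clip(arms + step * rng.normal(size=arms.shape), 0.0, 1.0)
        val = mtv(prop, x_samples, x_obs)
        if val < best:
            arms, best = prop, val
    return arms, best


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_obs = rng.uniform(size=(4, 2))             # previously measured settings
    y_obs = -((x_obs - 0.5) ** 2).sum(axis=1)    # toy objective values
    x_i = sample_p_star(x_obs, y_obs, n_samples=40, rng=rng)  # 10B points, B = 4
    arms, value = design_batch(x_i, x_obs, n_arms=4, rng=rng)
    print("MTV estimate for the designed batch:", value)
```

Because the objective never needs measurements at the arms, the same routine covers round zero: pass an empty `x_obs` of shape `(0, d)` and an empty `y_obs`, and the samples come from the GP prior.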