A General Framework for User-Guided Bayesian Optimization

Carl Hvarfner, Frank Hutter, Luigi Nardi

TL;DR

This work addresses the inefficiency of Bayesian optimization when domain experts hold rich prior beliefs that go beyond kernel structure. It introduces ColaBO, a Bayesian-principled framework that injects user beliefs about function properties via a belief-weighted prior $p(f|\rho)$ and couples it with Monte Carlo acquisition functions such as EI and MES. Across synthetic benchmarks and real-world hyperparameter tuning tasks, the approach accelerates optimization when priors are accurate and remains robust when priors mislead, at the price of increased computational cost. By supporting priors over the optimizer, the optimal value, and preference relations within a general Monte Carlo-based BO setting, ColaBO broadens the practical applicability of Bayesian optimization to knowledge-rich domains.

Abstract

The optimization of expensive-to-evaluate black-box functions is prevalent in various scientific disciplines. Bayesian optimization is an automatic, general and sample-efficient method to solve these problems with minimal knowledge of the underlying function dynamics. However, the ability of Bayesian optimization to incorporate prior knowledge or beliefs about the function at hand in order to accelerate the optimization is limited, which reduces its appeal for knowledgeable practitioners with tight budgets. To allow domain experts to customize the optimization routine, we propose ColaBO, the first Bayesian-principled framework for incorporating prior beliefs beyond the typical kernel structure, such as the likely location of the optimizer or the optimal value. The generality of ColaBO makes it applicable across different Monte Carlo acquisition functions and types of user beliefs. We empirically demonstrate ColaBO's ability to substantially accelerate optimization when the prior information is accurate, and to retain approximately default performance when it is misleading.

Paper Structure

This paper contains 30 sections, 13 equations, 11 figures, 3 tables, 1 algorithm.

Figures (11)

  • Figure 1: Three different ColaBO priors: (left) Prior over the optimum $\bm{x}_*$, and the induced change in the GP for an optimum located in the green region. (middle) Prior over the optimal value, $f^* < 0.8$. (right) Prior over preference relations $f(\bm{x}_1) \geq f(\bm{x}_2)$ for five preferences (green arrows), e.g. $f(0.0) \geq f(0.1) \geq f(0.2)$.
  • Figure 2: (Top left) Draws from the prior $p(f)$ (light blue) and the belief-weighted prior $p(f|\rho)$, whose members are likely to have their optimum within the green region. (Top right) Pathwise updated draws based on observed data. As the green region is distant from the observed data, samples are almost unaffected by the data in this region. (Bottom left) Exact mean and standard deviation ($\mu_{\bm{x}}, \sigma_{\bm{x}}$) of $p(f)$ and estimated mean and standard deviation of $p(f|\rho)$. (Bottom right) Exact $p(f|\mathcal{D})$ and estimated $p(f|\rho, \mathcal{D})$. As $p(f|\rho)$ consists of functions whose optimum is located within the green region, the resulting model has a higher mean and lower variance within this region. Moreover, $p(f|\rho)$ globally displays lower upside variance compared to the vanilla GP.
  • Figure 3: (Top) Draws from $p(f|\mathcal{D})$ (light blue) and $p(f|\rho, \mathcal{D})$ with a prior $\rho$ located in the green region. (Bottom) Vanilla MC-EI and ColaBO MC-EI, the latter computed from sample draws from $p(f|\rho, \mathcal{D})$.
  • Figure 4: Mean and $1/4$ standard deviation of MC-induced errors of ColaBO-LogEI relative to vanilla LogEI, as measured by the distance to the $\mathop{\mathrm{arg\,max}}$ of the acquisition function on Hartmann (3D) on 10 randomly sampled points for 40 seeds.
  • Figure 5: Performance on synthetic functions with well-located priors. Both ColaBO-LogEI and ColaBO-MES offer drastic speed-ups over their vanilla variants, and offer similar performance to $\pi$BO. The ranking of ColaBO acquisition functions is generally consistent with that of their respective vanilla variants. This is most prominent on Rosenbrock (6D), where ColaBO-MES struggles similarly to vanilla MES.
  • ...and 6 more figures
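
The construction described in the captions of Figures 2 and 3 (prior draws, belief weighting, pathwise update, Monte Carlo EI) can be illustrated with a small one-dimensional sketch. This is not the authors' implementation. It assumes an RBF kernel on a fixed grid, near-noiseless observations, and a user belief that scores each prior draw by the location of its maximizer. The names `rbf_kernel`, `belief_weighted_mc_ei`, and `belief` are hypothetical and introduced only for illustration.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel between two 1D input arrays."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)


def belief_weighted_mc_ei(x_grid, x_obs, y_obs, belief,
                          n_samples=1000, jitter=1e-6, seed=0):
    """Return (vanilla MC-EI, belief-weighted MC-EI) on x_grid (maximization).

    `belief` maps maximizer locations to non-negative scores. It is an
    assumed stand-in for a user belief over the optimizer.
    """
    rng = np.random.default_rng(seed)
    n_grid = len(x_grid)

    # 1) Draw joint prior samples at the grid and the observed inputs.
    x_all = np.concatenate([x_grid, x_obs])
    K = rbf_kernel(x_all, x_all) + jitter * np.eye(len(x_all))
    chol = np.linalg.cholesky(K)
    draws = (chol @ rng.standard_normal((len(x_all), n_samples))).T
    f_grid, f_obs = draws[:, :n_grid], draws[:, n_grid:]

    # 2) Weight each prior draw by the belief at its maximizer on the grid.
    weights = belief(x_grid[np.argmax(f_grid, axis=1)])
    weights = weights / weights.sum()

    # 3) Pathwise (Matheron) update: correct each draw with the residual
    #    between the data and the draw's values at the observed inputs.
    K_oo = rbf_kernel(x_obs, x_obs) + jitter * np.eye(len(x_obs))
    K_go = rbf_kernel(x_grid, x_obs)
    resid = y_obs[None, :] - f_obs
    post = f_grid + np.linalg.solve(K_oo, resid.T).T @ K_go.T

    # 4) Monte Carlo EI: uniform average versus belief-weighted average.
    improvement = np.maximum(post - y_obs.max(), 0.0)
    ei_vanilla = improvement.mean(axis=0)
    ei_belief = (weights[:, None] * improvement).sum(axis=0)
    return ei_vanilla, ei_belief


# Usage: a Gaussian-shaped belief that the optimizer lies near x = 0.8.
x_grid = np.linspace(0.0, 1.0, 50)
x_obs = np.array([0.1, 0.3])
y_obs = np.array([0.2, -0.1])
belief = lambda x: np.exp(-0.5 * ((x - 0.8) / 0.1) ** 2) + 1e-12
ei_vanilla, ei_belief = belief_weighted_mc_ei(x_grid, x_obs, y_obs, belief)
```

With a constant belief the weights are uniform and the two estimates coincide, which mirrors the idea of falling back to default behavior when the user expresses no preference. Weighting a finite set of prior draws is a simplification, and the estimate degrades when few draws receive appreciable weight.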

Theorems & Definitions (1)

  • Definition 3.1: User Belief over Functions