Posterior Sampling-Based Bayesian Optimization with Tighter Bayesian Regret Bounds

Shion Takeno, Yu Inatsu, Masayuki Karasuyama, Ichiro Takeuchi

TL;DR

This work studies Bayesian optimization (BO) under a Gaussian process prior and derives tighter Bayesian cumulative regret (BCR) bounds for posterior-sampling-based acquisition functions. It first shows that Thompson sampling (TS) achieves the tighter, sub-linear BCR bound previously shown for the randomized GP-UCB variant IRGP-UCB. It then extends the analysis to PIMS (probability of improvement from the maximum of a sample path), a hyperparameter-free acquisition function, and shows that PIMS attains the same BCR rate as TS and IRGP-UCB while mitigating two practical issues: the manual hyperparameter tuning of GP-UCB and the over-exploration of TS. The regret analysis covers both finite and continuous input domains, including a discretization-free approach for continuous spaces. Experiments on synthetic functions, benchmark functions, and real-world emulators indicate that PIMS performs robustly across settings. Overall, the results suggest that randomized acquisition with data-driven confidence parameters can combine strong theoretical guarantees with improved practical performance, which is relevant to hyperparameter-free BO over large or continuous domains.

Abstract

Among various acquisition functions (AFs) in Bayesian optimization (BO), Gaussian process upper confidence bound (GP-UCB) and Thompson sampling (TS) are well-known options with established theoretical properties regarding Bayesian cumulative regret (BCR). Recently, it has been shown that a randomized variant of GP-UCB achieves a tighter BCR bound compared with GP-UCB, which we call the tighter BCR bound for brevity. Inspired by this study, this paper first shows that TS achieves the tighter BCR bound. On the other hand, GP-UCB and TS often practically suffer from manual hyperparameter tuning and over-exploration issues, respectively. Therefore, we analyze yet another AF called a probability of improvement from the maximum of a sample path (PIMS). We show that PIMS achieves the tighter BCR bound and avoids the hyperparameter tuning, unlike GP-UCB. Furthermore, we demonstrate a wide range of experiments, focusing on the effectiveness of PIMS that mitigates the practical issues of GP-UCB and TS.
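To make the difference between TS and PIMS concrete, here is a minimal sketch on a finite candidate set. It assumes the standard reading of the two rules: TS queries the maximizer of one posterior sample path, while PIMS uses the maximum of that sample path as the threshold for probability of improvement, which amounts to minimizing (g* - mu(x)) / sigma(x). The function names, the RBF kernel, the lengthscale, the noise level, and the toy objective are all hypothetical choices made for illustration, not the paper's implementation.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=0.2):
    # Squared-exponential kernel between the rows of A and B (assumed kernel).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(X, y, Xc, noise=1e-3):
    # Posterior mean and covariance of a zero-mean GP on the candidates Xc.
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    Ks = rbf_kernel(X, Xc)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    V = np.linalg.solve(L, Ks)
    return Ks.T @ alpha, rbf_kernel(Xc, Xc) - V.T @ V

def sample_path(mu, cov, rng):
    # One joint posterior sample on the finite candidate set.
    jitter = 1e-8 * np.eye(len(mu))
    return rng.multivariate_normal(mu, cov + jitter, check_valid="ignore")

def ts_select(mu, cov, rng):
    # Thompson sampling: query the maximizer of the sampled path.
    return int(np.argmax(sample_path(mu, cov, rng)))

def pims_index(mu, sigma, g_star):
    # Maximizing PI with threshold g_star is the same as minimizing
    # the standardized gap (g_star - mu) / sigma.
    return int(np.argmin((g_star - mu) / sigma))

def pims_select(mu, cov, rng):
    # PIMS: the threshold is the maximum of one posterior sample path.
    g_star = sample_path(mu, cov, rng).max()
    sigma = np.sqrt(np.clip(np.diag(cov), 1e-12, None))
    return pims_index(mu, sigma, g_star)

def objective(X):
    # Toy 1-D objective, purely illustrative.
    return np.sin(6.0 * X[:, 0])

def run_bo(select, f, Xc, n_iter=15, seed=0):
    # Sequentially pick candidate indices with the given selection rule.
    rng = np.random.default_rng(seed)
    idx = [int(rng.integers(len(Xc)))]
    for _ in range(n_iter):
        X = Xc[idx]
        mu, cov = gp_posterior(X, f(X), Xc)
        idx.append(select(mu, cov, rng))
    return idx
```

Calling `run_bo(pims_select, objective, np.linspace(0, 1, 50)[:, None])` returns the sequence of queried candidate indices; swapping in `ts_select` gives the TS counterpart. Neither rule takes a confidence parameter, although this sketch still fixes the kernel lengthscale and noise by hand.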

Paper Structure

This paper contains 26 sections, 20 theorems, 95 equations, 3 figures, 2 tables, 2 algorithms.

Key Result

Lemma 2.1

BSR can be bounded from above as $\overline{\rm BSR}_T \leq \sum_{t=1}^T \overline{\rm BSR}_t / T \leq {\rm BCR}_T / T$ and ${\rm BSR}_T \leq {\rm BCR}_T / T$.
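A short sketch of why a bound of the form ${\rm BSR}_T \leq {\rm BCR}_T / T$ is natural. The definitions in the comments are assumptions, since this page does not restate the paper's definitions: simple regret is taken at the best queried point. The argument is only that a minimum never exceeds an average. The $\overline{\rm BSR}$ variant in the lemma is not covered by this sketch.

```latex
% Assumed definitions (not taken from this page):
%   r_t   = f(x^*) - f(x_t)                  instantaneous regret
%   BCR_T = E[ \sum_{t=1}^T r_t ]            Bayesian cumulative regret
%   BSR_T = E[ \min_{1 \le t \le T} r_t ]    Bayesian simple regret at the best queried point
\begin{align*}
{\rm BSR}_T
  = \mathbb{E}\Bigl[\min_{1 \le t \le T} r_t\Bigr]
  \le \mathbb{E}\Bigl[\frac{1}{T}\sum_{t=1}^{T} r_t\Bigr]
  = \frac{{\rm BCR}_T}{T}.
\end{align*}
```

Under these assumptions, any sub-linear bound on ${\rm BCR}_T$ implies that ${\rm BSR}_T$ vanishes as $T$ grows.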

Figures (3)

  • Figure 1: The results on synthetic function experiments. The top row shows the average and standard error of the simple regret. In all the settings (a-c), we can confirm that several BO methods, including PIMS (blue), achieve the best convergence. In contrast, other theoretically guaranteed methods (GP-UCB, IRGP-UCB, TS) deteriorated in a certain setting. Therefore, we can observe that PIMS flexibly deals with various problem settings while keeping the theoretical guarantee and has superior or comparable performance compared with baselines, including heuristic methods without the theoretical guarantee, such as EI, MES, and JES. The bottom row represents the expectation and quantiles of $\beta_t^{1/2}$, $\zeta_t^{1/2}$, and $\xi_t$.
  • Figure 2: Average and standard error of the simple regret in benchmark function experiments.
  • Figure 3: Average and standard error of the obtained best value in real-world emulator experiments.

Theorems & Definitions (35)

  • Lemma 2.1
  • Definition 2.1: Maximum information gain
  • Lemma 3.1: Lemma 4.1 in Takeno et al. (2023)
  • Lemma 3.2
  • Theorem 3.1
  • Proof: short proof
  • Lemma 3.3: Lipschitz constants for posterior standard deviation
  • Theorem 3.2
  • Lemma 4.1
  • Theorem 4.1
  • ...and 25 more