Random Exploration in Bayesian Optimization: Order-Optimal Regret and Computational Efficiency

Sudeep Salgia, Sattar Vakili, Qing Zhao

TL;DR

This work studies Bayesian optimization with Gaussian process priors through a non-adaptive random-exploration strategy. It proves that random sampling from a fixed distribution achieves order-optimal predictive accuracy in an infinite-dimensional RKHS and couples this with a domain-shrinking scheme (REDS) to obtain strong regret guarantees in both noiseless and noisy settings. Specifically, REDS attains $R(T)=\tilde{\mathcal{O}}(\max\{T^{(3-\beta)/2},1\})$ in the noiseless case and $R(T)=\tilde{\mathcal{O}}(\sqrt{T\gamma_T})$ in the noisy case, while avoiding expensive acquisition-function optimization. Empirically, REDS delivers similar regret to state-of-the-art methods with substantial speedups (up to ~100x) on common BO benchmarks, highlighting a practical, computation-friendly alternative for kernel-based bandit optimization.

Abstract

We consider Bayesian optimization using Gaussian process models, also referred to as kernel-based bandit optimization. We study the methodology of exploring the domain using random samples drawn from a distribution. We show that this random exploration approach achieves the optimal error rates. Our analysis is based on novel concentration bounds in an infinite-dimensional Hilbert space established in this work, which may be of independent interest. We further develop an algorithm based on random exploration with domain shrinking and establish its order-optimal regret guarantees under both noise-free and noisy settings. In the noise-free setting, our analysis closes the existing gap in regret performance and thereby resolves a COLT open problem. The proposed algorithm also enjoys a computational advantage over prevailing methods, because random exploration obviates the expensive optimization of a non-convex acquisition function to choose the query points at each iteration.
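
The idea of random exploration with domain shrinking can be illustrated with a minimal sketch. This is not the authors' REDS algorithm: the squared-exponential kernel, the finite candidate grid, the epoch-doubling schedule, the confidence multiplier `beta`, and all function names below are assumptions made for illustration. What it shows is that query points are drawn uniformly at random from the current domain, so no acquisition function is optimized, and that the domain is then shrunk to the points whose upper confidence bound is at least the best lower confidence bound.

```python
import numpy as np

def rbf(a, b, ls=0.2):
    # Squared-exponential kernel between two sets of row vectors (assumed kernel).
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def gp_posterior(x_train, y_train, x_test, noise=1e-2):
    # Standard GP regression posterior mean and standard deviation.
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_test, x_train)
    mean = Ks @ np.linalg.solve(K, y_train)
    var = 1.0 - np.einsum("ij,ji->i", Ks, np.linalg.solve(K, Ks.T))
    return mean, np.sqrt(np.clip(var, 0.0, None))

def random_exploration_with_shrinking(f, candidates, epochs=4, n0=8,
                                      beta=2.0, noise_sd=0.1, seed=0):
    # Hypothetical helper: epoch-based random sampling plus domain shrinking.
    rng = np.random.default_rng(seed)
    active = candidates.copy()
    n = n0
    for _ in range(epochs):
        # Non-adaptive step: sample uniformly from the current domain.
        x = active[rng.integers(0, len(active), size=n)]
        y = f(x) + noise_sd * rng.standard_normal(n)
        # Fit the GP on this epoch's samples only (an assumption of this sketch).
        mean, sd = gp_posterior(x, y, active, noise=noise_sd**2)
        # Shrink: keep points whose UCB is at least the largest LCB.
        keep = mean + beta * sd >= np.max(mean - beta * sd)
        active = active[keep]
        n *= 2  # assumed schedule: double the sample size every epoch
    return active

# Usage on a toy 1-D objective with its maximum at x = 0.3.
grid = np.linspace(0.0, 1.0, 201)[:, None]
active = random_exploration_with_shrinking(lambda x: -(x[:, 0] - 0.3) ** 2, grid)
print(len(active), active.min(), active.max())
```

The shrinking rule never empties the domain, since the point attaining the largest lower confidence bound always satisfies the retention test. The per-iteration cost is dominated by the GP fit rather than by a non-convex inner optimization.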

Paper Structure (26 sections, 12 theorems, 76 equations, 1 figure, 1 table, 1 algorithm)

Key Result

Theorem 2.1

[Steinwart2008MercerTheorem] Let $\mathcal{X}$ be a compact metric space, $k: \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ be a continuous kernel and $\varrho$ be a finite Borel measure supported on $\mathcal{X}$. Then, there exists an orthonormal system of functions $\{\varphi_j\}_{j \in \mathbb{N}}$ ...
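
In its standard form, Mercer's theorem expands a continuous kernel as $k(x, x') = \sum_{j} \lambda_j \varphi_j(x) \varphi_j(x')$ with nonnegative eigenvalues $\lambda_j$. The sketch below illustrates that expansion numerically and is not taken from the paper: the squared-exponential kernel, its length scale, the uniform measure on $[0,1]$, the sample size, and the truncation level `m` are all assumptions. It approximates the eigenvalues by those of the scaled kernel matrix on random samples and checks that a short truncated expansion reproduces the kernel matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Samples from the assumed measure: uniform on [0, 1].
x = rng.uniform(0.0, 1.0, size=200)

# Kernel matrix of an assumed squared-exponential kernel with length scale 0.2.
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.2**2)

# Eigenvalues of K / n approximate the Mercer eigenvalues under the sampling measure.
evals, evecs = np.linalg.eigh(K / len(x))
evals, evecs = evals[::-1], evecs[:, ::-1]  # sort in descending order

# Truncated expansion with the leading m eigenpairs.
m = 20
K_m = len(x) * (evecs[:, :m] * evals[:m]) @ evecs[:, :m].T
err = np.max(np.abs(K - K_m))

print("leading eigenvalues:", evals[:5])
print("max reconstruction error with m = 20:", err)
```

The rapid eigenvalue decay of a smooth kernel is what makes a small number of terms sufficient here; a rougher kernel would need a larger `m` for the same accuracy.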

Figures (1)

  • Figure 1: Cumulative regret averaged over $10$ Monte Carlo runs for all algorithms across different benchmark functions. The shaded region represents error bars of up to one standard deviation. The plots show that the regret of REDS is comparable to that of BPE and GP-ThreDS.

Theorems & Definitions (16)

  • Theorem 2.1
  • Definition 2.2
  • Theorem 3.1
  • proof
  • Lemma 3.2
  • Lemma 3.3
  • Lemma 3.4
  • Remark 3.5
  • Theorem 4.3
  • Corollary 4.4
  • ...and 6 more