
On Regret Bounds of Thompson Sampling for Bayesian Optimization

Shion Takeno, Shogo Iwazaki

TL;DR

Several regret bounds are shown, including a regret lower bound for GP-TS, which implies that GP-TS suffers from a polynomial dependence on $1/\delta$ with probability $\delta$, and an upper bound of the second moment of cumulative regret which directly suggests an improved regret upper bound on $\delta$.

Abstract

We study a widely used Bayesian optimization method, Gaussian process Thompson sampling (GP-TS), under the assumption that the objective function is a sample path from a GP. Compared with the GP upper confidence bound (GP-UCB) with established high-probability and expected regret bounds, most analyses of GP-TS have been limited to expected regret. Moreover, whether the recent analyses of GP-UCB for the lenient regret and the improved cumulative regret upper bound can be applied to GP-TS remains unclear. To fill these gaps, this paper shows several regret bounds: (i) a regret lower bound for GP-TS, which implies that GP-TS suffers from a polynomial dependence on $1/\delta$ with probability $\delta$, (ii) an upper bound of the second moment of cumulative regret, which directly suggests an improved regret upper bound on $\delta$, (iii) expected lenient regret upper bounds, and (iv) an improved cumulative regret upper bound on the time horizon $T$. Along the way, we provide several useful lemmas, including a relaxation of the necessary condition from recent analysis to obtain improved regret upper bounds on $T$.
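To make the setting concrete, here is a minimal sketch of GP-TS on a finite grid, where the objective is itself drawn from the GP prior as the abstract assumes. The SE kernel, the lengthscale, the grid on $[0, 1]$, and all function names (`se_kernel`, `gp_posterior`, `sample_mvn`, `gp_ts`) are illustrative assumptions, not the paper's algorithm listing or experimental setup.

```python
import numpy as np


def se_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel with k(x, x) = 1 (assumed for illustration)."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sq_dist / lengthscale**2)


def gp_posterior(grid, x_obs, y_obs, noise_var):
    """Posterior mean and covariance of a zero-mean GP evaluated on `grid`."""
    if len(x_obs) == 0:
        return np.zeros(len(grid)), se_kernel(grid, grid)
    k_oo = se_kernel(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    k_go = se_kernel(grid, x_obs)
    solved = np.linalg.solve(k_oo, k_go.T)  # shape (n_obs, n_grid)
    mean = solved.T @ y_obs
    cov = se_kernel(grid, grid) - k_go @ solved
    return mean, cov


def sample_mvn(rng, mean, cov):
    """Draw one Gaussian sample via an eigendecomposition (robust to near-singular cov)."""
    eigvals, eigvecs = np.linalg.eigh((cov + cov.T) / 2.0)
    scale = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return mean + scale @ rng.standard_normal(len(mean))


def gp_ts(num_rounds=30, grid_size=100, noise_var=0.01, seed=0):
    """Run GP-TS on a grid and return the cumulative regret after each round."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, grid_size)
    # Objective: a sample path from the GP prior, restricted to the grid.
    f = sample_mvn(rng, np.zeros(grid_size), se_kernel(grid, grid))
    x_obs, y_obs, regrets = np.empty(0), np.empty(0), []
    for _ in range(num_rounds):
        mean, cov = gp_posterior(grid, x_obs, y_obs, noise_var)
        path = sample_mvn(rng, mean, cov)   # posterior sample path
        idx = int(np.argmax(path))          # query the maximizer of the sample
        y = f[idx] + np.sqrt(noise_var) * rng.standard_normal()
        x_obs = np.append(x_obs, grid[idx])
        y_obs = np.append(y_obs, y)
        regrets.append(f.max() - f[idx])    # instantaneous regret
    return np.cumsum(regrets)


if __name__ == "__main__":
    print(gp_ts()[-1])  # cumulative regret after the final round
```

Repeating the run over many seeds and looking at the upper tail of the final cumulative regret gives an empirical view of the dependence on $\delta$ that the bounds above describe.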

Paper Structure

This paper contains 34 sections, 19 theorems, 169 equations, and 1 algorithm.

Key Result

Lemma 2.3

Let $k: \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ be a linear, SE, or Matérn kernel with $\nu > 1$ and $k(\boldsymbol{x}, \boldsymbol{x}) \leq 1$ for all $\boldsymbol{x} \in \mathbb{R}^d$. Moreover, assume that the noise variance $\sigma^2$ is positive. Then, for any $t \geq 1$ and ${\cal D}_{t-1}$, the posterior standard deviation is Lipschitz continuous with constant $L_{\sigma}$, where $L_{\sigma}$ is a positive constant given by ...
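For reference, the posterior standard deviation that this lemma concerns is the usual GP quantity. The following is a sketch in standard notation, assuming a zero-mean prior and observations at $\boldsymbol{x}_1, \dots, \boldsymbol{x}_{t-1}$; the symbols $\sigma_{t-1}$, $\boldsymbol{k}_{t-1}$, $\boldsymbol{K}_{t-1}$, and $\boldsymbol{I}_{t-1}$ are assumed here and may differ from the paper's own notation:

$$
\sigma_{t-1}^2(\boldsymbol{x}) = k(\boldsymbol{x}, \boldsymbol{x}) - \boldsymbol{k}_{t-1}(\boldsymbol{x})^\top \bigl( \boldsymbol{K}_{t-1} + \sigma^2 \boldsymbol{I}_{t-1} \bigr)^{-1} \boldsymbol{k}_{t-1}(\boldsymbol{x}),
$$

where $\boldsymbol{k}_{t-1}(\boldsymbol{x}) = \bigl( k(\boldsymbol{x}, \boldsymbol{x}_i) \bigr)_{i=1}^{t-1}$ and $\boldsymbol{K}_{t-1} = \bigl( k(\boldsymbol{x}_i, \boldsymbol{x}_j) \bigr)_{i,j=1}^{t-1}$. The condition $\sigma^2 > 0$ guarantees that the matrix being inverted is positive definite.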

Theorems & Definitions (33)

  • Lemma 2.3: Lipschitz constants for posterior standard deviation
  • Lemma 2.4: Conditions on the global maximizer of sample path
  • Definition 2.5: Maximum information gain
  • Theorem 3.1
  • Theorem 3.2: Informal
  • Theorem 3.3
  • Lemma 3.4
  • Theorem 3.5
  • proof
  • Lemma A.1: Lemma D.1 of takeno2025-regret
  • ...and 23 more