Fixed Confidence Best Arm Identification in the Bayesian Setting

Kyoungseok Jang, Junpei Komiyama, Kazutoshi Yamazaki

TL;DR

This work studies fixed-confidence best-arm identification (FC-BAI) when the bandit model is drawn from a known prior, and shows that classical frequentist FC-BAI algorithms can be arbitrarily inefficient in that setting. It introduces a prior-dependent sample-complexity measure $L(\mathbf{H})$ via a Volume Lemma and proves a lower bound of $\Omega( L(\mathbf{H})^2 / \delta )$ on the expected stopping time. It then proposes a successive elimination algorithm with early stopping whose expected stopping time is $O\left( \sigma_{\max}^2 \frac{L(\mathbf{H})^2}{\delta} \log\frac{L(\mathbf{H})}{\delta} \right)$, which matches the lower bound up to a logarithmic factor. Simulations support the theoretical claims, showing divergence of frequentist methods and the efficiency gains from elimination and early stopping in Bayesian FC-BAI. These results highlight a fundamental difference between Bayesian and frequentist FC-BAI and offer a practical, near-optimal algorithm for the Bayesian setting.
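To make the size of the gap between the two bounds concrete, the following sketch evaluates only their scaling terms. The hidden constants of the $\Omega(\cdot)$ and $O(\cdot)$ notation are not given in this summary, so they are dropped here; the function names and the example value $L(\mathbf{H}) = 2$ are illustrative assumptions, not values from the paper.

```python
import math


def lower_bound_scale(L, delta):
    """Scaling term L(H)^2 / delta of the lower bound (constants dropped)."""
    return L ** 2 / delta


def upper_bound_scale(L, delta, sigma_max=1.0):
    """Scaling term sigma_max^2 * L(H)^2 / delta * log(L(H) / delta)
    of the upper bound (constants dropped). Assumes L / delta > 1 so
    that the logarithm is positive."""
    return sigma_max ** 2 * L ** 2 / delta * math.log(L / delta)


# Illustrative values only: L = 2.0 is an assumed complexity, not from the paper.
for delta in (1e-1, 1e-2, 1e-3):
    lo = lower_bound_scale(2.0, delta)
    hi = upper_bound_scale(2.0, delta)
    print(f"delta={delta:g}  lower~{lo:.3g}  upper~{hi:.3g}  ratio={hi / lo:.3g}")
```

With $\sigma_{\max} = 1$ the ratio of the two terms is exactly $\log(L(\mathbf{H})/\delta)$, so both grow like $1/\delta$ and differ only by the logarithmic factor.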

Abstract

We consider the fixed-confidence best arm identification (FC-BAI) problem in the Bayesian setting. The goal is to find the arm with the largest mean at a fixed confidence level when the bandit model has been sampled from a known prior. Most studies of the FC-BAI problem have been conducted in the frequentist setting, where the bandit model is fixed before the game starts. We show that traditional FC-BAI algorithms studied in the frequentist setting, such as track-and-stop and top-two algorithms, can be arbitrarily suboptimal in the Bayesian setting. We also obtain a lower bound on the expected number of samples in the Bayesian setting and introduce a variant of successive elimination whose performance matches the lower bound up to a logarithmic factor. Simulations verify the theoretical results.
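The abstract names a variant of successive elimination, and the summary above describes it as adding early stopping. The following is a minimal sketch of that general idea, not the authors' algorithm: it assumes Gaussian rewards with a known common standard deviation `sigma`, uses a standard anytime Hoeffding-style confidence radius chosen for illustration, and takes a hypothetical user-supplied threshold `gap_floor` below which remaining gaps are no longer resolved. How that threshold should be set from the prior is not specified in this summary.

```python
import numpy as np


def successive_elimination_early_stop(means, delta, gap_floor, sigma=1.0,
                                      rng=None, max_rounds=100_000):
    """Illustrative successive elimination with an early-stopping rule.

    means     : true arm means (used only to simulate Gaussian rewards)
    delta     : target error level
    gap_floor : hypothetical threshold; stop once gaps this small
                can no longer be distinguished from zero
    Returns (recommended arm index, total number of pulls).
    """
    rng = np.random.default_rng() if rng is None else rng
    k = len(means)
    active = list(range(k))
    sums = np.zeros(k)
    pulls = 0
    best = active[0]
    for t in range(1, max_rounds + 1):
        # Pull every surviving arm once, so each active arm has t samples.
        for i in active:
            sums[i] += rng.normal(means[i], sigma)
        pulls += len(active)
        mu_hat = {i: sums[i] / t for i in active}
        # Anytime confidence radius (assumed form, union bound over arms and rounds).
        radius = sigma * np.sqrt(2.0 * np.log(4.0 * k * t ** 2 / delta) / t)
        best = max(active, key=lambda i: mu_hat[i])
        # Eliminate arms that are confidently worse than the empirical best.
        active = [i for i in active if mu_hat[best] - mu_hat[i] <= 2.0 * radius]
        if len(active) == 1:
            return active[0], pulls
        # Early stopping: survivors are within gap_floor of each other w.h.p.
        if 2.0 * radius <= gap_floor:
            return best, pulls
    return best, pulls


if __name__ == "__main__":
    arm, n = successive_elimination_early_stop(
        [0.0, 1.0, 0.2], delta=0.05, gap_floor=0.01,
        rng=np.random.default_rng(0))
    print("recommended arm:", arm, "total pulls:", n)
```

Without the `gap_floor` check, two arms with nearly identical means would keep the loop running for a very long time. The early-stopping check bounds the number of rounds at the cost of a possible error on instances whose gap is below the threshold.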

Paper Structure (39 sections, 22 theorems, 105 equations, 6 tables, 2 algorithms)


Key Result

Lemma 1

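For $\Delta \in (0,1)$, let $L(\bm{H},\Delta)$ be the finite-$\Delta$ counterpart of $L(\bm{H})$ (its defining equation is missing from this excerpt). Then $\lim_{\Delta \to 0^+} L(\bm{H},\Delta) = L(\bm{H})$. In particular, for $\Delta<\frac{L(\bm{H})}{\sum_{i \in [k]}\frac{2(k-1)}{\xi_i}}$, we have $L(\bm{H},\Delta)\in \left(\frac{1}{2}L(\bm{H}),\, 2L(\bm{H})\right)$.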

Theorems & Definitions (44)

  • Example 1
  • Definition 1
  • Definition 2
  • Lemma 1: Volume Lemma, informal
  • Definition 3: Frequentist $\delta$-correctness
  • Theorem 2
  • Corollary 3: kaufmann2014complexity
  • Theorem 4
  • Lemma 5: kaufman16a, Lemma 1
  • Remark 1
  • ...and 34 more