Minimax and Bayes Optimal Best-Arm Identification

Masahiro Kato

TL;DR

The paper tackles fixed-budget best-arm identification (BAI) by introducing the TS-EBA strategy, which combines two-stage sampling with an empirical best-arm recommendation. It derives tight minimax and Bayes lower bounds for the simple regret and shows that TS-EBA achieves matching upper bounds, including constants, establishing exact asymptotic optimality in both frameworks. The analysis relies on mean-parameterized exponential-family bandit models, variance-informed Neyman-like sampling, and a rigorous change-of-measure technique for the lower bounds. The results cover Bernoulli and other distributions and include a Bernoulli-specific simplification. Overall, the work unifies minimax and Bayesian optimality for fixed-budget BAI and delivers a practical strategy with theoretically optimal performance guarantees.

Abstract

This study investigates minimax and Bayes optimal strategies in fixed-budget best-arm identification. We consider an adaptive procedure consisting of a sampling phase followed by a recommendation phase, and we design an adaptive experiment within this framework to efficiently identify the best arm, defined as the one with the highest expected outcome. In our proposed strategy, the sampling phase consists of two stages. The first stage is a pilot phase, in which we allocate each arm uniformly in equal proportions to eliminate clearly suboptimal arms and estimate outcome variances. In the second stage, arms are allocated in proportion to the variances estimated during the first stage. After the sampling phase, the procedure enters the recommendation phase, where we select the arm with the highest sample mean as our estimate of the best arm. We prove that this single strategy is simultaneously asymptotically minimax and Bayes optimal for the simple regret, with upper bounds that coincide exactly with our lower bounds, including the constant terms.
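The two-stage procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's TS-EBA specification: the pilot fraction `pilot_frac`, the confidence-interval elimination rule with width `elim_z`, the variance floor, and the helper `gaussian_sampler` are all assumptions introduced here, since the abstract does not state the pilot length or the elimination threshold. The second stage follows the abstract's wording and allocates in proportion to the estimated variances.

```python
import numpy as np


def gaussian_sampler(means, sds):
    """Hypothetical test environment: Gaussian outcomes for each arm."""
    means, sds = np.asarray(means, float), np.asarray(sds, float)

    def sample(arm, n, rng):
        return rng.normal(means[arm], sds[arm], size=n)

    return sample


def two_stage_bai(sample, n_arms, budget, pilot_frac=0.2, elim_z=3.0, seed=None):
    """Two-stage sampling followed by an empirical-best-arm recommendation.

    pilot_frac and elim_z are illustrative tuning knobs, not values
    taken from the paper.
    """
    if budget < 2 * n_arms:
        raise ValueError("budget must allow at least two pilot pulls per arm")
    rng = np.random.default_rng(seed)

    # Stage 1 (pilot): pull every arm the same number of times.
    n_pilot = max(2, int(pilot_frac * budget) // n_arms)
    data = [sample(a, n_pilot, rng) for a in range(n_arms)]
    means = np.array([d.mean() for d in data])
    variances = np.array([d.var(ddof=1) for d in data])

    # Assumed elimination rule: drop an arm when its upper confidence
    # limit falls below the largest lower confidence limit.
    half_width = elim_z * np.sqrt(variances / n_pilot)
    keep = np.flatnonzero(means + half_width >= np.max(means - half_width))

    # Stage 2: split the remaining budget among surviving arms in
    # proportion to their estimated variances (floored to avoid 0/0).
    remaining = budget - n_pilot * n_arms
    weights = np.maximum(variances[keep], 1e-12)
    weights = weights / weights.sum()
    extra = np.floor(weights * remaining).astype(int)

    pulls = np.full(n_arms, n_pilot)
    for arm, n in zip(keep, extra):
        if n > 0:
            data[arm] = np.concatenate([data[arm], sample(arm, n, rng)])
            pulls[arm] += n

    # Recommendation: the surviving arm with the highest sample mean.
    final_means = np.array([data[a].mean() for a in keep])
    return int(keep[np.argmax(final_means)]), pulls


sampler = gaussian_sampler([0.0, 0.2, 1.0], [1.0, 1.0, 2.0])
best_arm, pulls = two_stage_bai(sampler, n_arms=3, budget=3000, seed=0)
print(best_arm, pulls)
```

Rounding the second-stage allocation down keeps the total number of pulls within the budget; a few pulls may go unused. The simple regret of a run is the gap between the largest true mean and the true mean of the recommended arm, so it is zero whenever `best_arm` is the true best arm.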

Paper Structure

This paper contains 80 sections, 21 theorems, 163 equations, 1 table, 1 algorithm.

Key Result

Proposition 4.2

For any $P_\mu\in{\mathcal{P}}(\sigma^2,{\mathcal{M}},{\mathcal{Y}})$, the following holds:

Theorems & Definitions (39)

  • Definition 4.1: Mean-parameterized canonical exponential family
  • Example: Examples of the mean-parameterized exponential family
  • Proposition 4.2
  • Remark
  • Example: Bandit instances
  • Definition 5.1: Regular strategies
  • Example: Central limit theorem
  • Theorem 5.2: Minimax lower bound
  • Theorem 5.4: Bayes lower bound
  • Theorem 6.1
  • ...and 29 more