Optimal Best-Arm Identification under Fixed Confidence with Multiple Optima

Lan V. Truong

TL;DR

This work revisits the Track-and-Stop strategy and proposes a modified stopping rule that ensures instance-optimality even when the set of optimal arms is not a singleton. It also introduces a new information-theoretic lower bound that accounts for multiple optimal arms and shows that the modified stopping rule tightly matches it.

Abstract

We study the problem of best-arm identification in stochastic multi-armed bandits under the fixed-confidence setting, with a particular focus on instances that admit multiple optimal arms. While the Track-and-Stop algorithm of Garivier and Kaufmann (2016) is widely conjectured to be instance-optimal, its performance in the presence of multiple optima has remained insufficiently understood. In this work, we revisit the Track-and-Stop strategy and propose a modified stopping rule that ensures instance-optimality even when the set of optimal arms is not a singleton. Our analysis introduces a new information-theoretic lower bound that explicitly accounts for multiple optimal arms, and we demonstrate that our stopping rule tightly matches this bound.
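
To make the role of the stopping rule concrete, here is a minimal sketch of the standard Chernoff-style (generalized likelihood ratio) stopping statistic usually paired with Track-and-Stop, assuming Gaussian arms with known variance. The function names and the threshold log((log t + 1)/delta) are illustrative assumptions and are not taken from this paper; the paper's modified stopping rule is not reproduced here. The sketch only shows why tied empirical leaders are problematic for the standard statistic.

```python
import math


def gaussian_glr(n_a, mean_a, n_b, mean_b, sigma=1.0):
    """Pairwise GLR statistic for 'arm a beats arm b' with Gaussian rewards.

    Assumes known standard deviation `sigma`. Returns 0 when the empirical
    mean of a does not strictly exceed that of b.
    """
    if mean_a <= mean_b:
        return 0.0
    gap = mean_a - mean_b
    return (n_a * n_b) / (n_a + n_b) * gap ** 2 / (2.0 * sigma ** 2)


def stopping_statistic(counts, means, sigma=1.0):
    """Smallest pairwise GLR between the empirical leader and every other arm."""
    best = max(range(len(means)), key=lambda a: means[a])  # first maximizer
    return min(
        gaussian_glr(counts[best], means[best], counts[b], means[b], sigma)
        for b in range(len(means))
        if b != best
    )


def should_stop(counts, means, delta, sigma=1.0):
    """Stop once the statistic exceeds an assumed heuristic threshold."""
    t = sum(counts)
    threshold = math.log((math.log(t) + 1.0) / delta)  # illustrative choice
    return stopping_statistic(counts, means, sigma) > threshold


# Unique empirical leader: the statistic is positive and can cross the threshold.
unique = should_stop([100, 100, 100], [1.0, 0.5, 0.0], delta=0.05)

# Two arms tied for the lead: the statistic is exactly zero, so this rule
# cannot stop no matter how many samples have been collected.
tied = should_stop([100, 100, 100], [1.0, 1.0, 0.0], delta=0.05)
```

When two arms share the highest true mean, their empirical means stay close, so the pairwise statistic between them stays near zero and a rule of this form tends not to trigger. This is the regime that the modified stopping rule described in the abstract is designed to handle.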

Paper Structure

This paper contains 21 sections, 7 theorems, and 118 equations.

Key Result

Lemma 1

For the one-parameter exponential family, it holds that

Theorems & Definitions (12)

  • Lemma 1
  • proof
  • Lemma 2
  • Lemma 3
  • Theorem 4
  • Remark 5
  • proof
  • Theorem 6
  • proof
  • Lemma 7
  • ...and 2 more