Optimal Top-Two Method for Best Arm Identification and Fluid Analysis

Agniv Bandyopadhyay; Sandeep Juneja; Shubhada Agrawal

TL;DR

The paper tackles fixed-confidence best-arm identification (BAI) with arm distributions from a single-parameter exponential family (SPEF). It proposes Anchored Top-2 (AT2), which uses an anchor function g to decide between sampling the empirical best arm and the challenger. A fluid-dynamics framework based on the implicit function theorem (IFT) characterizes the asymptotic sampling path and the optimal allocations, which allows a rigorous proof of asymptotic optimality as δ→0. The main contributions are: the AT2 and IAT2 algorithms, with provable δ-correctness and sample complexity matching the lower bound asymptotically; a fluid-limit description that yields convergence of the allocations to the optimal ω*; and empirical gains over existing top-2 and track-and-stop methods. Together, these results give a computationally efficient, theoretically grounded approach to BAI, with applications to mean-estimation-based decision problems in areas such as healthcare and recommendations.
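
For orientation, the lower bound and the optimal allocation ω* mentioned above usually take the following form in the fixed-confidence BAI literature. The notation here is an assumption and is not taken from this summary: $K$ arms with means $\mu_1,\dots,\mu_K$, a unique best arm $a^*$, $d(\cdot,\cdot)$ the Kullback-Leibler divergence of the SPEF expressed in terms of means, $\Sigma_K$ the probability simplex on the arms, and $\tau_\delta$ the stopping time of a $\delta$-correct algorithm.

$$
\liminf_{\delta\to 0}\frac{\mathbb{E}[\tau_\delta]}{\log(1/\delta)} \;\ge\; T^*(\mu),
\qquad
T^*(\mu)^{-1} \;=\; \max_{\omega\in\Sigma_K}\ \min_{a\neq a^*}\ \inf_{x}\ \Big[\omega_{a^*}\, d(\mu_{a^*},x) + \omega_a\, d(\mu_a,x)\Big].
$$

Under this convention, $\omega^*$ is the maximizer of the outer problem, and an algorithm is asymptotically optimal when its expected sample complexity attains the bound as $\delta\to 0$.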

Abstract

Top-$2$ methods have become popular for solving the best arm identification (BAI) problem. The best arm, that is, the arm with the largest mean among finitely many, is identified by an algorithm that at each sequential step independently pulls the empirical best arm with a fixed probability $\beta$ and pulls the best challenger arm otherwise. The probability of incorrect selection is guaranteed to lie below a specified $\delta>0$. Information-theoretic lower bounds on sample complexity are well known for the BAI problem and are matched asymptotically as $\delta\rightarrow 0$ by computationally demanding plug-in methods. The above top-$2$ algorithm, for any $\beta\in (0,1)$, has sample complexity within a constant of the lower bound. However, determining the optimal $\beta$ that matches the lower bound has proven difficult. In this paper, we address this and propose an optimal top-$2$ type algorithm. We consider a function of allocations anchored at a threshold. If it exceeds the threshold, the algorithm samples the empirical best arm; otherwise, it samples the challenger arm. We show that the proposed algorithm is optimal as $\delta\rightarrow 0$. Our analysis relies on identifying limiting fluid dynamics of allocations that satisfy a series of ordinary differential equations (ODEs) pasted together and that describe the asymptotic path followed by our algorithm. We rely on the implicit function theorem to show existence and uniqueness of the solutions to these fluid ODEs and to show that the proposed algorithm remains close to the ODE solution.
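
The fixed-$\beta$ top-$2$ rule described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's AT2 algorithm: arms are taken to be Gaussian with unit variance, the challenger is chosen by the smallest Gaussian transportation cost (a common choice in the top-two literature), and the stopping threshold is a simple heuristic. The names `transport_cost` and `beta_top_two` are hypothetical.

```python
import numpy as np


def transport_cost(mu, n, leader, j):
    """Gaussian (unit-variance) transportation cost between the leader and arm j."""
    gap = mu[leader] - mu[j]
    return gap * gap / (2.0 * (1.0 / n[leader] + 1.0 / n[j]))


def beta_top_two(true_means, beta=0.5, delta=0.05, max_steps=100_000, seed=0):
    """Sketch of a fixed-beta top-two sampler for unit-variance Gaussian arms.

    Returns the recommended arm and the per-arm sample counts.
    """
    rng = np.random.default_rng(seed)
    k = len(true_means)
    n = np.ones(k)                      # one initial pull per arm
    sums = rng.normal(true_means, 1.0)  # rewards from the initial pulls

    for t in range(k, max_steps):
        mu = sums / n
        leader = int(np.argmax(mu))     # empirical best arm
        costs = {j: transport_cost(mu, n, leader, j)
                 for j in range(k) if j != leader}
        challenger = min(costs, key=costs.get)

        # Assumption: heuristic threshold, not the paper's calibrated stopping rule.
        threshold = np.log((1.0 + np.log(t)) / delta)
        if costs[challenger] > threshold:
            break

        # Top-two step: leader with probability beta, challenger otherwise.
        arm = leader if rng.random() < beta else challenger
        sums[arm] += rng.normal(true_means[arm], 1.0)
        n[arm] += 1

    return int(np.argmax(sums / n)), n
```

Calling `beta_top_two([2.0, 0.0, -1.0], beta=0.5, delta=0.01)` returns a recommended arm index and the allocation vector. The anchored variant proposed in the paper replaces the $\beta$-coin flip with a threshold test on an anchor function of the allocations; that function is not given in this summary, so it is not sketched here.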

Paper Structure (40 sections, 40 theorems, 374 equations, 14 figures, 1 table, 2 algorithms)

Introduction
Problem description and lower bound
Anchored Top-2 (AT2) Algorithm
Theoretical guarantees
Fluid dynamics
Convergence of algorithmic allocations to the optimal proportions
Numerical results
Conclusion
Outline
Single parameter exponential family of distributions
Enveloping the anchor and index functions under noisy estimates of the rewards
Framework for applying the Implicit function theorem (IFT)
Proofs from Section \ref{sec:setup_lb}
Single variable formulation of the lower bound problem and intuition behind the anchor function
Sub-optimality of TCB(I)
...and 25 more sections

Key Result

Proposition 2.1

For every positive $N$ satisfying $N\geq N_{\min}$, there is a unique set of variables $\boldsymbol{N}_{\overline{B}}(N)=(N_a(N):a\in \overline{B})$ and $I_{B}(N)$ satisfying the conditions specified in the paper's statement of the proposition. Furthermore, $\boldsymbol{N}_{\overline{B}}(\cdot)$ and $I_B(\cdot)$ are continuously differentiable with respect to $N$ for $N>N_{\min}$.

Figures (14)

Figure 1: Normalised index on $1$ sample path.
Figure 2: Sample complexity comparison.
Figure 3: Illustrative plot of $\mathbf{O}_2$'s objective $f(N_1)$
Figure 4: Anchor function value for easy Gaussian bandit (Exp.1), averaged over 4,000 sample paths.
Figure 5: Anchor function value for easy Bernoulli bandit (Exp.1), averaged over 4,000 sample paths.
...and 9 more figures

Theorems & Definitions (78)

Proposition 2.1
Proposition 2.2
Remark 2.1
Proposition 3.1: Convergence to optimal proportions
Theorem 3.1: Asymptotic optimality of AT2 and IAT2
Theorem 4.1: Fluid ODEs
Remark 4.1: Incorporating the stopping rule into the fluid dynamics
Remark 4.2: $\beta$-fluid dynamics
Proposition 5.1
Lemma 5.1
...and 68 more
