Generalized Neyman Allocation for Locally Minimax Optimal Best-Arm Identification
Masahiro Kato
TL;DR
This work tackles fixed-budget best-arm identification among $K$ arms with budget $T$, introducing the Generalized Neyman Allocation (GNA) to extend the classical Neyman allocation to multi-armed bandits. The author formulates a distribution class under a small-gap regime and derives a worst-case lower bound on the misidentification probability, then constructs the GNA-A2IPW algorithm, which achieves a matching upper bound and thereby establishes local asymptotic minimax optimality. The method combines a minimax-optimal allocation rule with an Adaptive Augmented Inverse Probability Weighting (A2IPW) estimator for arm means, and provides a Bernoulli-specific closed-form allocation in the small-gap limit. Simulation results corroborate the theoretical findings, showing that GNA performs strongly relative to baselines and frequently approaches the oracle GJ benchmark in large samples, while remaining simple to implement in practice.
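To make the estimator mentioned above concrete, here is a minimal Python sketch of a generic augmented inverse probability weighting (AIPW) mean estimate computed from adaptively collected data. It is an illustration under stated assumptions, not the paper's GNA-A2IPW procedure: the function name `aipw_mean`, its argument layout, and the way the plug-in estimates are supplied are all hypothetical, and any truncation or stabilization the paper may use is omitted.

```python
from typing import Sequence


def aipw_mean(
    arms: Sequence[int],
    rewards: Sequence[float],
    probs: Sequence[Sequence[float]],
    mu_hats: Sequence[Sequence[float]],
    arm: int,
) -> float:
    """Generic AIPW estimate of the mean outcome of `arm` (illustrative sketch).

    arms[t]       : arm pulled in round t
    rewards[t]    : outcome observed in round t
    probs[t][a]   : probability with which arm a was drawn in round t
    mu_hats[t][a] : plug-in estimate of arm a's mean, assumed to be built
                    only from data observed before round t
    """
    if not arms:
        raise ValueError("at least one round is required")
    total = 0.0
    for a_t, y_t, w_t, m_t in zip(arms, rewards, probs, mu_hats):
        # Inverse-probability-weighted residual, nonzero only when `arm` was pulled.
        correction = (y_t - m_t[arm]) / w_t[arm] if a_t == arm else 0.0
        # Adding the plug-in estimate back is the "augmentation" step.
        total += correction + m_t[arm]
    return total / len(arms)
```

With all plug-in estimates set to zero, the function reduces to the plain inverse probability weighting estimate, which is a convenient sanity check.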
Abstract
This study investigates an asymptotically locally minimax optimal algorithm for fixed-budget best-arm identification (BAI). We propose the Generalized Neyman Allocation (GNA) algorithm and demonstrate that its worst-case upper bound on the probability of misidentifying the best arm matches the worst-case lower bound under the small-gap regime, where the gap between the expected outcomes of the best and suboptimal arms is small. Our lower and upper bounds are tight: they match exactly, including constant terms, within the small-gap regime. The GNA algorithm generalizes the Neyman allocation for two-armed bandits (Neyman, 1934; Kaufmann et al., 2016) and refines existing BAI algorithms, such as those proposed by Glynn & Juneja (2004). By proposing an asymptotically minimax optimal algorithm, we address a longstanding open issue in BAI (Kaufmann, 2020) and treatment choice (Kasy & Sautmann, 202) by restricting the class of distributions to the small-gap regime.
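For orientation, the classical two-armed Neyman allocation that GNA generalizes samples each arm in proportion to the standard deviation of its outcome. The Python sketch below shows that two-armed rule only; it assumes the standard deviations are known, and the function names are hypothetical. It does not reproduce the $K$-armed GNA rule, whose form is not stated in this section.

```python
def neyman_allocation(sigma1: float, sigma2: float) -> tuple[float, float]:
    """Two-armed Neyman allocation: sampling ratios proportional to the stds."""
    if sigma1 <= 0 or sigma2 <= 0:
        raise ValueError("standard deviations must be positive")
    total = sigma1 + sigma2
    return sigma1 / total, sigma2 / total


def scaled_variance(sigma1: float, sigma2: float, w1: float, w2: float) -> float:
    """Variance of the difference-in-means estimate, scaled by the budget T,
    when fractions w1 and w2 of the budget go to arms 1 and 2."""
    return sigma1**2 / w1 + sigma2**2 / w2


# Example with assumed standard deviations 1 and 3.
w1, w2 = neyman_allocation(1.0, 3.0)                 # (0.25, 0.75)
neyman_var = scaled_variance(1.0, 3.0, w1, w2)       # 16.0 = (1 + 3) ** 2
uniform_var = scaled_variance(1.0, 3.0, 0.5, 0.5)    # 20.0
```

In this example the Neyman ratios reduce the scaled variance from 20 under uniform sampling to 16, which equals $(\sigma_1 + \sigma_2)^2$, the minimum achievable over all allocations.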
