Generalized Neyman Allocation for Locally Minimax Optimal Best-Arm Identification

Masahiro Kato

TL;DR

This work tackles fixed-budget best-arm identification among $K$ arms with budget $T$, introducing the Generalized Neyman Allocation (GNA) to extend the classical Neyman allocation to multi-armed bandits. The paper formulates a distribution class under a small-gap regime and derives a worst-case lower bound on the misidentification probability, then constructs the GNA-A2IPW algorithm, which achieves a matching upper bound and thereby establishes local asymptotic minimax optimality. The method combines a minimax-optimal allocation rule with an Adaptive Augmented Inverse Probability Weighting (A2IPW) estimator for arm means, and provides a Bernoulli-specific closed-form allocation in the small-gap limit. Simulation results corroborate the theoretical findings, showing that GNA performs strongly relative to baselines and frequently approaches the oracle GJ benchmark in large samples, while remaining simple to implement in practice.
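
To make the estimation step concrete, the following is a minimal sketch of an augmented inverse probability weighting estimate of the arm means computed from adaptively collected data. It is an illustration under stated assumptions, not the paper's exact estimator: it uses the standard AIPW score with plug-in means built only from past rounds, and the function names `past_means` and `a2ipw_means` and their arguments are hypothetical.

```python
import numpy as np


def past_means(arms, outcomes, K):
    """Plug-in means per round, using only outcomes observed before that round.

    Assumption for illustration: an arm with no past observations gets 0.0.
    """
    T = len(arms)
    sums = np.zeros(K)
    counts = np.zeros(K)
    out = np.zeros((T, K))
    for t in range(T):
        # Running mean of each arm from rounds strictly before t.
        out[t] = np.where(counts > 0, sums / np.maximum(counts, 1), 0.0)
        sums[arms[t]] += outcomes[t]
        counts[arms[t]] += 1
    return out


def a2ipw_means(arms, outcomes, probs, plug_in):
    """AIPW-style estimate of each arm's mean.

    arms:     length-T sequence of pulled arm indices
    outcomes: length-T sequence of observed outcomes
    probs:    (T, K) array of assignment probabilities used in each round
    plug_in:  (T, K) array of past-only estimates of each arm's mean
    """
    arms = np.asarray(arms)
    outcomes = np.asarray(outcomes, dtype=float)
    probs = np.asarray(probs, dtype=float)
    plug_in = np.asarray(plug_in, dtype=float)
    T, K = probs.shape

    # One-hot indicator of the arm pulled in each round.
    indicator = np.zeros((T, K))
    indicator[np.arange(T), arms] = 1.0

    # Inverse-probability-weighted residual plus the plug-in term.
    scores = indicator * (outcomes[:, None] - plug_in) / probs + plug_in
    return scores.mean(axis=0)


if __name__ == "__main__":
    arms = [0, 1, 0, 1]
    outcomes = [1.0, 2.0, 3.0, 4.0]
    probs = np.full((4, 2), 0.5)
    plug_in = past_means(arms, outcomes, K=2)
    print(a2ipw_means(arms, outcomes, probs, plug_in))
```

Because the plug-in term for each round depends only on earlier rounds, the weighted residuals have zero conditional mean given the past, which is the property that makes this style of estimator convenient to analyze under adaptive sampling.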

Abstract

This study investigates an asymptotically locally minimax optimal algorithm for fixed-budget best-arm identification (BAI). We propose the Generalized Neyman Allocation (GNA) algorithm and demonstrate that its worst-case upper bound on the probability of misidentifying the best arm matches the worst-case lower bound under the small-gap regime, where the gap between the expected outcomes of the best and suboptimal arms is small. Our lower and upper bounds are tight: within the small-gap regime they match exactly, including constant terms. The GNA algorithm generalizes the Neyman allocation for two-armed bandits (Neyman, 1934; Kaufmann et al., 2016) and refines existing BAI algorithms, such as those proposed by Glynn & Juneja (2004). By proposing an asymptotically minimax optimal algorithm, we address a longstanding open issue in BAI (Kaufmann, 2020) and treatment choice (Kasy & Sautmann, 202) by restricting the class of distributions to the small-gap regime.
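
For reference, the classical two-armed Neyman allocation that the abstract says GNA generalizes can be sketched as follows. This covers only the textbook two-armed case with known standard deviations, where each arm is sampled in proportion to its standard deviation; it is not the paper's generalized rule for $K$ arms, and the function names are hypothetical.

```python
def neyman_allocation(sigma_1: float, sigma_2: float) -> tuple[float, float]:
    """Classical two-armed Neyman allocation with known standard deviations.

    Each arm receives a share of the budget proportional to its standard deviation.
    """
    if sigma_1 <= 0 or sigma_2 <= 0:
        raise ValueError("standard deviations must be positive")
    total = sigma_1 + sigma_2
    return sigma_1 / total, sigma_2 / total


def scaled_variance(sigma_1: float, sigma_2: float, w_1: float) -> float:
    """T times the variance of the difference in sample means under shares (w_1, 1 - w_1)."""
    return sigma_1 ** 2 / w_1 + sigma_2 ** 2 / (1.0 - w_1)


if __name__ == "__main__":
    w_1, w_2 = neyman_allocation(1.0, 3.0)
    print("Neyman shares:", w_1, w_2)
    print("scaled variance, Neyman: ", scaled_variance(1.0, 3.0, w_1))
    print("scaled variance, uniform:", scaled_variance(1.0, 3.0, 0.5))
```

Under these shares the scaled variance of the difference in means equals $(\sigma_1 + \sigma_2)^2$, which is the minimum over all fixed allocations; a uniform split is strictly worse whenever the two standard deviations differ.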

Paper Structure (25 sections, 11 theorems, 84 equations, 3 figures, 3 tables, 1 algorithm)

Introduction
Problem setting
Background and Related Work
Contributions of This Study
Worst-case Lower Bound
Distribution Class
Algorithm class
Worst-case Lower Bound
The GNA-A2IPW algorithm
Allocation Rule: the Generalized Neyman Allocation
Estimation Rule
Worst-case Upper Bound and Local Asymptotic Minimax Optimality
Worst-case Upper Bound
Local Minimax Optimality
Intuitive Explanation
...and 10 more sections

Key Result

Theorem 2.3

Fix $\bm{\sigma}$, $\Theta$, and $\mathcal{Y}$ in Definition 2.1. Given $\mathcal{P}(\underline{\Delta}, \overline{\Delta}) = \mathcal{P}(\underline{\Delta}, \overline{\Delta}, \bm{\sigma}, \Theta, \mathcal{Y})$, any consistent algorithm $\pi\in\Pi^{\mathrm{const}}$ (Definition def:consis where

Figures (3)

Figure 1: Illustration of local asymptotic minimax optimality. The $y$-axis represents the probability of misidentification, while the $x$-axis represents the gap $\Delta$ (for simplicity, we set $\Delta = \overline{\Delta} = \underline{\Delta}$). The upper (red) region represents the upper bound, and the lower (blue) region represents the lower bound; the two bounds converge as $\Delta \to 0$ (small-gap regime).
Figure 2: The results with $\mu_1 = 1.00$, $\mu_2 = 0.90$, $\mu_a \sim \mathrm{Uniform}[0.90, 0.95]$ for all $a\in[K]\backslash\{1, 2\}$, and $\overline{\sigma} = 3$ for $K = 3$ (Upper graph) and $K = 5$ (Lower graph). We report the empirical probability of misidentification at $T \in \{100, 200, 300, \dots, 49900, 50000\}$.
Figure 4: The results with $\mu_1 = 1.00$, $\mu_a = 0.95$ for all $a\in [K]\backslash\{1\}$, and $\overline{\sigma} = 3$ for $K = 3$ (Upper graph) and $K = 5$ (Lower graph). We report the empirical probability of misidentification at $T \in \{100, 200, 300, \dots, 49900, 50000\}$.

Theorems & Definitions (18)

Definition 2.1: Mean-parameterized distributions with finite variances
Definition 2.2: Consistent algorithms
Theorem 2.3: Worst-case Lower Bound
Lemma 4.1: Upper bound of the GNA algorithm
Theorem 4.2: Worst-case upper bound of the GNA algorithm
Theorem 4.3: Local asymptotic minimax optimality
Proposition A.1: Transportation lemma, from Lemma 1 in Kaufmann et al. (2016)
Proposition A.2: Proposition 15.3.2 in Duchi (2023) and Theorem 4.4.4 in Calin & Udriste (2014)
Proof: Proof of Theorem 2.3 (Worst-case Lower Bound)
Lemma B.1: Probability of misidentification of the A2IPW estimator
...and 8 more
