Generalized Neyman Allocation for Locally Minimax Optimal Best-Arm Identification

Masahiro Kato

TL;DR

This work tackles fixed-budget best-arm identification among $K$ arms with budget $T$, introducing the Generalized Neyman Allocation (GNA) to extend the classical Neyman allocation to multi-armed bandits. The authors formulate a distribution class under a small-gap regime and derive a worst-case lower bound on misidentification probability, then construct the GNA-A2IPW algorithm that achieves a matching upper bound, establishing local asymptotic minimax optimality. The method combines a minimax-optimal allocation rule with an Adaptive Augmented Inverse Probability Weighting estimator for arm means, and provides a Bernoulli-specific closed-form allocation in the small-gap limit. Simulation results corroborate the theoretical findings, showing that GNA attains strong performance relative to baselines and frequently approaches the oracle GJ benchmark in large samples, while remaining simple to implement in practice.
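As a rough illustration of the estimator mentioned above, the sketch below computes arm-mean estimates in the general augmented inverse probability weighting form, $\hat{\mu}_a = \frac{1}{T}\sum_{t=1}^{T}\left[\frac{\mathbb{1}[A_t = a](Y_t - \hat{\mu}_{a,t})}{w_{a,t}} + \hat{\mu}_{a,t}\right]$, where the plug-in $\hat{\mu}_{a,t}$ uses only observations before round $t$. The function name `a2ipw_means`, its argument layout, and the choice of 0 as the plug-in for an arm not yet observed are assumptions made for this example; the paper's exact estimator (initialization, truncation, and so on) may differ.

```python
import numpy as np


def a2ipw_means(arms, outcomes, probs):
    """Hypothetical helper: AIPW-style mean estimates with past-data plug-ins.

    arms     : length-T sequence of pulled arm indices in {0, ..., K-1}
    outcomes : length-T sequence of observed outcomes
    probs    : (T, K) array; probs[t, a] is the probability of pulling arm a at round t
    """
    arms = np.asarray(arms)
    outcomes = np.asarray(outcomes, dtype=float)
    probs = np.asarray(probs, dtype=float)
    T, K = probs.shape

    sums = np.zeros(K)    # running sum of outcomes per arm
    counts = np.zeros(K)  # running number of pulls per arm
    total = np.zeros(K)   # running sum of the AIPW scores

    for t in range(T):
        # Plug-in means from rounds before t (assumption: 0 for unseen arms).
        mu_hat = np.divide(sums, counts, out=np.zeros(K), where=counts > 0)
        indicator = np.zeros(K)
        indicator[arms[t]] = 1.0
        # Inverse-probability-weighted residual plus the plug-in mean.
        total += indicator * (outcomes[t] - mu_hat) / probs[t] + mu_hat
        # Update the running statistics only after they were used.
        sums[arms[t]] += outcomes[t]
        counts[arms[t]] += 1

    return total / T


# Toy usage: two arms pulled alternately with probability 0.5 each.
est = a2ipw_means([0, 1, 0, 1], [1.0, 2.0, 1.0, 2.0], np.full((4, 2), 0.5))
```

Because each plug-in depends only on past rounds, the weighted residual has zero conditional mean at every round, which is the property that makes this form convenient under adaptive sampling.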

Abstract

This study investigates an asymptotically locally minimax optimal algorithm for fixed-budget best-arm identification (BAI). We propose the Generalized Neyman Allocation (GNA) algorithm and demonstrate that its worst-case upper bound on the probability of misidentifying the best arm aligns with the worst-case lower bound under the small-gap regime, where the gap between the expected outcomes of the best and suboptimal arms is small. Our lower and upper bounds are tight, matching exactly including constant terms within the small-gap regime. The GNA algorithm generalizes the Neyman allocation for two-armed bandits (Neyman, 1934; Kaufmann et al., 2016) and refines existing BAI algorithms, such as those proposed by Glynn & Juneja (2004). By proposing an asymptotically minimax optimal algorithm, we address the longstanding open issue in BAI (Kaufmann, 2020) and treatment choice (Kasy & Sautmann, 202) by restricting a class of distributions to the small-gap regimes.
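For the two-armed case that the abstract refers to, the classical Neyman allocation samples each arm in proportion to the standard deviation of its outcome. The sketch below computes only that two-armed ratio; the function name `neyman_allocation_two_arms` is hypothetical, the standard deviations are assumed known, and the multi-armed GNA rule from the paper is not reproduced here.

```python
from typing import Tuple


def neyman_allocation_two_arms(sigma1: float, sigma2: float) -> Tuple[float, float]:
    """Return sampling fractions (w1, w2) proportional to the standard deviations."""
    if sigma1 <= 0 or sigma2 <= 0:
        raise ValueError("standard deviations must be positive")
    total = sigma1 + sigma2
    return sigma1 / total, sigma2 / total


# Example: the noisier arm receives the larger share of the budget.
w1, w2 = neyman_allocation_two_arms(1.0, 3.0)
```

With equal standard deviations the rule reduces to uniform allocation.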

Paper Structure (25 sections, 11 theorems, 84 equations, 3 figures, 3 tables, 1 algorithm)

Key Result

Theorem 2.3

Fix $\bm{\sigma}$, $\Theta$, and $\mathcal{Y}$ in Definition 2.1. Given $\mathcal{P}(\underline{\Delta}, \overline{\Delta}) = \mathcal{P}(\underline{\Delta}, \overline{\Delta}, \bm{\sigma}, \Theta, \mathcal{Y})$, any consistent algorithm $\pi\in\Pi^{\mathrm{const}}$ (Definition 2.2 where

Figures (3)

  • Figure 1: Illustration of local asymptotic minimax optimality. The $y$-axis represents the probability of misidentification, and the $x$-axis represents the gap $\Delta$, where for simplicity $\Delta = \overline{\Delta} = \underline{\Delta}$. The upper (red) region represents the upper bound and the lower (blue) region represents the lower bound; the two converge as $\Delta \to 0$ (small-gap regime).
  • Figure 2: The results with $\mu_1 = 1.00$, $\mu_2 = 0.90$, $\mu_a \sim \mathrm{Uniform}[0.90, 0.95]$ for all $a\in[K]\backslash\{1, 2\}$, and $\overline{\sigma} = 3$ for $K = 3$ (Upper graph) and $K = 5$ (Lower graph). We report the empirical probability of misidentification at $T \in \{100, 200, 300, \dots, 49900, 50000\}$.
  • Figure 4: The results with $\mu_1 = 1.00$, $\mu_a = 0.95$ for all $a\in [K]\backslash\{1\}$, and $\overline{\sigma} = 3$ for $K = 3$ (Upper graph) and $K = 5$ (Lower graph). We report the empirical probability of misidentification at $T \in \{100, 200, 300, \dots, 49900, 50000\}$.
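The quantity reported in these figures, the empirical probability of misidentification, can be approximated by Monte Carlo. The sketch below is not the paper's experiment: it assumes Gaussian outcomes, a common standard deviation of 3 for every arm, and a uniform allocation of $T/K$ pulls per arm rather than GNA. The function name `misidentification_rate` and its parameters are hypothetical.

```python
import numpy as np


def misidentification_rate(mu, sigma, T, n_trials=20000, seed=0):
    """Hypothetical helper: Monte Carlo misidentification rate under uniform allocation.

    Assumes Gaussian outcomes with common standard deviation `sigma`, so each arm's
    sample mean over n pulls is drawn directly from N(mu_a, sigma^2 / n).
    """
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    K = mu.size
    n_per_arm = T // K  # uniform allocation (assumption, not the GNA rule)
    sample_means = rng.normal(
        loc=mu, scale=sigma / np.sqrt(n_per_arm), size=(n_trials, K)
    )
    chosen = sample_means.argmax(axis=1)  # recommend the empirically best arm
    return float(np.mean(chosen != mu.argmax()))


# Means taken from the Figure 4 setting with K = 3; budgets chosen for illustration.
p_small = misidentification_rate([1.00, 0.95, 0.95], sigma=3.0, T=300)
p_large = misidentification_rate([1.00, 0.95, 0.95], sigma=3.0, T=30000)
```

Under these assumptions the estimated rate should shrink as the budget grows; the values themselves are not comparable to the paper's reported results.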

Theorems & Definitions (18)

  • Definition 2.1: Mean-parameterized distributions with finite variances
  • Definition 2.2: Consistent algorithms
  • Theorem 2.3: Worst-case Lower Bound
  • Lemma 4.1: Upper bound of the GNA algorithm
  • Theorem 4.2: Worst-case upper bound of the GNA algorithm
  • Theorem 4.3: Local asymptotic minimax optimality
  • Proposition A.1: Transportation lemma. From Lemma 1 in Kaufmann et al. (2016)
  • Proposition A.2: Proposition 15.3.2 in Duchi (2023) and Theorem 4.4.4 in Calin & Udrişte (2014)
  • Proof: Proof of Theorem 2.3
  • Lemma B.1: Probability of misidentification of the A2IPW estimator
  • ...and 8 more