Box Thirding: Anytime Best Arm Identification under Insufficient Sampling

Seohwa Hwang, Junyong Park

TL;DR

The paper tackles Best Arm Identification (BAI) under unknown or fixed budgets, focusing on the data-poor regime where the budget is too small to evaluate every arm. It introduces Box Thirding (B3), a fully anytime algorithm that builds a hierarchical box structure and uses iterative ternary comparisons to promote strong candidates, defer uncertain ones, and discard weak arms, while reusing past samples to refine decisions. The authors decompose the misidentification probability into two terms, non-inclusion and within-set misidentification, and derive bounds that hold under the data-poor condition. They show that B3 matches or improves upon existing anytime BAI methods by maximizing screening capacity and achieving fast error decay. Empirical results on the New Yorker Cartoon Caption Contest (NYCCC) dataset show that B3 performs robustly across high, moderate, and deterministic noise regimes, which supports its practical value for large-scale, budget-constrained BAI problems. The work provides theoretical and empirical support for an algorithm that balances screening and discrimination without budget knowledge, with potential extensions to sample reuse and non-data-poor scenarios that could further narrow the gap to fixed-budget methods.

Abstract

We introduce Box Thirding (B3), a flexible and efficient algorithm for Best Arm Identification (BAI) under fixed-budget constraints. It is designed for both anytime BAI and scenarios with large $N$, where the number of arms is too large for exhaustive evaluation within a limited budget $T$. The algorithm employs an iterative ternary comparison: in each iteration, three arms are compared; the best-performing arm is explored further, the median is deferred for future comparisons, and the weakest is discarded. Even without prior knowledge of $T$, B3 achieves an $\epsilon$-best arm misidentification probability comparable to Successive Halving (SH), which requires $T$ as a predefined parameter, applied to a randomly selected subset of $c_0$ arms that fit within the budget. Empirical results show that B3 outperforms existing methods under limited-budget constraints in terms of simple regret, as demonstrated on the New Yorker Cartoon Caption Contest dataset.
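
The ternary comparison described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `ternary_step`, the `pull` callback, the `n_pulls` parameter, and the Bernoulli toy arms are all assumptions made for illustration, and the sketch covers only a single comparison, not the hierarchical box structure.

```python
import random
from statistics import mean


def ternary_step(arms, pull, n_pulls):
    """Compare exactly three arms by their empirical mean reward.

    Returns (promote, defer, discard): the best-looking arm, the median
    one, and the weakest one, in that order.
    """
    if len(arms) != 3:
        raise ValueError("a ternary comparison needs exactly three arms")
    # Empirical mean of n_pulls fresh samples per arm.
    estimates = {a: mean(pull(a) for _ in range(n_pulls)) for a in arms}
    # Ascending sort: weakest first, strongest last.
    discard, defer, promote = sorted(arms, key=lambda a: estimates[a])
    return promote, defer, discard


# Hypothetical toy instance: Bernoulli arms with made-up means.
rng = random.Random(0)
true_means = {"a": 0.2, "b": 0.5, "c": 0.8}


def bernoulli_pull(arm):
    """Draw one 0/1 reward from the (made-up) arm distribution."""
    return 1.0 if rng.random() < true_means[arm] else 0.0


promote, defer, discard = ternary_step(["a", "b", "c"], bernoulli_pull, n_pulls=200)
print("promote:", promote, "defer:", defer, "discard:", discard)
```

In this sketch the promoted arm would receive more samples at the next level, the deferred arm would wait for a later comparison at the same level, and the discarded arm would not be sampled again.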

Paper Structure

This paper contains 45 sections, 21 theorems, 127 equations, 5 figures, 2 tables, and 5 algorithms.

Key Result

Theorem 4.3

Under the data-poor condition for $\epsilon$, the B3 algorithm satisfies an upper bound on the probability of misidentifying an $\epsilon$-best arm, stated in terms of $N_{\epsilon/2}$, the number of $\epsilon/2$-best arms.
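
The decomposition mentioned in the TL;DR can be read as a union-style bound. The notation below is assumed for illustration and is not taken from the paper: $\hat{a}$ is the returned arm, $A_{\epsilon}$ is the set of $\epsilon$-best arms, and $C$ is the candidate set of Definition 4.1. Using $A_{\epsilon/2}$ in the non-inclusion event is a guess, motivated only by $N_{\epsilon/2}$ appearing in the theorem.

```latex
\Pr\bigl(\hat{a} \notin A_{\epsilon}\bigr)
\;\le\;
\underbrace{\Pr\bigl(C \cap A_{\epsilon/2} = \emptyset\bigr)}_{\text{non-inclusion}}
\;+\;
\underbrace{\Pr\bigl(\hat{a} \notin A_{\epsilon},\; C \cap A_{\epsilon/2} \neq \emptyset\bigr)}_{\text{within-set misidentification}}
```

The inequality holds for any pair of events, since $\Pr(E) = \Pr(E \cap F) + \Pr(E \cap F^{c}) \le \Pr(F) + \Pr(E \cap F^{c})$; the paper's contribution lies in bounding each term, which this sketch does not attempt.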

Figures (5)

  • Figure 1: Toy example of remedian estimation: partition the data into three blocks and take their within-block medians $(2.8,\,5.3,\,4.8)$; taking the median of these medians yields $4.8$. Repeating this hierarchical “median-of-medians” construction produces an estimator that converges in probability to the population median.
  • Figure 2: Illustration of ARRANGE_BOX($l,j;D$) when $\hat{\mu}_{i_1} > \hat{\mu}_{i_2} > \hat{\mu}_{i_3}$. The DISCARD operation is omitted for clarity.
  • Figure 3: Fraction of arms that are lifted, shifted, and discarded at a fixed level $l$.
  • Figure 4: Simulation results on the NYCCC 893 dataset under three reward noise regimes. Curves indicate the mean performance, and shaded regions denote the 25%–75% quantile range.
  • Figure 5: Simulation results on the NYCCC 893 dataset under different reward distributions. Curves indicate the mean performance, and shaded regions denote the 25%–75% quantile range.
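
The median-of-medians construction in the Figure 1 caption can be sketched in a few lines of Python. This is a simplified batch version under stated assumptions: the function name `remedian`, the block size of three, and the nine data values are hypothetical. Only the three block medians $(2.8,\,5.3,\,4.8)$ and the final value $4.8$ come from the caption; the raw data behind the figure is not given.

```python
from statistics import median


def remedian(values, base=3):
    """Median-of-medians over blocks of size `base`, repeated until one value remains.

    Assumes len(values) is a power of `base`, so every block is full.
    """
    level = list(values)
    if not level:
        raise ValueError("need at least one value")
    while len(level) > 1:
        if len(level) % base != 0:
            raise ValueError("length must be a power of base")
        # Replace each block of `base` consecutive values by its median.
        level = [median(level[i:i + base]) for i in range(0, len(level), base)]
    return level[0]


# Made-up data, chosen so the block medians match the caption.
data = [1.2, 2.8, 3.1,
        6.0, 5.3, 4.9,
        4.8, 2.0, 7.5]
block_medians = [median(data[i:i + 3]) for i in range(0, len(data), 3)]
result = remedian(data)
print(block_medians, result)
```

With more data the same function applies the block-median step repeatedly, which is the hierarchical construction the caption refers to.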

Theorems & Definitions (40)

  • Definition 4.1: Candidate Set $C$
  • Definition 4.2: Data-Poor Condition
  • Theorem 4.3
  • Corollary 4.4: Simple Regret
  • Corollary 4.5: ($\epsilon, \delta$)-Sample Complexity
  • Remark 4.6
  • Theorem 4.7
  • Proposition 4.8
  • Corollary 4.9
  • Theorem 4.10
  • ...and 30 more