Box Thirding: Anytime Best Arm Identification under Insufficient Sampling
Seohwa Hwang, Junyong Park
TL;DR
The paper addresses Best Arm Identification (BAI) when the budget is fixed or unknown, focusing on the data-poor regime in which not every arm can be evaluated. It introduces Box Thirding (B3), a fully anytime algorithm that organizes arms in a hierarchy of boxes and applies iterative ternary comparisons: strong candidates are promoted, uncertain ones are deferred, and weak ones are discarded, with past samples reused to refine later decisions. The authors decompose the misidentification probability into a non-inclusion term and a within-set misidentification term, and derive bounds that account for the data-poor condition. They show that B3 matches or improves upon existing anytime BAI methods by maximizing the number of arms screened while keeping the error decay fast. Experiments on the New Yorker Cartoon Caption Contest (NYCCC) dataset show that B3 performs robustly in high-noise, moderate-noise, and deterministic settings, which supports its use for large-scale, budget-constrained BAI problems. Overall, the work gives theoretical and empirical support for an algorithm that balances screening and discrimination without knowing the budget. Possible extensions to sample reuse and to non-data-poor scenarios could further narrow the gap to fixed-budget methods.
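As a schematic reading of that decomposition (the notation here is an assumption, not the paper's): let $a^\star$ be the best arm, $S$ the random set of arms the algorithm ever screens, and $\hat{a} \in S$ the arm it returns. Since $a^\star \notin S$ forces $\hat{a} \neq a^\star$, the error splits into a non-inclusion term and a within-set term. The paper's actual statement is for $\varepsilon$-best arms and may differ in form.

```latex
\Pr\bigl(\hat{a} \neq a^\star\bigr)
  = \underbrace{\Pr\bigl(a^\star \notin S\bigr)}_{\text{non-inclusion}}
  + \underbrace{\Pr\bigl(\hat{a} \neq a^\star,\; a^\star \in S\bigr)}_{\text{within-set misidentification}}
```

Screening more arms shrinks the first term, while spending more samples per screened arm shrinks the second; under a limited budget the two pull against each other.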
Abstract
We introduce Box Thirding (B3), a flexible and efficient algorithm for Best Arm Identification (BAI) under fixed-budget constraints. It is designed both for anytime BAI and for settings with large N, where the number of arms is too large for exhaustive evaluation within a limited budget T. The algorithm employs an iterative ternary comparison: in each iteration three arms are compared, the best-performing arm is explored further, the median is deferred for future comparisons, and the weakest is discarded. Even without prior knowledge of T, B3 achieves an epsilon-best arm misidentification probability comparable to that of Successive Halving (SH), which requires T as a predefined parameter, when SH is applied to a randomly selected subset of c0 arms that fits within the budget. On the New Yorker Cartoon Caption Contest dataset, B3 outperforms existing methods in terms of simple regret under limited-budget constraints.
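The ternary comparison described in the abstract can be illustrated with a minimal Python sketch of a single step. This is a simplified illustration, not the authors' implementation: the function names `pull` and `ternary_compare`, the Gaussian noise model, and the equal per-arm sample count are all assumptions, and the box hierarchy and sample reuse of the full algorithm are omitted.

```python
import random


def pull(mu, rng, noise=1.0):
    """Draw one reward from an arm with true mean mu (Gaussian noise is an assumption)."""
    return rng.gauss(mu, noise)


def ternary_compare(arm_means, samples_per_arm, rng, noise=1.0):
    """Sample three arms equally and rank them by empirical mean.

    Returns (best, median, worst) as indices into arm_means. In the scheme
    sketched by the abstract, the best arm would be explored further, the
    median deferred, and the worst discarded.
    """
    if len(arm_means) != 3:
        raise ValueError("a ternary comparison needs exactly three arms")
    empirical = []
    for mu in arm_means:
        rewards = [pull(mu, rng, noise) for _ in range(samples_per_arm)]
        empirical.append(sum(rewards) / samples_per_arm)
    # Sort indices by ascending empirical mean: worst first, best last.
    worst, median, best = sorted(range(3), key=lambda i: empirical[i])
    return best, median, worst


if __name__ == "__main__":
    rng = random.Random(0)
    best, median, worst = ternary_compare([0.1, 0.9, 0.5], samples_per_arm=20, rng=rng)
    print(f"promote arm {best}, defer arm {median}, discard arm {worst}")
```

With noisy rewards the ranking can be wrong, which is why deferring the median rather than discarding it leaves room to correct an unlucky comparison later.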
