Breaking the $\log(1/\Delta_2)$ Barrier: Better Batched Best Arm Identification with Adaptive Grids
Tianyuan Jin, Qin Zhang, Dongruo Zhou
TL;DR
The paper studies batched best-arm identification in multi-armed and linear bandits with adaptively chosen batch sizes, and introduces instance-sensitive measures that break the traditional $\log(1/\Delta_2)$ batch barrier. It defines an instance-dependent batch complexity $R_I$ and presents IS-SE for multi-armed bandits (MAB), which achieves near-optimal sample complexity $\tilde{O}(H_I)$ with batch complexity $O(R_I)$. The framework extends to linear bandits via IS-RAGE, which uses the quantities $\psi^*$ and $\rho$ to capture the problem structure. The theoretical guarantees establish correctness with high probability and give explicit bounds on batch and sample complexity, often improving over fixed-grid baselines. Experiments on synthetic data and MovieLens show substantial gains in batch efficiency, which matters in batch-constrained online learning settings.
Abstract
We investigate the problem of batched best arm identification in multi-armed bandits, where we aim to identify the best arm from a set of $n$ arms while minimizing both the number of samples and batches. We introduce an algorithm that achieves near-optimal sample complexity and features an instance-sensitive batch complexity, which breaks the $\log(1/\Delta_2)$ barrier. The main contribution of our algorithm is a novel sample allocation scheme that effectively balances exploration and exploitation for batch sizes. Experimental results indicate that our approach is more batch-efficient across various setups. We also extend this framework to the problem of batched best arm identification in linear bandits and achieve similar improvements.
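To make the $\log(1/\Delta_2)$ barrier concrete, the following is a minimal sketch of a generic fixed-grid batched successive elimination baseline. It is not the paper's IS-SE algorithm, whose sample allocation scheme is not specified in this summary. The function names (`batched_successive_elimination`, `bernoulli_pull`), the halving accuracy schedule $\epsilon_r = 2^{-r}$, and the Hoeffding-based batch size are all assumptions chosen for illustration, with rewards assumed to lie in $[0, 1]$.

```python
import math
import numpy as np


def batched_successive_elimination(pull, n_arms, delta=0.05, max_batches=30):
    """Fixed-grid baseline (illustrative, NOT the paper's IS-SE).

    pull(arm, n) must return n rewards in [0, 1] for the given arm.
    Returns (surviving_arms, batches_used, total_samples).
    """
    active = list(range(n_arms))
    batches = 0
    total_samples = 0
    for r in range(1, max_batches + 1):
        if len(active) == 1:
            break
        # Target accuracy halves every batch: this is the fixed grid.
        eps = 2.0 ** (-r)
        # Hoeffding: n_r samples give |mean - mu| <= eps for every arm and
        # every batch simultaneously with probability at least 1 - delta.
        n_r = math.ceil(math.log(4 * n_arms * r * r / delta) / (2 * eps ** 2))
        # All pulls of one batch are issued together, using only the
        # information available before the batch started.
        means = {a: float(np.mean(pull(a, n_r))) for a in active}
        batches += 1
        total_samples += n_r * len(active)
        # Drop arms whose empirical mean trails the leader by more than 2*eps.
        best = max(means.values())
        active = [a for a in active if best - means[a] <= 2 * eps]
    return active, batches, total_samples


def bernoulli_pull(mus, seed=0):
    """Hypothetical helper: Bernoulli arms with means mus."""
    rng = np.random.default_rng(seed)

    def pull(arm, n):
        return rng.binomial(1, mus[arm], size=n)

    return pull


if __name__ == "__main__":
    mus = [0.9, 0.8, 0.5]
    arms, batches, samples = batched_successive_elimination(
        bernoulli_pull(mus), n_arms=len(mus)
    )
    print(arms, batches, samples)
```

Under this schedule an arm with gap $\Delta$ can only be eliminated once $\epsilon_r$ falls below roughly $\Delta/2$, so the number of batches grows on the order of $\log_2(1/\Delta_2)$ regardless of how the remaining gaps are distributed. That dependence is what the instance-sensitive batch complexity described above is designed to avoid.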
