Breaking the $\log(1/Δ_2)$ Barrier: Better Batched Best Arm Identification with Adaptive Grids

Tianyuan Jin, Qin Zhang, Dongruo Zhou

TL;DR

The paper addresses batched best-arm identification in multi-armed and linear bandits under adaptive batching, introducing instance-sensitive measures to surpass the traditional $\log(1/Δ_2)$ batch barrier. It defines the instance-dependent batch complexity $R_I$ and presents IS-SE for multi-armed bandits (MAB), which achieves near-optimal sample complexity $\tilde{O}(H_I)$ with batch complexity $O(R_I)$. It then extends the framework to linear bandits via IS-RAGE, using $\psi^*$ and $\rho$ to capture the problem structure. The theoretical guarantees establish correctness with high probability and give explicit bounds on batch and sample complexities, often improving over fixed-grid baselines. Experiments on synthetic data and MovieLens show substantial batch-efficiency gains, which matters in batch-constrained online learning settings.

Abstract

We investigate the problem of batched best arm identification in multi-armed bandits, where we aim to identify the best arm from a set of $n$ arms while minimizing both the number of samples and batches. We introduce an algorithm that achieves near-optimal sample complexity and features an instance-sensitive batch complexity, which breaks the $\log(1/Δ_2)$ barrier. The main contribution of our algorithm is a novel sample allocation scheme that effectively balances exploration and exploitation for batch sizes. Experimental results indicate that our approach is more batch-efficient across various setups. We also extend this framework to the problem of batched best arm identification in linear bandits and achieve similar improvements.
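To make the $\log(1/Δ_2)$ barrier concrete, here is a minimal sketch of the fixed-grid baseline (batched successive elimination), not of the paper's IS-SE algorithm. It assumes Bernoulli rewards, a confidence radius that halves every batch, and a Hoeffding-style batch size; the function name, the constant inside the logarithm, and the `max_batches` cap are illustrative choices that do not come from the paper.

```python
import math
import random


def batched_successive_elimination(means, delta=0.05, seed=0, max_batches=30):
    """Fixed-grid batched successive elimination (baseline sketch, NOT IS-SE).

    `means` are the true Bernoulli means of a hypothetical instance and are
    used only to simulate pulls. Returns (arm_index, batches_used, total_pulls).
    """
    rng = random.Random(seed)
    n = len(means)
    active = list(range(n))
    sums = [0.0] * n   # cumulative reward per arm
    pulls = [0] * n    # cumulative pulls per arm
    total = 0

    for r in range(1, max_batches + 1):
        eps = 2.0 ** (-r)  # fixed grid: the target radius halves each batch
        # Assumed Hoeffding-style budget so that every active arm has
        # confidence radius at most eps after this batch.
        target = math.ceil(math.log(4 * n * r * r / delta) / (2 * eps * eps))

        # One batch: all pulls are decided before any reward is observed.
        for a in active:
            extra = target - pulls[a]
            for _ in range(extra):
                sums[a] += 1.0 if rng.random() < means[a] else 0.0
            pulls[a] = target
            total += extra

        # Eliminate arms whose estimate is more than 2 * eps below the leader.
        est = {a: sums[a] / pulls[a] for a in active}
        leader = max(est.values())
        active = [a for a in active if est[a] >= leader - 2 * eps]

        if len(active) == 1:
            return active[0], r, total

    # Batch cap reached: fall back to the empirically best surviving arm.
    best = max(active, key=lambda a: sums[a] / pulls[a])
    return best, max_batches, total


if __name__ == "__main__":
    for instance in ([0.9, 0.5, 0.4], [0.55, 0.5, 0.4]):
        arm, batches, samples = batched_successive_elimination(instance)
        print(instance, "-> arm", arm, "batches", batches, "samples", samples)
```

Because the radius only halves per batch, the last suboptimal arm cannot be removed until the radius falls below a constant fraction of the smallest gap, so the batch count on this grid grows roughly like $\log(1/Δ_2)$ regardless of how the other gaps are distributed. An instance-sensitive grid is meant to avoid exactly this dependence.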

Paper Structure

This paper contains 23 sections, 7 theorems, 63 equations, 3 figures, 2 algorithms.

Key Result

Theorem 3.1

Select $\beta_{\text{conf}} = 5\sqrt{2}$, $\beta_{\text{sample}} = 25/9$ and $\beta_{\text{grid}} = 4$. With probability $1-\delta$, Algorithm \ref{alg:mots-1} satisfies the following conditions:

Figures (3)

  • Figure 1: Visualization of the three examples. Detailed descriptions of the three examples are given in \ref{example:1} to \ref{example:3}. Each suboptimal arm is represented by a disk (with slight shifts to avoid overlapping), while the best arm is represented by a square. $L^{\tt SE}_i$ represents the average number of arm pulls by successive elimination, and $\bar{L}_i$ represents the average number of arm pulls by $\text{IS-SE}$. $U_i$ represents the set of arms eliminated in the $i$-th batch. In all three examples, successive elimination needs $\Theta(\log n)$ batches. In the first two examples, however, $\text{IS-SE}$ needs only $O(1)$ batches. In the third example, $\text{IS-SE}$ has the same batch complexity as successive elimination.
  • Figure 2: Sample complexity vs. number of batches.
  • Figure 3: SE vs. IS-SE on MovieLens 25M.

Theorems & Definitions (17)

  • Definition 1.1: Instance-sensitive batch complexity of MAB
  • Theorem 3.1
  • Remark 3.2
  • Remark 3.3
  • Remark 3.4
  • Definition 4.1
  • Definition 4.2: allen2021near, fiez2019sequential
  • Definition 4.3: soare2014best, fiez2019sequential
  • Lemma 4.4
  • Remark 4.5
  • ...and 7 more