Batched Stochastic Bandit for Nondegenerate Functions

Yu Liu; Yunlu Shu; Tianyu Wang

TL;DR

This work studies batched stochastic bandits for the broad class of nondegenerate functions on compact doubling metric spaces. It introduces the Geometric Narrowing (GN) algorithm, which progressively narrows the search region and achieves near-optimal regret with only $\mathcal{O}(\log \log T)$ communication rounds, plus a matching information-theoretic lower bound. The analysis leverages a Rounded Radius (RR) sequence to support both adaptive and static batching, and employs a bitten-apple construction to establish exponential-in-dimension lower bounds under communication constraints. The results bridge stochastic zeroth-order optimization and Riemannian settings, providing both practical batched-bandit strategies and fundamental limits for high-dimensional nondegenerate landscapes.

Abstract

This paper studies batched bandit learning problems for nondegenerate functions. We introduce an algorithm, called Geometric Narrowing (GN), that solves the batched bandit problem for nondegenerate functions near-optimally: its regret bound is of order $\widetilde{\mathcal{O}} ( A_{+}^d \sqrt{T} )$, and it needs only $\mathcal{O} (\log \log T)$ batches to achieve this regret. We also provide a lower bound analysis for this problem. More specifically, we prove that over some (compact) doubling metric space of doubling dimension $d$: 1. For any policy $\pi$, there exists a problem instance on which $\pi$ admits a regret of order $\Omega ( A_-^d \sqrt{T})$; 2. No policy can achieve a regret of order $ A_-^d \sqrt{T} $ over all problem instances using fewer than $ \Omega( \log \log T ) $ rounds of communication. Our lower bound analysis shows that the GN algorithm achieves near-optimal regret with a minimal number of batches.
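To give a sense of how $\mathcal{O}(\log \log T)$ batches can cover a horizon of $T$ rounds, the sketch below builds a doubly exponential grid of batch end times, $t_j = \lceil T^{1 - 2^{-j}} \rceil$, a schedule that is common in the batched bandit literature. It is an illustration under that assumption, not the paper's Rounded Radius sequence, and the function name `doubly_exponential_grid` is hypothetical.

```python
import math


def doubly_exponential_grid(T):
    """Return increasing batch end times t_j = ceil(T^(1 - 2^-j)), ending at T.

    Assumption: this is a generic batched-bandit schedule used for
    illustration; it is not the Rounded Radius sequence from the paper.
    """
    if T < 2:
        return [T]
    # About log2(log2(T)) grid points are enough for t_M to reach T / 2.
    M = max(1, math.ceil(math.log2(math.log2(T))))
    grid = []
    for j in range(1, M + 1):
        t = min(T, math.ceil(T ** (1.0 - 2.0 ** (-j))))
        # Keep the grid strictly increasing (small T can produce repeats).
        if not grid or t > grid[-1]:
            grid.append(t)
    # Close the schedule so that the last batch ends exactly at the horizon.
    if grid[-1] < T:
        grid.append(T)
    return grid
```

For a horizon of one million rounds this produces at most six batch end points, which is why a learner that only updates its policy at these times communicates so rarely.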

Paper Structure (19 sections, 19 theorems, 116 equations, 10 figures, 1 algorithm)

Introduction
Nondegenerate Functions
The Batched Bandit Setting
Our Results
Implications of Our Results
Challenges and Our Approach
Related Works
Additional related works from stochastic zeroth-order Riemannian optimization
Preliminaries
The Geometric Narrowing Algorithm
Analysis of the GN Algorithm
Lower Bound Analysis
The instances
The information-theoretical argument
Lower bound for nondegenerate bandits without communication constraints
...and 4 more sections

Key Result

Theorem 1

Let $( \mathcal{X} , \mathcal{D} )$ be a compact doubling metric space, and let $f$ be a nondegenerate function defined over $( \mathcal{X} ,\mathcal{D})$. Consider a stochastic bandit learning environment where all loss samples are corrupted by i.i.d. sub-Gaussian mean-zero noise. For any $T \in \mathbb{N}_+$, with prob where $d$ is the doubling dimension of $( \mathcal{X} , \mathcal{D})$, and $K_+$ and $A_+$ are cons

Figures (10)

Figure 1: Plot of $f( \mathbf{x} )$ defined in the paper's equation labeled eq:exp. $\frac{ \mathbf{x} ^2}{2}$ (resp. $2 \mathbf{x} ^2$) is a lower bound (resp. upper bound) for $f ( \mathbf{x} )$ over $[-2,2]$. This plot shows that a nondegenerate function can be nonconvex, nonsmooth, or discontinuous.
Figure 2: Plot of a nondegenerate function $f$ defined over the unit circle $\mathbb{S}^1$, where the metric is the arc length along the circle. This function is neither convex nor continuous, but it satisfies the nondegenerate condition.
Figure 3: Explanation of the instance for nondegenerate bandits
Figure 4: Illustration of the execution procedure of the GN algorithm over an interval. The function values at $\mathbf{x} _1$ and $\mathbf{x} _2$ jointly narrow down the range of $\mathbf{x} ^*$. For the function values at $\mathbf{x} _1$ and $\mathbf{x} _2$ to fall between the upper and lower bounds for the nondegenerate function, the minimum of the function has to reside in a certain range. In this figure, the solid lines show a pair of legitimate bounds, implying that the underlying function may take its minimum at $\mathbf{z} _1$; the dashed lines show a pair of bounds implying that the underlying function cannot take its minimum at $\mathbf{z} _2$, nor in a neighborhood of $\mathbf{z} _2$.
Figure 5: An example run of the GN algorithm. The surface shows the expected loss function, and the scattered points are loss samples over the current domain. These two plots describe the delete and split operations between adjacent batches of a GN run.
...and 5 more figures
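
The sandwich property described in the Figure 1 caption can be checked numerically. The sketch below is a minimal illustration, assuming a hypothetical piecewise function `f` (not the function plotted in the paper) with its minimum at $0$, and a hypothetical helper `is_sandwiched` that tests $\frac{x^2}{2} \le f(x) \le 2x^2$ on a uniform grid over $[-2,2]$.

```python
import math


def f(x):
    """Hypothetical discontinuous, nonconvex example; not the paper's function."""
    r = abs(x) ** 2
    # Jump between the lower and the upper envelope on cells of width 1/4.
    return 0.5 * r if math.floor(4 * abs(x)) % 2 == 0 else 2.0 * r


def is_sandwiched(func, lo=-2.0, hi=2.0, n=4001, c_low=0.5, c_up=2.0, q=2, tol=1e-12):
    """Check c_low*|x|^q <= func(x) <= c_up*|x|^q on a uniform grid over [lo, hi].

    Assumption: the minimizer is x = 0 with func(0) = 0, so |x| is the
    distance to the minimizer.
    """
    for i in range(n):
        x = lo + (hi - lo) * i / (n - 1)
        r = abs(x) ** q
        if not (c_low * r - tol <= func(x) <= c_up * r + tol):
            return False
    return True
```

The example function jumps at every multiple of $1/4$ and violates convexity (its value at $0.375$ exceeds the average of its values at $0.25$ and $0.5$), yet it stays between the two quadratic envelopes, which is the behavior the caption attributes to nondegenerate functions.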

Theorems & Definitions (48)

Remark 1
Theorem 1
Corollary 1
Theorem 2
Theorem 3
Corollary 2
Remark 2: Curse of dimensionality
Definition 1: Doubling metric space
Proposition 1
Definition 2: Nondegenerate functions
...and 38 more
