Batched Kernelized Bandits: Refinements and Extensions

Chenkai Ma; Keqin Chen; Jonathan Scarlett

Abstract

In this paper, we consider the problem of black-box optimization with noisy feedback revealed in batches, where the unknown function to optimize has a bounded norm in some Reproducing Kernel Hilbert Space (RKHS). We refer to this as the Batched Kernelized Bandits problem, and we refine and extend existing results on regret bounds. For algorithmic upper bounds, Li and Scarlett (2022) show that $B=O(\log\log T)$ batches suffice to attain near-optimal regret, where $T$ is the time horizon and $B$ is the number of batches. We refine this result by (i) finding the optimal number of batches including constant factors (to within $1+o(1)$), and (ii) removing a factor of $B$ in the regret bound. For algorithm-independent lower bounds, we observe that existing results apply only when the batch sizes are fixed in advance. We therefore present novel lower bounds for the case where the batch sizes are chosen adaptively, and show that adaptive batches have essentially the same minimax regret scaling as fixed batches. Furthermore, we consider a robust setting where the goal is to choose points for which the function value remains high even after an adversarial perturbation. We present the robust-BPE algorithm, show that a suitably defined cumulative regret notion incurs the same bound as in the non-robust setting, and derive a simple regret bound significantly below that of previous work.
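To give some intuition for why $O(\log\log T)$ batches can cover a horizon of length $T$, the following is a minimal sketch of one geometric-style schedule in which each batch has size roughly $\sqrt{T \cdot N_{i-1}}$. This schedule, the function name `batch_sizes`, and the truncation of the final batch are assumptions chosen for illustration; the abstract does not specify the batch sizes used in the paper, and they may differ.

```python
import math


def batch_sizes(T):
    """Hypothetical schedule: N_i = ceil(sqrt(T * N_{i-1})) with N_0 = 1.

    The last batch is truncated so that the sizes sum to exactly T.
    This is an illustrative assumption, not the schedule from the paper.
    """
    sizes = []
    prev = 1   # N_0
    used = 0   # number of rounds already assigned to batches
    while used < T:
        # Exact integer ceil(sqrt(T * prev)), avoiding floating-point error.
        n = math.isqrt(T * prev - 1) + 1
        n = min(n, T - used)  # do not exceed the time horizon
        sizes.append(n)
        used += n
        prev = n
    return sizes


if __name__ == "__main__":
    for T in (100, 10**4, 10**6):
        sizes = batch_sizes(T)
        print(T, len(sizes), sizes)
```

Under this assumed schedule, the $i$-th batch has size at least $T^{1-2^{-i}}$, so the exponent gap to $T$ halves with each batch and the number of batches grows like $\log_2\log_2 T$ rather than $\log T$.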

Paper Structure (31 sections, 24 theorems, 96 equations, 2 figures, 3 tables, 2 algorithms)

Introduction
Related Work
Problem Setup
Standard Setting
Robust Setting
Standard Setting: Upper Bounds
Preliminaries and Existing Results
Refinements
Standard Setting: Lower Bounds
Existing Results for Fixed Batches
Extension to Adaptive Batches
Robust Setting
The Robust-BPE Algorithm
Regret Analysis for Robust-BPE
Conclusion
...and 16 more sections

Key Result

Lemma 1

(Li and Scarlett, 2022) Under the setup of the Standard Setting section and using the batch sizes of Li and Scarlett (2022), for any $\delta \in (0,1)$, the BPE algorithm attains, with probability at least $1-\delta$, a regret bound expressed in terms of $\Lambda=\Psi+\sqrt{\log(|\mathcal{X}|/\delta)}$.

Figures (2)

Figure 1: Illustration of a class of hard-to-distinguish functions $\mathcal{F}$, where any $x\in\mathcal{X}$ can be $\epsilon$-optimal for at most one bump function. This is an "idealized" illustration, with the actual functions used having infinite support but steady decay to zero.
Figure 2: Synthetic functions sampled using different kernels.

Theorems & Definitions (51)

Lemma 1
Remark 1
Theorem 1
Remark 2
Remark 3
Remark 4
Lemma 2
Theorem 2
Remark 5
Remark 6
...and 41 more