Regret Bounds for Noise-Free Cascaded Kernelized Bandits

Zihan Li, Jonathan Scarlett

TL;DR

This work analyzes noise-free grey-box optimization of cascaded kernelized networks with chain, multi-output chain, and feed-forward architectures. It introduces GPN-UCB, a structure-aware sequential UCB algorithm, and a non-adaptive grid-based sampling method. Regret is characterized via Σ_T, a quantity aggregating layer-wise posterior uncertainty; the resulting upper bounds scale with the network length and Lipschitz constants and match algorithm-independent lower bounds in many regimes. The authors derive explicit bounds for the three network types, relate Σ_T and Σ^Γ_T to information-theoretic gains, and give Matérn-kernel-specific rates, including regimes that are near-optimal under a key conjecture. The results improve the parameter dependencies of prior cascade methods and suggest pathways toward noisy extensions, with experimental illustrations supporting the theory.

Abstract

We consider optimizing a function network in the noise-free grey-box setting with RKHS function classes, where the exact intermediate results are observable. We assume that the structure of the network is known (but not the underlying functions comprising it), and we study three types of structures: (1) chain: a cascade of scalar-valued functions, (2) multi-output chain: a cascade of vector-valued functions, and (3) feed-forward network: a fully connected feed-forward network of scalar-valued functions. We propose a sequential upper confidence bound based algorithm GPN-UCB along with a general theoretical upper bound on the cumulative regret. In addition, we propose a non-adaptive sampling based method along with its theoretical upper bound on the simple regret for the Matérn kernel. We also provide algorithm-independent lower bounds on the simple regret and cumulative regret. Our regret bounds for GPN-UCB have the same dependence on the time horizon as the best known in the vanilla black-box setting, as well as near-optimal dependencies on other parameters (e.g., RKHS norm and network length).
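
To make the chain setting concrete, the following is a minimal sketch of grey-box evaluation, in which every intermediate output of the cascade is observed exactly. The three layer functions, the grid used to approximate the optimum, and the helper names `evaluate_chain` and `regrets` are illustrative assumptions and do not come from the paper. The simple regret here is taken over the best queried point, which is one common convention and may differ from the paper's definition.

```python
import math

# Hypothetical layers for illustration only. In the paper the layers are
# unknown RKHS functions, not these closed forms.
def f1(z):
    return math.sin(3.0 * z)

def f2(z):
    return z * z

def f3(z):
    return -abs(z - 0.5)

LAYERS = [f1, f2, f3]

def evaluate_chain(x, layers=LAYERS):
    """Evaluate g = f3(f2(f1(x))) and return every layer's exact output.

    The last entry is the network output; the earlier entries are the
    intermediate results that the grey-box setting makes observable.
    """
    outputs = []
    z = x
    for f in layers:
        z = f(z)
        outputs.append(z)
    return outputs

def regrets(queries, grid):
    """Return (cumulative regret, simple regret) of a query sequence.

    The optimum is approximated by the best value on `grid` (an assumption
    of this sketch), so the regrets are exact only relative to that grid.
    """
    def g(x):
        return evaluate_chain(x)[-1]

    g_star = max(g(x) for x in grid)
    instantaneous = [g_star - g(x) for x in queries]
    return sum(instantaneous), min(instantaneous)

grid = [i / 100 for i in range(101)]
cumulative, simple = regrets([0.0, 0.25, 0.5], grid)
print(evaluate_chain(0.2), cumulative, simple)
```

A black-box learner would see only the last entry of `evaluate_chain(x)`; the grey-box learner sees the whole list, which is what allows each layer to be modeled separately.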

Paper Structure (48 sections, 26 theorems, 149 equations, 13 figures, 2 tables, 2 algorithms)


Key Result

Lemma 1

(Kanagawa et al., 2018) For $f\in\mathcal{H}_k(B)$, let $\mu_t(\mathbf{x})$ and $\sigma_t(\mathbf{x})^2$ denote the posterior mean and variance based on $t$ points $(\mathbf{x}_1,\dots,\mathbf{x}_t)$ and their noise-free observations $(y_1,\dots,y_t)$, computed using the posterior mean and variance equations (eq:mean, eq:var). Then, it holds fo...
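
The posterior quantities named in the lemma can be sketched as follows. This assumes the standard noise-free Gaussian process formulas $\mu_t(\mathbf{x}) = \mathbf{k}_t(\mathbf{x})^\top \mathbf{K}_t^{-1}\mathbf{y}_t$ and $\sigma_t(\mathbf{x})^2 = k(\mathbf{x},\mathbf{x}) - \mathbf{k}_t(\mathbf{x})^\top \mathbf{K}_t^{-1}\mathbf{k}_t(\mathbf{x})$, which are presumed to be what eq:mean and eq:var refer to. The squared-exponential kernel, the lengthscale, the small diagonal jitter, and the function names are assumptions made for illustration, and scalar inputs are used for brevity.

```python
import math

def rbf(a, b, lengthscale=0.5):
    """Squared-exponential kernel on scalars (an illustrative choice)."""
    return math.exp(-((a - b) ** 2) / (2.0 * lengthscale ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= factor * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def posterior(x, xs, ys, kernel=rbf, jitter=1e-10):
    """Noise-free posterior mean and variance at x given data (xs, ys).

    `jitter` is added to the kernel matrix diagonal purely for numerical
    stability; it is not part of the noise-free model.
    """
    K = [[kernel(a, b) + (jitter if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    k_x = [kernel(a, x) for a in xs]
    w = solve(K, k_x)  # w = K^{-1} k_t(x)
    mean = sum(wi * yi for wi, yi in zip(w, ys))
    var = kernel(x, x) - sum(wi * ki for wi, ki in zip(w, k_x))
    return mean, max(var, 0.0)  # clip tiny negative round-off

xs = [0.0, 0.5, 1.0]
ys = [math.sin(3.0 * v) for v in xs]
print(posterior(0.5, xs, ys))   # at a sampled point
print(posterior(0.25, xs, ys))  # between sampled points
```

At a sampled point the posterior mean reproduces the observation and the variance collapses to roughly zero, which is the behavior that distinguishes the noise-free setting from the noisy one.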

Figures (13)

  • Figure 1: A function network $g$ of $m$ layers with input ${\mathbf{x}}$ and output $y$.
  • Figure 2: A chain with $m=3$, where $\{f^{(i)}\}_{i=1}^3$ are scalar-valued functions.
  • Figure 5: Extended domain for multi-output chains.
  • Figure 7: Illustration of $\overline{g}$ for multi-output chain.
  • Figure 8: Illustration of $\overline{g}$ for feed-forward network.
  • ...and 8 more figures

Theorems & Definitions (46)

  • Lemma 1
  • Lemma 2
  • Theorem 1: GPN-UCB for chains
  • Theorem 2: GPN-UCB for multi-output chains
  • Remark 1
  • Theorem 3: GPN-UCB for feed-forward networks
  • Theorem 4: Non-adaptive sampling method for chains
  • Remark 2
  • Theorem 5: Lower bound on simple regret
  • Remark 3
  • ...and 36 more