A Jointly Efficient and Optimal Algorithm for Heteroskedastic Generalized Linear Bandits with Adversarial Corruptions

Sanghwa Kim; Junghyun Lee; Se-Young Yun

TL;DR

HCW-GLB-OMD is proposed, combining an online mirror descent (OMD)-based estimator with Hessian-based confidence weights to achieve corruption robustness. It is complemented by a lower bound of $\tilde{\Omega}\left( d \sqrt{\sum_t g(\tau_t) \dot{\mu}_{t,\star}} + d C \right)$, unifying previous problem-specific lower bounds.

Abstract

We consider the problem of heteroskedastic generalized linear bandits (GLBs) with adversarial corruptions, which subsumes various stochastic contextual bandit settings, including heteroskedastic linear bandits and logistic/Poisson bandits. We propose HCW-GLB-OMD, which consists of two components: an online mirror descent (OMD)-based estimator and Hessian-based confidence weights to achieve corruption robustness. The algorithm is computationally efficient in that it requires only $O(1)$ space and time per iteration. Under the self-concordance assumption on the link function, we show a regret bound of $\tilde{O}\left( d \sqrt{\sum_t g(\tau_t) \dot{\mu}_{t,\star}} + d^2 g_{\max} \kappa + d \kappa C \right)$, where $\dot{\mu}_{t,\star}$ is the slope of $\mu$ around the optimal arm at time $t$, the $g(\tau_t)$ are potentially exogenously time-varying dispersions (e.g., $g(\tau_t) = \sigma_t^2$ for heteroskedastic linear bandits, $g(\tau_t) = 1$ for Bernoulli and Poisson), $g_{\max} = \max_{t \in [T]} g(\tau_t)$ is the maximum dispersion, and $C \geq 0$ is the total corruption budget of the adversary. We complement this with a lower bound of $\tilde{\Omega}\left( d \sqrt{\sum_t g(\tau_t) \dot{\mu}_{t,\star}} + d C \right)$, unifying previous problem-specific lower bounds. Thus, our algorithm achieves, up to a $\kappa$-factor in the corruption term, instance-wise minimax optimality simultaneously across various instances of heteroskedastic GLBs with adversarial corruptions.
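To make the gap between the two bounds concrete, the following is a minimal sketch that evaluates the expressions inside $\tilde{O}(\cdot)$ and $\tilde{\Omega}(\cdot)$ with all constants and logarithmic factors dropped. The function name `leading_terms` and its argument names are hypothetical, and $\kappa$ is treated as a given input because the abstract does not define it; the numbers are illustrative only and are not results from the paper.

```python
import math
from typing import Sequence, Tuple


def leading_terms(
    d: int,
    dispersions: Sequence[float],  # g(tau_t) for t = 1..T
    slopes: Sequence[float],       # mu-dot_{t,*} for t = 1..T
    kappa: float,                  # assumed given; not defined in the abstract
    corruption: float,             # total corruption budget C >= 0
) -> Tuple[float, float]:
    """Return (upper, lower) expressions, ignoring constants and log factors."""
    if len(dispersions) != len(slopes):
        raise ValueError("dispersions and slopes must have the same length")
    # Shared instance-dependent term: d * sqrt(sum_t g(tau_t) * mu-dot_{t,*}).
    main = d * math.sqrt(sum(g * m for g, m in zip(dispersions, slopes)))
    g_max = max(dispersions)
    upper = main + d**2 * g_max * kappa + d * kappa * corruption
    lower = main + d * corruption
    return upper, lower


# Bernoulli-style instance: g(tau_t) = 1, slope 0.25 in every round, T = 100.
upper, lower = leading_terms(
    d=2, dispersions=[1.0] * 100, slopes=[0.25] * 100, kappa=4.0, corruption=3.0
)
print(upper, lower)
```

In this toy instance the two expressions share the term $d \sqrt{\sum_t g(\tau_t) \dot{\mu}_{t,\star}}$; the corruption terms differ by the factor $\kappa$ ($d \kappa C$ versus $d C$), and the upper bound additionally carries the term $d^2 g_{\max} \kappa$.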

Paper Structure (54 sections, 19 theorems, 120 equations, 1 table)

Introduction
Problem Setting
Generalized Linear Bandits (GLBs).
GLBs with Time-Varying Dispersions.
Adversarial Corruptions.
Contributions
Notation
Our Algorithm: HCW-GLB-OMD
Brief Description of the Algorithm
Confidence Sequence and Regret Upper Bound
Proof Sketch of Theorem 5: Corruption-Robust Confidence Sequence
Technical Challenges.
Proof Sketch.
A Unified Regret Lower Bound
Tightness.
...and 39 more sections

Key Result

Theorem 5

Let $\delta \in (0, 1)$. Set the step size to $\eta = 1 + R_s S$ and the regularization parameter to $\lambda = \max \left\{ 14 d \eta R_s^2,\ 36 \eta^2 \alpha^2 R_s^2 S^2 L_\mu^2,\ \frac{d}{4S^2} \right\}$. For each $t \in [T]$, define the confidence set as [...], where $\bm{\theta}_t$ is the OMD estimator (Eqn. (eq:OMD-estimator)) and the radius $\beta_t(\delta)$ is given by [...]. Then, we have $\mathbb{P}( \f
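As a worked illustration of the parameter choices stated in Theorem 5, here is a minimal sketch that computes $\eta$ and $\lambda$ from the formulas above. The function name `theorem5_parameters` is hypothetical, and the quantities $R_s$, $S$, $\alpha$, and $L_\mu$ are not defined in this excerpt, so the sketch simply treats them as given positive constants.

```python
from typing import Tuple


def theorem5_parameters(
    d: int, R_s: float, S: float, alpha: float, L_mu: float
) -> Tuple[float, float]:
    """Return (eta, lam) following the formulas quoted in Theorem 5.

    R_s, S, alpha and L_mu are problem constants assumed to be known;
    their definitions are not part of this excerpt.
    """
    if S <= 0:
        raise ValueError("S must be positive (it appears in a denominator)")
    # Step size: eta = 1 + R_s * S.
    eta = 1.0 + R_s * S
    # Regularization: the largest of the three candidate terms.
    lam = max(
        14.0 * d * eta * R_s**2,
        36.0 * eta**2 * alpha**2 * R_s**2 * S**2 * L_mu**2,
        d / (4.0 * S**2),
    )
    return eta, lam


# Illustrative values only, not taken from the paper.
eta, lam = theorem5_parameters(d=2, R_s=1.0, S=1.0, alpha=1.0, L_mu=0.25)
print(eta, lam)
```

With these illustrative inputs the first term, $14 d \eta R_s^2$, is the largest of the three; which term dominates in general depends on the problem constants.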

Theorems & Definitions (27)

Example 1: Heteroskedastic Gaussian Linear Bandits
Example 2: Logistic and Poisson Bandits
Remark 4: Different Adversaries
Theorem 5
Theorem 6
Remark 7: Necessity of Online Estimator
Theorem 8
Theorem 9
Lemma 10: Lemma 1 in zhang2025onepass
Lemma 11
...and 17 more
