Median Clipping for Zeroth-order Non-Smooth Convex Optimization and Multi-Armed Bandit Problem with Heavy-tailed Symmetric Noise

Nikita Kornilov; Yuriy Dorn; Aleksandr Lobanov; Nikolay Kutuzov; Innokentiy Shibaev; Eduard Gorbunov; Alexander Nazin; Alexander Gasnikov

TL;DR

The paper tackles non-smooth convex zeroth-order optimization under symmetric heavy-tailed noise, where the noise may even have unbounded expectation. It introduces a new zeroth-order oracle model and median-based gradient estimation with clipping, yielding high-probability convergence rates that match the best-known rates for the bounded-variance setting for any $\kappa > 0$. Two main algorithms are proposed: ZO-clipped-med-SSTM for unconstrained problems and ZO-clipped-med-SMD for constrained domains, both relying on mini-batched median estimates of gradient differences. The methods extend to stochastic multi-armed bandits with heavy-tailed rewards, where Clipped-INF-med-SMD achieves $\tilde{O}(\sqrt{dT})$ regret.
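To see why a median can help when $\kappa \le 1$, the sketch below compares the sample mean and the sample median of a fixed value observed under standard Cauchy noise, which is symmetric and has no finite expectation. The name `true_value`, the sample count, and the choice of Cauchy noise are illustrative assumptions, not settings taken from the paper.

```python
import numpy as np


def noisy_observations(true_value, n, rng):
    """Return n observations of true_value corrupted by standard Cauchy noise."""
    return true_value + rng.standard_cauchy(n)


rng = np.random.default_rng(0)
true_value = 3.0  # hypothetical quantity being measured (assumption)
samples = noisy_observations(true_value, 10_001, rng)

# The sample mean has no concentration guarantee: Cauchy noise has no finite mean.
mean_estimate = samples.mean()
# The sample median concentrates around true_value because the noise is symmetric.
median_estimate = np.median(samples)

print(f"mean:   {mean_estimate:.3f}")
print(f"median: {median_estimate:.3f}")
```

Rerunning with different seeds typically shows the mean jumping between runs while the median stays close to `true_value`.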

Abstract

In this paper, we consider non-smooth convex optimization with a zeroth-order oracle corrupted by symmetric stochastic noise. Unlike existing high-probability results, which require the noise to have a bounded $\kappa$-th moment with $\kappa \in (1,2]$, our results allow even heavier noise with any $\kappa > 0$; e.g., the noise distribution can have unbounded expectation. Our convergence rates match the best-known ones for the bounded-variance case: to achieve function accuracy $\varepsilon$, our methods with a Lipschitz oracle require $\tilde{O}(d^2\varepsilon^{-2})$ iterations for any $\kappa > 0$. We build the median gradient estimate with bounded second moment as the mini-batched median of the sampled gradient differences. We apply this technique to the stochastic multi-armed bandit problem with a heavy-tailed distribution of rewards and achieve $\tilde{O}(\sqrt{dT})$ regret. We demonstrate the performance of our zeroth-order and MAB algorithms for various $\kappa \in (0,2]$ on synthetic and real-world data. Our methods are competitive with state-of-the-art approaches and dramatically outperform them for $\kappa \leq 1$.
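The median-plus-clipping idea from the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: it takes the median of `2*m + 1` noisy two-point differences along one random direction, clips the resulting vector, and feeds it to plain SGD rather than SSTM or SMD. The helper names, the Cauchy noise model, the test function, and all parameter values are hypothetical.

```python
import numpy as np


def clip(g, lam):
    """Rescale g so that its Euclidean norm is at most lam."""
    norm = np.linalg.norm(g)
    return g if norm <= lam else g * (lam / norm)


def noisy_difference(f, x, tau, e, rng):
    """Hypothetical two-point oracle: f(x + tau*e) - f(x - tau*e) plus Cauchy noise."""
    return f(x + tau * e) - f(x - tau * e) + rng.standard_cauchy()


def median_grad_estimate(f, x, tau, m, rng):
    """Median of 2m+1 noisy two-point differences along one random direction."""
    d = x.size
    e = rng.standard_normal(d)
    e /= np.linalg.norm(e)  # uniformly distributed direction on the unit sphere
    diffs = [noisy_difference(f, x, tau, e, rng) for _ in range(2 * m + 1)]
    return (d / (2.0 * tau)) * np.median(diffs) * e


def zo_clipped_med_sgd(f, x0, steps, step_size, tau, m, lam, rng):
    """Plain SGD driven by the clipped median estimate (assumption: not SSTM/SMD)."""
    x = x0.copy()
    for _ in range(steps):
        g = clip(median_grad_estimate(f, x, tau, m, rng), lam)
        x = x - step_size * g
    return x


rng = np.random.default_rng(0)
f = lambda x: np.abs(x).sum()  # non-smooth convex test function (assumption)
x0 = np.full(5, 2.0)
x_final = zo_clipped_med_sgd(
    f, x0, steps=2000, step_size=0.01, tau=0.1, m=10, lam=10.0, rng=rng
)
print(f(x0), f(x_final))
```

The median is taken over scalar differences that share one direction `e`, so symmetric noise cancels in expectation before the result is scaled by `d / (2 * tau)`; clipping then bounds the size of any single step.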

Paper Structure (38 sections, 15 theorems, 102 equations, 7 figures, 1 table, 4 algorithms)


Introduction
Contributions
Preliminaries
Notations
Assumptions
Randomized smoothing
Clipping
Zeroth-order optimization with symmetric heavy-tailed noise
New zeroth-order noise concept and integration in median estimation
Zeroth-order two-point oracle
Symmetric heavy-tailed noise
Median estimation
Our ZO-clipped-med-SSTM for unconstrained problems
Discussion
Other classes of the optimized functions
...and 23 more sections

Key Result

Lemma 1

Consider a $\mu$-strongly convex and $M_2$-Lipschitz function $f$ on $Q + B_{2\tau}(0) \subseteq \mathbb{R}^d$ (see the paper's convexity and Lipschitz assumptions). For the smoothed function $\hat{f}_\tau$ defined in the paper's randomized smoothing section, the following properties hold true:
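For orientation, randomized smoothing over the Euclidean ball is commonly written as below, with $u$ drawn uniformly from the unit Euclidean ball. This is a sketch of the standard construction and the properties usually associated with it, assuming the paper follows the cited Theorem 2.1; the exact constants and domains should be checked against the paper's own statement of the lemma.

$$
\hat{f}_\tau(x) = \mathbb{E}_{u}\left[ f(x + \tau u) \right], \qquad
\sup_{x \in Q} \left| \hat{f}_\tau(x) - f(x) \right| \le \tau M_2, \qquad
\left\| \nabla \hat{f}_\tau(y) - \nabla \hat{f}_\tau(x) \right\|_2 \le \frac{\sqrt{d}\, M_2}{\tau} \left\| y - x \right\|_2 .
$$

Because $\hat{f}_\tau$ is an average of shifted copies of $f$, it also remains $\mu$-strongly convex and $M_2$-Lipschitz.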

Figures (7)

Figure 1: Average expected regret and mean probability of picking the optimal arm for $100$ experiments and $30000$ samples, with $0.95$ and $0.05$ percentiles for regret and ± std bounds for probabilities.
Figure 2: Profit coefficient of the strategies and asset distribution of Clipped-INF-med-SMD over 2023.
Figure 3: Convergence of our ZO-clipped-SSTM and ZO-clipped-med-SSTM, ZO-clipped-SGD, ZO-clipped-med-SGD over $15$ launches.
Figure 4: Convergence of our ZO-clipped-SSTM and ZO-clipped-med-SSTM, ZO-clipped-SGD, ZO-clipped-med-SGD in terms of a gap function w.r.t. the number of used samples from the dataset for different $\alpha = \kappa$ parameters (left-to-right and top-to-bottom: 0.75, 1.0, 1.25, 1.5).
Figure 5: Convergence of our ZO-clipped-SSTM and ZO-clipped-med-SSTM, ZO-clipped-SGD, ZO-clipped-med-SGD with added asymmetric Lévy noise, where the symmetric part has weight $0.9$ (left) and $0.5$ (right).
...and 2 more figures

Theorems & Definitions (21)

Lemma 1: Gasnikov et al. (2022), Theorem 2.1
Lemma 2: Median estimate's properties
Theorem 1: Convergence of ZO-clipped-med-SSTM
Remark 1: Smooth objective
Remark 2: Polyak–Łojasiewicz objective
Theorem 2: Convergence of ZO-clipped-med-SMD
Theorem 3: Convergence of Clipped-INF-med-SMD
Remark 3: Comparison with previous assumptions
Remark 4: Role of the scale function $B(x,y)$
Remark 5: Standard oracles examples
...and 11 more
