Online Stochastic Gradient Methods Under Sub-Weibull Noise and the Polyak-Łojasiewicz Condition

Seunghyun Kim, Liam Madden, Emiliano Dall'Anese

TL;DR

This work studies online gradient and proximal-gradient methods under stochastic gradient errors for time-varying objectives that satisfy the Polyak-Łojasiewicz (PL) or proximal-PL conditions. By modeling gradient errors with a sub-Weibull distribution, the authors derive iteration-wise regret bounds both in expectation and in high probability, showing that the instantaneous regret converges linearly up to an error term that scales with the problem drift and the noise variability. The online proximal-gradient analysis extends these results to composite objectives, yielding analogous bounds in expectation and in high probability. Numerical experiments on time-varying least-squares regression and a real-time demand-response task corroborate the theoretical findings and illustrate the practical impact of drift and sub-Weibull noise on online learning performance.

Abstract

This paper focuses on the online gradient and proximal-gradient methods with stochastic gradient errors. In particular, we examine the performance of the online gradient descent method when the cost satisfies the Polyak-Łojasiewicz (PL) inequality. We provide bounds in expectation and in high probability (that hold iteration-wise), with the latter derived by leveraging a sub-Weibull model for the errors affecting the gradient. The convergence results show that the instantaneous regret converges linearly up to an error that depends on the variability of the problem and the statistics of the sub-Weibull gradient error. Similar convergence results are then provided for the online proximal-gradient method, under the assumption that the composite cost satisfies the proximal-PL condition. In the case of static costs, we provide new bounds for the regret incurred by these methods when the gradient errors are modeled as sub-Weibull random variables. Illustrative simulations are provided to corroborate the technical findings.
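
The setting described in the abstract can be illustrated with a minimal sketch of inexact online gradient descent on a drifting least-squares cost, which satisfies the PL inequality. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `online_gd_least_squares`, the random-walk drift on `b`, the step size `1/L`, and the symmetrized Weibull noise (a Weibull variable with shape `a` is sub-Weibull with tail parameter `1/a`).

```python
import numpy as np


def online_gd_least_squares(T=200, n=5, m=20, drift=0.01,
                            noise_scale=0.05, weibull_shape=0.5, seed=0):
    """Inexact online gradient descent on f_t(x) = 0.5 * ||A x - b_t||^2.

    Hypothetical illustration: b_t follows a random walk (problem drift) and
    the gradient is corrupted by symmetrized Weibull noise.
    Returns the instantaneous regret f_t(x_t) - min_x f_t(x) for each t.
    """
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, n))
    # Smoothness constant of the cost: largest eigenvalue of A^T A.
    L = np.linalg.eigvalsh(A.T @ A).max()
    step = 1.0 / L

    def cost(z, b):
        return 0.5 * np.sum((A @ z - b) ** 2)

    x = 10.0 * np.ones(n)          # start far from the optimum
    b = rng.standard_normal(m)
    regrets = np.empty(T)
    for t in range(T):
        # Minimizer of the current cost, used only to evaluate the regret.
        x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
        regrets[t] = cost(x, b) - cost(x_star, b)

        # Exact gradient plus a heavy-tailed, zero-mean error.
        grad = A.T @ (A @ x - b)
        signs = rng.choice([-1.0, 1.0], size=n)
        err = noise_scale * signs * rng.weibull(weibull_shape, size=n)
        x = x - step * (grad + err)

        # The cost drifts before the next iteration.
        b = b + drift * rng.standard_normal(m)
    return regrets


if __name__ == "__main__":
    r = online_gd_least_squares()
    print("initial regret:", r[0])
    print("mean regret over last 50 steps:", r[-50:].mean())
```

In a run of this sketch the regret should drop quickly from its initial value and then fluctuate around a floor set by `drift` and `noise_scale`, which is the qualitative behavior the abstract describes; the sketch does not reproduce the paper's experiments or bounds.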

Paper Structure

This paper contains 11 sections, 7 theorems, 36 equations, 3 figures.

Key Result

Lemma II.1

(Closure of sub-Weibull class [bastianello2021stochastic]) Let $X_i \sim \mathrm{subW}(\theta_i, K_i)$, $i = 1,2$, based on Definition 3(ii).

Figures (3)

  • Figure 1: Inexact OGD: Evolution of average regret obtained experimentally, the empirical $3$-standard deviation confidence interval, and the theoretical bound.
  • Figure 2: Demand response application: non-controllable power ${\bf a}_w^\top {\bf w}_t$ and reference point $p_{0,t}^{\mathrm{ref}}$ for the active power $p_{0,t}$.
  • Figure 3: Demand response application: Evolution of average regret obtained experimentally; the zoomed area also provides the empirical $3\sigma$ confidence interval.

Theorems & Definitions (17)

  • Definition 1: Polyak-Łojasiewicz (PL) Inequality
  • Definition 2: Proximal-PL Condition
  • Definition 3: Sub-Weibull rv [vladimirova2020sub]
  • Lemma II.1
  • Lemma II.2
  • Lemma II.3
  • Lemma II.4: High probability bound
  • Theorem III.1: Convergence of the stochastic OGD
  • Corollary III.2: Asymptotic convergence
  • Remark 1: Static optimization [karimi2016linear]
  • ...and 7 more
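
For orientation, the first and third definitions in the list above are commonly stated as follows. These are the standard forms from the literature, written here for a generic differentiable cost $f$ with minimum value $f^*$; the symbols $\mu$, $f^*$, and $\|X\|_k$ are notation assumed for this sketch, and the paper's own statements (for time-varying costs) may differ in constants or indexing.

```latex
% PL inequality with parameter \mu > 0 (standard form):
\frac{1}{2}\,\|\nabla f(x)\|^2 \;\ge\; \mu \,\bigl(f(x) - f^*\bigr)
\quad \text{for all } x.

% Sub-Weibull random variable, moment characterization (standard form),
% written X \sim \mathrm{subW}(\theta, K) with \theta > 0, K > 0:
\|X\|_k := \bigl(\mathbb{E}\,|X|^k\bigr)^{1/k} \;\le\; K\,k^{\theta}
\quad \text{for all } k \ge 1.
```

Under this characterization, $\theta = 1/2$ corresponds to sub-Gaussian and $\theta = 1$ to sub-exponential random variables, while larger $\theta$ allows heavier tails.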