Revisiting Stochastic Gradient Descent for Strongly Convex Objectives: Tight Uniform-in-Time Bounds

Kang Chen, Yasong Feng, Tianyu Wang

Abstract

Stochastic optimization via Stochastic Gradient Descent (SGD) is fundamental to statistics and optimization. This paper revisits SGD for strongly convex objectives and establishes tight, uniform-in-time convergence bounds. We prove that, with probability at least $1 - \beta$, a convergence rate of order $\frac{\log \log k + \log (1/\beta)}{k}$ holds simultaneously for all $k \in \mathbb{N}_+$, and we demonstrate that this bound is tight up to constant factors. We also provide an improved last-iterate convergence rate for such objectives. Although the focus is on strongly convex objectives, our results generalize to Polyak-Łojasiewicz functions and indicate an $\mathcal{O}(k^{-1} \log \log k)$ convergence rate for contractive stochastic approximation with additive noise.
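To give a feel for the shape of the rate stated above, here is a minimal simulation sketch. It is an illustration under stated assumptions and does not reproduce the paper's algorithm or experiments. The objective $f(x) = \frac{\mu}{2} x^2$, the Gaussian gradient noise, the step size $\eta_k = 1/(\mu k)$, and the names `sgd_quadratic` and `envelope` are all hypothetical choices made here. The envelope omits the constants that the actual bound would carry.

```python
import math
import random


def sgd_quadratic(mu=1.0, sigma=1.0, x0=5.0, steps=10_000, seed=0):
    """Run SGD on f(x) = (mu / 2) * x**2 with additive Gaussian gradient noise.

    Returns the squared errors x_k**2 for k = 1..steps (the minimizer is x* = 0).
    """
    rng = random.Random(seed)
    x = x0
    sq_errors = []
    for k in range(1, steps + 1):
        noisy_grad = mu * x + rng.gauss(0.0, sigma)  # true gradient plus noise
        x -= noisy_grad / (mu * k)  # assumed step size eta_k = 1 / (mu * k)
        sq_errors.append(x * x)
    return sq_errors


def envelope(k, beta=0.05):
    """Shape of the rate in the abstract, (log log k + log(1/beta)) / k, without constants."""
    return (math.log(math.log(k)) + math.log(1.0 / beta)) / k


errors = sgd_quadratic()
# Largest ratio of squared error to the envelope over k >= 3, where log log k > 0.
worst_ratio = max(
    err / envelope(k) for k, err in enumerate(errors, start=1) if k >= 3
)
print(f"final squared error: {errors[-1]:.2e}, worst ratio: {worst_ratio:.2f}")
```

With this particular step size the iterate satisfies $k\,x_k = -\frac{1}{\mu}\sum_{i \le k} \xi_i$, so $x_k$ is a rescaled random walk. That makes the appearance of a $\log \log k$ factor in a bound holding for all $k$ at once plausible, by analogy with the law of the iterated logarithm.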

Paper Structure

This paper contains 4 sections, 2 theorems, 10 equations, 1 table.

Key Result

Proposition 1

By Theorem 2.6 in Wainwright (2019), up to constants, the conditional 1-sub-Gaussian condition in Assumption assump:subGaussian implies for any $\varphi$ that is $\mathcal{F}_k$-measurable.

Theorems & Definitions (6)

  • Remark 1
  • Remark 2
  • Proposition 1
  • Remark 3
  • Lemma 1
  • Proof