Revisiting Stochastic Gradient Descent for Strongly Convex Objectives: Tight Uniform-in-Time Bounds

Kang Chen; Yasong Feng; Tianyu Wang

Abstract

Stochastic optimization via Stochastic Gradient Descent (SGD) is a fundamental topic in statistics and optimization. This paper revisits SGD for strongly convex objectives and establishes tight, uniform-in-time convergence bounds. We prove that, with probability at least $1 - \beta$, a convergence rate of order $\frac{\log \log k + \log (1/\beta)}{k}$ holds simultaneously for all $k \in \mathbb{N}_+$, and we demonstrate that this bound is tight up to constant factors. We also provide an improved last-iterate convergence rate for such objectives. Although the analysis focuses on strongly convex objectives, our results generalize to Polyak-Łojasiewicz functions and indicate an $\mathcal{O}(k^{-1} \log \log k)$ convergence rate for contractive stochastic approximation with additive noise.
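
To make the shape of the uniform-in-time rate concrete, the following is a minimal illustrative sketch, not the paper's algorithm or experiment. It assumes a one-dimensional quadratic objective $f(x) = \frac{\mu}{2} x^2$, additive Gaussian gradient noise, and the step size $1/(\mu k)$; none of these choices is stated in the abstract. The names `sgd_quadratic` and `envelope` are hypothetical helpers introduced only for this illustration. Under these assumptions the iterate equals the negated running mean of the noise divided by $\mu$, so a $\log \log k$ factor over all $k$ is what the law of the iterated logarithm would suggest.

```python
import math
import random


def sgd_quadratic(num_steps, mu=1.0, noise_std=1.0, x0=5.0, seed=0):
    """Run SGD on f(x) = (mu / 2) * x**2 with additive Gaussian gradient noise.

    Assumed setup (not taken from the paper): step size 1 / (mu * k).
    Returns the squared errors x_k**2 for k = 1, ..., num_steps
    (the minimizer is x = 0).
    """
    rng = random.Random(seed)
    x = x0
    sq_errors = []
    for k in range(1, num_steps + 1):
        noisy_grad = mu * x + rng.gauss(0.0, noise_std)
        x -= noisy_grad / (mu * k)
        sq_errors.append(x * x)
    return sq_errors


def envelope(k, beta):
    """Shape of the stated rate, constants omitted; requires k >= 3."""
    return (math.log(math.log(k)) + math.log(1.0 / beta)) / k


errors = sgd_quadratic(10_000)

# Largest ratio of squared error to the envelope over the whole trajectory.
# A uniform-in-time bound says this supremum stays bounded (with high
# probability), rather than only the ratio at one fixed k.
ratio = max(
    err / envelope(k, beta=0.05)
    for k, err in enumerate(errors, start=1)
    if k >= 3
)
print(f"max over k of error / envelope: {ratio:.3f}")
```

Taking the maximum over the whole trajectory, rather than reading off the error at a single iteration, is what distinguishes a uniform-in-time check from a fixed-time one. The constants hidden in the rate depend on the problem, so the printed ratio is only meaningful as a bounded quantity, not as a specific value.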
