Nonconvex optimization and convergence of stochastic gradient descent, and solution of asynchronous game

Kevin Buck; Jessica Babyak; Paolo Piersanti; Kevin Zumbrun; Christiane Gallos; Dorothea Gallos

TL;DR

The work investigates convergence of stochastic gradient methods for both convex and nonconvex objectives, identifying step-size regimes, time-averaging schemes, and stochastic-coordinate variants that guarantee convergence to the critical set $\mathcal{C}$ under mild conditions. It develops a unified stochastic-approximation framework built on energy-type estimates, analyzes the deterministic and stochastic cases, and extends the results to approximate convexity, where $f$ behaves like a strongly convex function near the optimum. The authors connect SGD to continuous-time dynamics and Fokker–Planck equations, providing intuition and numerical schemes (e.g., Crank–Nicolson) for studying diffusion-like behavior and equilibrium distributions. A major new contribution is the application of these methods to two-player and asynchronous multiplayer games: smoothing the max via $\ell^p$ norms yields convergent SGD updates in nonconvex game settings and offers practical insights into smoothing, step-size design, and potential multigrid strategies for large-scale problems.
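The idea of smoothing the max via $\ell^p$ norms can be illustrated with a minimal sketch. It relies only on the standard inequality $\|a\|_\infty \le \|a\|_p \le n^{1/p}\|a\|_\infty$ for a vector $a$ of $n$ nonnegative entries, so the $\ell^p$ norm approaches the max as $p$ grows while remaining differentiable away from the origin. The function name `lp_smooth_max` and the sample `payoffs` are hypothetical, chosen for illustration; this is not the paper's game formulation.

```python
def lp_smooth_max(values, p):
    """l^p norm of nonnegative values, used as a smooth surrogate for max(values).

    Illustrative assumption: inputs are nonnegative, so that the l^p norm
    approximates the plain maximum rather than the maximum absolute value.
    """
    if any(v < 0 for v in values):
        raise ValueError("this sketch assumes nonnegative inputs")
    m = max(values)
    if m == 0:
        return 0.0
    # Factor out the largest entry so that large p does not underflow or overflow.
    return m * sum((v / m) ** p for v in values) ** (1.0 / p)


# Hypothetical payoff values, not taken from the paper.
payoffs = [0.2, 0.9, 0.5, 0.7]
for p in (2, 8, 32, 128):
    print(p, lp_smooth_max(payoffs, p))
```

The printed values decrease toward `max(payoffs)` as `p` increases. Larger `p` gives a tighter approximation but a less smooth surrogate, which is the trade-off behind choosing the smoothing parameter.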

Abstract

We review convergence and behavior of stochastic gradient descent for convex and nonconvex optimization, establishing various conditions for convergence to zero of the variance of the gradient of the objective function, and presenting a number of simple examples demonstrating the approximate evolution of the probability density under iteration, including applications to both classical two-player and asynchronous multiplayer games.
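As a toy illustration of the kind of behavior the abstract describes, the following sketch runs stochastic gradient descent on a one-dimensional nonconvex objective and also tracks a running time average of the iterates. Everything specific here is an assumption made for illustration and is not taken from the paper: the double-well objective $f(x) = x^4/4 - x^2/2$ (critical set $\{-1, 0, 1\}$), additive Gaussian gradient noise, and the step schedule $a_k = a_0/(k+1)^{0.75}$, which satisfies the classical conditions $\sum_k a_k = \infty$ and $\sum_k a_k^2 < \infty$.

```python
import random


def grad_f(x):
    # Gradient of the assumed double-well objective f(x) = x**4/4 - x**2/2.
    return x**3 - x


def sgd(x0, n_steps, a0=0.1, decay=0.75, sigma=0.5, seed=0):
    """Noisy gradient descent with decaying steps; returns (last iterate, time average)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    x = x0
    avg = 0.0
    for k in range(n_steps):
        step = a0 / (k + 1) ** decay                     # assumed step-size schedule
        noisy_grad = grad_f(x) + sigma * rng.gauss(0.0, 1.0)
        x -= step * noisy_grad
        avg += (x - avg) / (k + 1)                       # running mean of the iterates
    return x, avg


x_final, x_avg = sgd(x0=0.5, n_steps=20000)
print("last iterate:", x_final, "gradient there:", grad_f(x_final))
print("time average:", x_avg)
```

In this setup the last iterate settles near one of the minimizers $\pm 1$, where the gradient is close to zero. The time average lags slightly because it still includes the early transient.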

Paper Structure

Figures (15)

Theorems & Definitions (45)