Quantifying Normality: Convergence Rate to Gaussian Limit for Stochastic Approximation and Unadjusted OU Algorithm

Shaan Ul Haque; Zedong Wang; Zixuan Zhang; Siva Theja Maguluri

TL;DR

This work addresses the finite-time accuracy of Gaussian approximations for stochastic approximation (SA) by deriving explicit Wasserstein-1 bounds between the rescaled SA iterates and their Gaussian limit under both diminishing and constant step-sizes. The core technique introduces a discrete Ornstein–Uhlenbeck process with generalized noise (DOUG) to capture the local, linearized stochastic dynamics and then leverages Stein's method to obtain non-asymptotic rates of convergence, including extensions to multiplicative noise via coupling. By decomposing the SA dynamics into an error trajectory relative to DOUG and the DOUG-to-Gaussian distance, the authors prove $\tilde{O}(\sqrt{\alpha_k})$ convergence rates and provide tail and first-moment bounds, with direct implications for SGD in strongly convex settings. The results improve the finite-time understanding of normality in SA, offer practical error controls for finite-time applications, and connect stochastic approximation with the sampling literature through the DOUG framework and Lyapunov-based analysis.
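As a rough illustration of the DOUG idea, the sketch below tracks the variance of a scalar discrete O-U recursion $y_{k+1} = (1-\alpha a)\,y_k + \sqrt{\alpha}\,w_k$ with constant step-size $\alpha$ and i.i.d. zero-mean noise of variance $\sigma^2$, and compares it with the variance $\sigma^2/(2a)$ of the continuous-time O-U stationary law. The scalar form, the function names, and the parameter values are assumptions chosen for illustration; they are not taken from the paper, and the sketch says nothing about the paper's Wasserstein rates.

```python
# Minimal sketch (hypothetical scalar setup, not the paper's general model).
# Recursion: y_{k+1} = (1 - alpha*a) * y_k + sqrt(alpha) * w_k,
# with w_k i.i.d., zero mean, variance sigma2, independent of y_k.

def doug_variance(alpha, a=1.0, sigma2=1.0, steps=2000, v0=0.0):
    """Propagate Var(y_k) through the recursion for `steps` iterations."""
    rho = 1.0 - alpha * a          # contraction factor, needs |rho| < 1
    v = v0
    for _ in range(steps):
        v = rho * rho * v + alpha * sigma2
    return v

def stationary_variance(alpha, a=1.0, sigma2=1.0):
    """Fixed point of the variance recursion, valid for 0 < alpha*a < 2."""
    return sigma2 / (2.0 * a - alpha * a * a)

def ou_limit_variance(a=1.0, sigma2=1.0):
    """Scalar Lyapunov solution: variance of the continuous-time O-U limit."""
    return sigma2 / (2.0 * a)

if __name__ == "__main__":
    for alpha in (0.5, 0.1, 0.01):
        gap = doug_variance(alpha) - ou_limit_variance()
        print(f"alpha={alpha}: variance gap to the O-U limit = {gap:.6f}")
```

In this toy case the variance recursion holds for any noise distribution with the stated mean and variance, which is why only second moments appear; the shape of the distribution of $y_k$, which the Wasserstein bounds control, is not captured here.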

Abstract

Stochastic approximation (SA) is a method for finding the root of an operator perturbed by noise. There is a rich literature establishing the asymptotic normality of rescaled SA iterates under fairly mild conditions. However, these asymptotic results do not quantify the accuracy of the Gaussian approximation in finite time. In this paper, we establish explicit non-asymptotic bounds on the Wasserstein distance between the distribution of the rescaled iterate at time k and the asymptotic Gaussian limit for various choices of step-sizes including constant and polynomially decaying. As an immediate consequence, we obtain tail bounds on the error of SA iterates at any time. We obtain the sharp rates by first studying the convergence rate of the discrete Ornstein-Uhlenbeck (O-U) process driven by general noise, whose stationary distribution is identical to the limiting Gaussian distribution of the rescaled SA iterates. We believe that this is of independent interest, given its connection to sampling literature. The analysis involves adapting Stein's method for Gaussian approximation to handle the matrix weighted sum of i.i.d. random variables. The desired finite-time bounds for SA are obtained by characterizing the error dynamics between the rescaled SA iterate and the discrete time O-U process and combining it with the convergence rate of the latter process.
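To make the notion of a finite-time Gaussian approximation concrete, here is a small Monte Carlo sketch under stated assumptions: a hypothetical scalar operator `operator(x)` with root `X_STAR`, Rademacher (non-Gaussian) noise, a constant step-size, and the rescaling $y_k = (x_k - x^*)/\sqrt{\alpha}$. It estimates the Wasserstein-1 distance between the simulated rescaled iterates and the Gaussian whose variance solves the scalar Lyapunov equation, using a sorted-sample quantile coupling. None of these names or values come from the paper, and the estimate includes Monte Carlo error.

```python
import math
import random
from statistics import NormalDist

X_STAR = 1.0   # hypothetical root of the operator
A = 1.1        # minus the derivative of `operator` at X_STAR
SIGMA = 1.0    # noise standard deviation

def operator(x):
    """Hypothetical nonlinear operator with operator(X_STAR) == 0."""
    d = x - X_STAR
    return -d - 0.1 * math.sin(d)

def rescaled_sa_samples(alpha, n_paths=2000, n_steps=300, seed=0):
    """Run independent constant step-size SA paths, return rescaled final iterates."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_paths):
        x = X_STAR
        for _ in range(n_steps):
            w = SIGMA if rng.random() < 0.5 else -SIGMA  # Rademacher noise
            x += alpha * (operator(x) + w)
        samples.append((x - X_STAR) / math.sqrt(alpha))
    return samples

def limit_variance():
    """Scalar Lyapunov solution for the assumed Gaussian limit."""
    return SIGMA ** 2 / (2.0 * A)

def w1_to_gaussian(samples, variance):
    """Approximate 1-D Wasserstein-1 distance via sorted samples vs. Gaussian quantiles."""
    target = NormalDist(0.0, math.sqrt(variance))
    ordered = sorted(samples)
    n = len(ordered)
    return sum(abs(y - target.inv_cdf((i + 0.5) / n))
               for i, y in enumerate(ordered)) / n

if __name__ == "__main__":
    for alpha in (0.5, 0.05):
        w1 = w1_to_gaussian(rescaled_sa_samples(alpha), limit_variance())
        print(f"alpha={alpha}: estimated W1 to Gaussian limit = {w1:.3f}")
```

In this toy setting the estimated distance is noticeably smaller for the smaller step-size, in line with the qualitative message of the abstract; the sketch is not meant to reproduce the paper's rates.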

Paper Structure (34 sections, 23 theorems, 114 equations)

Introduction
Langevin Dynamics and Normality of Stochastic Approximation
Main Contributions
Related Literature
Asymptotic Normality of diminishing step-size SA
Asymptotic Normality for constant step-size SA
Stein's method for Gaussian approximation
Stein's method for distributional convergence of SA
Finite-time Convergence for Unadjusted Langevin Dynamics
Problem Setup
Main Results
Normality of Stochastic Approximation
Central Limit Theorem for step-size based averaging
Tight first moment bound
Tail Bounds for the rescaled iterate
...and 19 more sections

Key Result

Theorem 4.1

Let $\eta=\min(\iota_V/2, 3\gamma/4)$ and $\mathcal{E}_0=\mathbb{E}[\|x_0-x^*\|^2]$. The sequence $\{y_k\}_{k\geq 0}$ given by rescaling the iterate $x_k$ in update equation eq:SA_rec2 satisfies the following bounds for all $k\geq 0$. where $\tau_k=\mathbbm{1}_{\{2\iota_V\neq 3\gamma\}}+\mathbbm{1}_{\{2\iota_V=3\gamma\}}\log(k+K)$ and $\mathbbm{1}_{\cdot}$ is the indicator function.

Theorems & Definitions (43)

Remark
Remark
Remark
Theorem 4.1
Proposition 4.1
Corollary 4.1.1
Proposition 4.2
Theorem 4.2
Remark
Lemma 5.1: arnold1974stochastic
...and 33 more
