Quantifying Normality: Convergence Rate to Gaussian Limit for Stochastic Approximation and Unadjusted OU Algorithm
Shaan Ul Haque, Zedong Wang, Zixuan Zhang, Siva Theja Maguluri
TL;DR
This work addresses the finite-time accuracy of Gaussian approximations for stochastic approximation by deriving explicit Wasserstein-1 bounds between the rescaled SA iterates and their Gaussian limit under both diminishing and constant step-sizes. The core technique introduces a discrete Ornstein–Uhlenbeck process with generalized noise (DOUG) to capture the local, linearized stochastic dynamics and then leverages Stein's method to obtain non-asymptotic rates of convergence, including extensions to multiplicative noise via coupling. By decomposing SA dynamics into an error-trajectory relative to DOUG and the DOUG-to-Gaussian distance, the authors prove $\tilde{O}(\sqrt{\alpha_k})$ convergence rates and provide tail and first-moment bounds, with direct implications for SGD in strongly convex settings. The results substantially improve finite-time understanding of normality in SA, offer practical error controls for finite-time applications, and connect stochastic approximation with sampling literature through the DOUG framework and Lyapunov-based analysis.
Abstract
Stochastic approximation (SA) is a method for finding the root of an operator perturbed by noise. There is a rich literature establishing the asymptotic normality of rescaled SA iterates under fairly mild conditions. However, these asymptotic results do not quantify the accuracy of the Gaussian approximation in finite time. In this paper, we establish explicit non-asymptotic bounds on the Wasserstein distance between the distribution of the rescaled iterate at time k and the asymptotic Gaussian limit for various choices of step-sizes including constant and polynomially decaying. As an immediate consequence, we obtain tail bounds on the error of SA iterates at any time. We obtain the sharp rates by first studying the convergence rate of the discrete Ornstein-Uhlenbeck (O-U) process driven by general noise, whose stationary distribution is identical to the limiting Gaussian distribution of the rescaled SA iterates. We believe that this is of independent interest, given its connection to sampling literature. The analysis involves adapting Stein's method for Gaussian approximation to handle the matrix weighted sum of i.i.d. random variables. The desired finite-time bounds for SA are obtained by characterizing the error dynamics between the rescaled SA iterate and the discrete time O-U process and combining it with the convergence rate of the latter process.
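To make the object in the abstract concrete, the following is a minimal numerical sketch of a scalar discrete O-U recursion driven by non-Gaussian noise, with a Monte Carlo estimate of its Wasserstein-1 distance to a Gaussian. Everything specific here is an assumption for illustration and is not taken from the paper: the recursion $y_{k+1} = (1 - \alpha a)\,y_k + \sqrt{\alpha}\,w_k$, the Rademacher noise $w_k$, the drift $a$, the constant step-size $\alpha$, and the function names. The reference Gaussian $N(0, 1/(2a))$ is the stationary law of the continuous-time O-U process with the same drift and unit noise variance.

```python
import numpy as np


def simulate_discrete_ou(alpha, a=1.0, n_paths=20000, seed=0):
    """Simulate y_{k+1} = (1 - alpha*a) * y_k + sqrt(alpha) * w_k.

    Assumed illustrative setup: scalar state, constant step-size alpha,
    i.i.d. Rademacher noise w_k (mean 0, variance 1, clearly non-Gaussian).
    Returns one near-stationary sample per path.
    """
    rng = np.random.default_rng(seed)
    # Run for roughly 10 / (alpha * a) steps so the start y_0 = 0 is forgotten.
    n_steps = int(np.ceil(10.0 / (alpha * a)))
    y = np.zeros(n_paths)
    for _ in range(n_steps):
        w = rng.choice([-1.0, 1.0], size=n_paths)
        y = (1.0 - alpha * a) * y + np.sqrt(alpha) * w
    return y


def gaussian_reference(a=1.0, n=20000, seed=1):
    """Draw from N(0, 1/(2a)), the assumed Gaussian limit for unit-variance noise."""
    rng = np.random.default_rng(seed)
    return rng.normal(0.0, np.sqrt(1.0 / (2.0 * a)), size=n)


def empirical_w1(x, y):
    """Wasserstein-1 distance between two equal-size 1-D samples.

    In one dimension the optimal coupling pairs order statistics, so the
    distance is the mean absolute difference of the sorted samples.
    """
    if x.shape != y.shape:
        raise ValueError("samples must have the same size")
    return float(np.mean(np.abs(np.sort(x) - np.sort(y))))


if __name__ == "__main__":
    reference = gaussian_reference()
    for alpha in (0.5, 0.1, 0.01):
        iterates = simulate_discrete_ou(alpha)
        print(f"alpha={alpha}: estimated W1 = {empirical_w1(iterates, reference):.4f}")
```

Under these assumptions one would expect the estimated distance to shrink as $\alpha$ decreases, until Monte Carlo error from the finite number of paths dominates. The sketch only illustrates the quantity being bounded; it does not reproduce the paper's bounds or its matrix-valued setting.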
