Quantifying Normality: Convergence Rate to Gaussian Limit for Stochastic Approximation and Unadjusted OU Algorithm

Shaan Ul Haque, Zedong Wang, Zixuan Zhang, Siva Theja Maguluri

TL;DR

This work addresses the finite-time accuracy of Gaussian approximations for stochastic approximation by deriving explicit Wasserstein-1 bounds between the rescaled SA iterates and their Gaussian limit under both diminishing and constant step-sizes. The core technique introduces a discrete Ornstein–Uhlenbeck process with generalized noise (DOUG) to capture the local, linearized stochastic dynamics and then leverages Stein's method to obtain non-asymptotic rates of convergence, including extensions to multiplicative noise via coupling. By decomposing SA dynamics into an error-trajectory relative to DOUG and the DOUG-to-Gaussian distance, the authors prove $\tilde{O}(\sqrt{\alpha_k})$ convergence rates and provide tail and first-moment bounds, with direct implications for SGD in strongly convex settings. The results substantially improve finite-time understanding of normality in SA, offer practical error controls for finite-time applications, and connect stochastic approximation with sampling literature through the DOUG framework and Lyapunov-based analysis.

Abstract

Stochastic approximation (SA) is a method for finding the root of an operator perturbed by noise. There is a rich literature establishing the asymptotic normality of rescaled SA iterates under fairly mild conditions. However, these asymptotic results do not quantify the accuracy of the Gaussian approximation in finite time. In this paper, we establish explicit non-asymptotic bounds on the Wasserstein distance between the distribution of the rescaled iterate at time k and the asymptotic Gaussian limit for various choices of step-sizes including constant and polynomially decaying. As an immediate consequence, we obtain tail bounds on the error of SA iterates at any time. We obtain the sharp rates by first studying the convergence rate of the discrete Ornstein-Uhlenbeck (O-U) process driven by general noise, whose stationary distribution is identical to the limiting Gaussian distribution of the rescaled SA iterates. We believe that this is of independent interest, given its connection to sampling literature. The analysis involves adapting Stein's method for Gaussian approximation to handle the matrix weighted sum of i.i.d. random variables. The desired finite-time bounds for SA are obtained by characterizing the error dynamics between the rescaled SA iterate and the discrete time O-U process and combining it with the convergence rate of the latter process.
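
The link between SA and the discrete O-U process described above can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes a scalar linear operator $F(x) = -a(x - x^*)$, a constant step-size $\alpha$, and Rademacher ($\pm 1$) noise, all chosen purely for illustration. Under these assumptions the rescaled iterate $y_k = (x_k - x^*)/\sqrt{\alpha}$ follows the recursion $y_{k+1} = (1 - \alpha a)\,y_k + \sqrt{\alpha}\,w_k$, a discrete O-U process driven by non-Gaussian noise, and its small-$\alpha$ Gaussian limit has variance $\sigma^2/(2a)$. The function names and the quantile-based Wasserstein-1 estimate are hypothetical helpers, not the authors' code.

```python
import math
import random
from statistics import NormalDist


def simulate_rescaled_sa(alpha, a=1.0, horizon=10.0, n_paths=2000, seed=0):
    """Sample y_k = (x_k - x*) / sqrt(alpha) for scalar linear SA.

    Assumed recursion: x_{k+1} = x_k + alpha * (-a * (x_k - x*) + w_k),
    with i.i.d. Rademacher noise w_k (mean 0, variance 1). In the rescaled
    variable this reads y_{k+1} = (1 - alpha * a) * y_k + sqrt(alpha) * w_k.
    """
    rng = random.Random(seed)
    n_steps = int(round(horizon / alpha))  # run for a fixed "continuous" time
    root_alpha = math.sqrt(alpha)
    samples = []
    for _ in range(n_paths):
        y = 0.0  # start at the root, i.e. x_0 = x*
        for _ in range(n_steps):
            w = 1.0 if rng.random() < 0.5 else -1.0
            y = (1.0 - alpha * a) * y + root_alpha * w
        samples.append(y)
    return samples


def w1_to_gaussian(samples, variance):
    """Estimate the Wasserstein-1 distance between the empirical law of
    `samples` and N(0, variance) using the one-dimensional quantile coupling."""
    gauss = NormalDist(mu=0.0, sigma=math.sqrt(variance))
    n = len(samples)
    ordered = sorted(samples)
    return sum(
        abs(y - gauss.inv_cdf((i + 0.5) / n)) for i, y in enumerate(ordered)
    ) / n


def w1_for_step_size(alpha, a=1.0):
    """W1 estimate between the rescaled iterate and N(0, 1 / (2a))."""
    limit_variance = 1.0 / (2.0 * a)  # sigma^2 / (2a) with sigma^2 = 1
    return w1_to_gaussian(simulate_rescaled_sa(alpha, a=a), limit_variance)


if __name__ == "__main__":
    for alpha in (0.5, 0.2, 0.05):
        print(f"alpha={alpha:<5} W1 estimate: {w1_for_step_size(alpha):.4f}")
```

In this toy setting the estimated distance shrinks as $\alpha$ decreases, which is the qualitative behaviour the finite-time bounds quantify. The estimate also carries Monte Carlo error from the finite number of paths, so it should not be read as a check of the exact rate.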

Paper Structure

This paper contains 34 sections, 23 theorems, and 114 equations.

Key Result

Theorem 4.1

Let $\eta=\min(\iota_V/2, 3\gamma/4)$ and $\mathcal{E}_0=\mathbb{E}[\|x_0-x^*\|^2]$. The sequence $\{y_k\}_{k\geq 0}$, obtained by rescaling the iterate $x_k$ in the update equation (eq:SA_rec2), satisfies the following bounds for all $k\geq 0$. Here, $\tau_k=\mathbbm{1}_{\{2\iota_V\neq 3\gamma\}}+\mathbbm{1}_{\{2\iota_V=3\gamma\}}\log(k+K)$ and $\mathbbm{1}_{\cdot}$ is the indicator function.
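
The two auxiliary quantities in the theorem statement can be computed directly from their definitions. The sketch below is only an illustration: the meanings of $\iota_V$, $\gamma$, and $K$ are not given in this excerpt, so they are treated as plain numeric inputs, and the function names are hypothetical. The logarithmic factor in $\tau_k$ appears exactly when the two arguments of the minimum defining $\eta$ coincide, since $\iota_V/2 = 3\gamma/4$ is equivalent to $2\iota_V = 3\gamma$.

```python
import math


def eta(iota_v, gamma):
    """eta = min(iota_V / 2, 3 * gamma / 4), as defined in the theorem."""
    return min(iota_v / 2.0, 3.0 * gamma / 4.0)


def tau(k, K, iota_v, gamma):
    """tau_k = 1 if 2 * iota_V != 3 * gamma, else log(k + K).

    The equality test uses math.isclose because the inputs are floats;
    that tolerance is an implementation choice, not part of the theorem.
    """
    if math.isclose(2.0 * iota_v, 3.0 * gamma):
        return math.log(k + K)
    return 1.0
```

For example, `tau(5, 3, 3.0, 2.0)` falls in the boundary case $2\iota_V = 3\gamma$ and returns `math.log(8)`, while `tau(5, 3, 1.0, 1.0)` returns `1.0`.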

Theorems & Definitions (43)

  • Remark
  • Remark
  • Remark
  • Theorem 4.1
  • Proposition 4.1
  • Corollary 4.1.1
  • Proposition 4.2
  • Theorem 4.2
  • Remark
  • Lemma 5.1 (Arnold, 1974)
  • ...and 33 more