Finite Sample and Large Deviations Analysis of Stochastic Gradient Algorithm with Correlated Noise
George Yin, Vikram Krishnamurthy
TL;DR
This work provides a finite-sample analysis of a projected stochastic gradient algorithm with correlated noise, proving that the mean-square error decays as $\mathbb{E}\|\theta_n-\theta^*\|^2 = O(1/n)$ and that the expected regret grows at most like $\mathbb{E}\{\text{Regret}_n\} \le K L \log n$. The authors handle correlated disturbances via a perturbed Lyapunov function $W = V + V_1$, which cancels the problematic noise terms and yields an $O(1/n)$ drift; the i.i.d. case is also discussed. They further analyze escape times from a neighborhood of the optimum using large-deviations theory, showing that escape probabilities are exponentially small and that expected residence times in the neighborhood are exponentially long. The results rely on a convex, smooth objective, mixing assumptions on the noise, and a local quadratic approximation around the minimizer, which makes them relevant to finite-sample guarantees for stochastic optimization under dependent noise.
Abstract
We analyze the finite-sample regret of a stochastic gradient algorithm with decreasing step size. We assume correlated noise and use a perturbed Lyapunov function as a systematic approach to the analysis. Finally, we analyze the escape time of the iterates using large deviations theory.
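
To make the setting concrete, the following is a minimal simulation sketch of a projected stochastic gradient iteration with decreasing step size and correlated noise. It is not the authors' implementation. The quadratic objective $\frac{1}{2}\|\theta-\theta^*\|^2$, the AR(1) noise model, the Euclidean-ball constraint set, the step size $1/n$, and all names (`project_ball`, `projected_sgd`, `rho`, `sigma`, `radius`) are assumptions chosen for illustration.

```python
import numpy as np


def project_ball(theta, radius):
    """Euclidean projection onto the ball {x : ||x|| <= radius}."""
    norm = np.linalg.norm(theta)
    return theta if norm <= radius else theta * (radius / norm)


def projected_sgd(n_steps, theta_star, radius=5.0, rho=0.8, sigma=1.0, seed=0):
    """Projected SGD on 0.5 * ||theta - theta_star||^2 with AR(1) gradient noise.

    Assumed setup (illustrative only): step size 1/n, noise recursion
    xi_n = rho * xi_{n-1} + sigma * w_n with w_n standard Gaussian.
    Returns the final iterate and the squared error at every step.
    """
    rng = np.random.default_rng(seed)
    d = theta_star.shape[0]
    theta = np.zeros(d)
    noise = np.zeros(d)
    sq_err = np.empty(n_steps)
    for n in range(1, n_steps + 1):
        # Correlated (non-i.i.d.) disturbance: each sample depends on the last.
        noise = rho * noise + sigma * rng.standard_normal(d)
        # Noisy gradient of the assumed quadratic objective.
        grad = (theta - theta_star) + noise
        # Decreasing step size 1/n, followed by projection onto the ball.
        theta = project_ball(theta - grad / n, radius)
        sq_err[n - 1] = np.sum((theta - theta_star) ** 2)
    return theta, sq_err


theta_star = np.array([1.0, -2.0])
theta, sq_err = projected_sgd(20_000, theta_star)
print("final squared error:", sq_err[-1])
print("n * squared error at the last step:", 20_000 * sq_err[-1])
```

A single run shows only one sample path. Averaging `n * sq_err[n - 1]` over many seeds gives an empirical check of whether the mean-square error behaves like $O(1/n)$ in this toy setting.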
