Convergence of projected stochastic approximation algorithm
Michał Borowski, Błażej Miasojedow
TL;DR
This work addresses the convergence of the Robbins–Monro stochastic approximation algorithm with projection onto a hyperrectangle, a case for which the convergence proof in the classical Kushner–Yin framework has gaps. It uses the ODE method to recast the iteration as a discretization of the projected ODE $\dot{x} = h(x) - z$, $z \in N_K(x)$, and establishes almost sure equicontinuity of the associated processes $(X_n)$ and $(Z_n)$, which enables a limit-based convergence analysis via the Arzelà–Ascoli theorem. The main theoretical contribution is a theorem showing that every limit of $(X_n, Z_n)$ is Lipschitz and satisfies the projected ODE. Combined with a Lyapunov stability condition, this yields convergence of $x_n$ to stationary points (the paper's final theorem). The results extend to proximal stochastic gradient methods and give a firmer theoretical foundation for stochastic optimization techniques, including nonconvex and nonsmooth settings, under relaxed assumptions on the noise and the learning rates.
Abstract
We study the Robbins–Monro stochastic approximation algorithm with projections onto a hyperrectangle and prove its convergence. This work fills a gap in the convergence proof given in the classic book by Kushner and Yin. Using the ODE method, we show that the algorithm converges to stationary points of a related projected ODE. Our results provide a firmer theoretical foundation for stochastic optimization techniques, including stochastic gradient descent and its proximal version. They also extend the algorithm's applicability and relax some assumptions made in previous research.
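
To make the object of study concrete, the following is a minimal sketch of a projected Robbins–Monro iteration on a hyperrectangle, written as $x_{n+1} = \Pi_K(x_n + \gamma_n(h(x_n) + \text{noise}))$. Everything specific here is an assumption chosen for illustration and is not taken from the paper: the function names, the step sizes $\gamma_n = 1/n$, the Gaussian noise, and the toy mean field $h(x) = c - x$ (the negative gradient of $\tfrac{1}{2}\|x - c\|^2$) with $c$ outside the box $K = [0,1]^2$.

```python
import numpy as np

def project_box(x, lower, upper):
    """Euclidean projection onto the hyperrectangle [lower, upper] (coordinatewise clip)."""
    return np.clip(x, lower, upper)

def projected_robbins_monro(h, x0, lower, upper, n_steps=20000, noise_std=1.0, seed=0):
    """Run x_{n+1} = Proj_K(x_n + gamma_n * (h(x_n) + noise)).

    Illustrative assumptions: gamma_n = 1/n and i.i.d. Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    x = project_box(np.asarray(x0, dtype=float), lower, upper)
    for n in range(1, n_steps + 1):
        gamma = 1.0 / n  # decreasing step size (assumed schedule)
        noisy_h = h(x) + noise_std * rng.standard_normal(x.shape)  # noisy observation of h
        x = project_box(x + gamma * noisy_h, lower, upper)
    return x

# Toy problem (hypothetical): h(x) = c - x with c outside K = [0, 1]^2.
c = np.array([2.0, 0.5])
lower, upper = np.zeros(2), np.ones(2)
x_final = projected_robbins_monro(lambda x: c - x, [0.2, 0.9], lower, upper)
print(x_final)
```

In this toy setting the iterates should settle near the boundary point $(1, 0.5)$. There $h(x) = (1, 0)$ points out of the box along the active face, so it can be cancelled by a term $z \in N_K(x)$, which is what a stationary point of the projected ODE $\dot{x} = h(x) - z$ requires.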
