The Collusion of Memory and Nonlinearity in Stochastic Approximation With Constant Stepsize
Dongyan Huo, Yixuan Zhang, Yudong Chen, Qiaomin Xie
TL;DR
This work studies constant-stepsize stochastic approximation (SA) with nonlinear updates and Markovian data, a setting in which memory and nonlinearity interact. The authors develop a fine-grained analysis that yields the first weak convergence result for the joint data-iterate process and a precise decomposition of the asymptotic bias into Markovian, nonlinear, and interaction terms, together with higher-moment bounds and a Central Limit Theorem (CLT) for the averaged iterates. They prove weak convergence both with and without projection, under distinct conditions, and provide non-asymptotic rates. The bias characterization supports bias reduction via Richardson-Romberg (RR) extrapolation as well as statistical inference. The results apply to generalized linear models (GLMs) with Markovian data, including logistic and smooth-ReLU-type models, and give a rigorous basis for learning and inference in nonlinear SA with dependent data.
Abstract
In this work, we investigate stochastic approximation (SA) with Markovian data and nonlinear updates under a constant stepsize $α>0$. Existing work has primarily focused on either i.i.d. data or linear update rules. We take a new perspective and carefully examine the simultaneous presence of Markovian dependency in the data and nonlinear update rules, delineating how the interplay between these two structures leads to complications that are not captured by prior techniques. By leveraging the smoothness and recurrence properties of the SA updates, we develop a fine-grained analysis of the correlation between the SA iterates $θ_k$ and the Markovian data $x_k$. This enables us to overcome the obstacles in existing analyses and establish, for the first time, the weak convergence of the joint process $(x_k, θ_k)_{k\geq0}$. Furthermore, we present a precise characterization of the asymptotic bias of the SA iterates, given by $\mathbb{E}[θ_\infty]-θ^\ast=α(b_\text{m}+b_\text{n}+b_\text{c})+O(α^{3/2})$. Here, $b_\text{m}$ is associated with the Markovian noise, $b_\text{n}$ is tied to the nonlinearity, and, notably, $b_\text{c}$ represents a multiplicative interaction between the Markovian noise and the nonlinearity, which is absent in previous works. As a by-product of our analysis, we derive finite-time bounds on the higher moments $\mathbb{E}[\|θ_k-θ^\ast\|^{2p}]$ and present non-asymptotic geometric convergence rates for the iterates, along with a Central Limit Theorem.
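The bias expansion above suggests a simple numerical illustration. The following is a minimal sketch under stated assumptions, not the authors' experiment: a scalar toy update $g(θ,x)=(1+cx)(m-σ(θ))+dx$ driven by a symmetric two-state Markov chain on $\{-1,+1\}$, so that the stationary mean field is $m-σ(θ)$ and $θ^\ast=\log(m/(1-m))$. The model, the function names `averaged_sa` and `rr_extrapolate`, and all parameter values are hypothetical choices made for illustration only.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def averaged_sa(alpha, n_steps, burn_in, seed, m=0.7, c=0.5, d=0.2, stay=0.9):
    """Run theta_{k+1} = theta_k + alpha * g(theta_k, x_k) with constant
    stepsize alpha and return the average of the iterates after burn-in.

    Toy model (an assumption, not taken from the paper):
        g(theta, x) = (1 + c*x) * (m - sigmoid(theta)) + d*x
    where x is a two-state Markov chain on {-1, +1}.
    """
    rng = random.Random(seed)
    x = 1.0       # current state of the Markov chain
    theta = 0.0   # initial iterate
    total = 0.0
    for k in range(n_steps):
        # Nonlinear update with multiplicative (c*x) and additive (d*x)
        # Markovian noise; both average to zero under the uniform
        # stationary distribution of the chain.
        g = (1.0 + c * x) * (m - sigmoid(theta)) + d * x
        theta += alpha * g
        if k >= burn_in:
            total += theta
        # The chain keeps its state with probability `stay`, else flips.
        if rng.random() > stay:
            x = -x
    return total / (n_steps - burn_in)


def rr_extrapolate(avg_alpha, avg_2alpha):
    """Richardson-Romberg combination of averages obtained with stepsizes
    alpha and 2*alpha; the first-order bias terms cancel."""
    return 2.0 * avg_alpha - avg_2alpha


if __name__ == "__main__":
    theta_star = math.log(0.7 / 0.3)
    avg_small = averaged_sa(0.05, 200_000, 10_000, seed=1)
    avg_large = averaged_sa(0.10, 200_000, 10_000, seed=2)
    print("theta*          :", theta_star)
    print("average, alpha  :", avg_small)
    print("average, 2*alpha:", avg_large)
    print("RR extrapolation:", rr_extrapolate(avg_small, avg_large))
```

The combination in `rr_extrapolate` follows directly from the stated expansion: writing $b=b_\text{m}+b_\text{n}+b_\text{c}$, the quantity $2\,\mathbb{E}[θ_\infty^{(α)}]-\mathbb{E}[θ_\infty^{(2α)}]$ equals $θ^\ast+2αb-2αb+O(α^{3/2})=θ^\ast+O(α^{3/2})$. In a finite simulation the gain may be masked by sampling error, since the extrapolated estimate has a larger variance than either average on its own.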
