Gaussian Approximation for Two-Timescale Linear Stochastic Approximation
Bogdan Butyrin, Artemy Rubtsov, Alexey Naumov, Vladimir Ulyanov, Sergey Samsonov
TL;DR
This work establishes non-asymptotic Gaussian approximation rates for linear two-timescale stochastic approximation (TTSA) under both martingale-difference and Markov noise. Using a decomposition that splits the statistic into a linear part plus a small nonlinear remainder, the authors prove high-order moment bounds and derive explicit convex-distance rates for both the Polyak–Ruppert averaged iterate and the last iterate, with rates of order up to n^{-1/4} in the martingale case and n^{-1/6} under Markov noise. A key insight is that timescale separation affects the two estimators in opposite ways: a larger separation improves the rate for the last iterate but worsens it for the averaged iterate. The analysis extends to TTSA instances such as GTD and TDC in reinforcement learning. These results enable non-asymptotic statistical inference for TTSA-based algorithms and pave the way for confidence-interval construction and tighter optimality bounds in practice.
Abstract
In this paper, we establish non-asymptotic bounds on the accuracy of normal approximation for linear two-timescale stochastic approximation (TTSA) algorithms driven by martingale-difference or Markov noise. Focusing on both the last-iterate and Polyak-Ruppert averaging regimes, we derive bounds for normal approximation in terms of the convex distance between probability distributions. Our analysis reveals a non-trivial interaction between the fast and slow timescales: the normal approximation rate for the last iterate improves as the timescale separation increases, while it deteriorates in the Polyak-Ruppert averaged setting. We also provide high-order moment bounds for the error of the linear TTSA algorithm, which may be of independent interest.
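To make the objects in the abstract concrete, the following is a minimal sketch of a linear TTSA recursion with a slow iterate, a fast iterate, and Polyak-Ruppert averaging of the slow iterate. It is an illustration under stated assumptions, not the authors' implementation: the block names A11, A12, A21, A22, b1, b2, the polynomial step sizes c / k^a (slow) and c / k^b (fast) with b < a, and the i.i.d. Gaussian noise (a simple special case of martingale-difference noise) are all choices made here for illustration, following the commonly used form of the linear two-timescale scheme.

```python
import numpy as np


def ttsa_solution(A11, A12, A21, A22, b1, b2):
    """Solve the coupled linear system A11 th + A12 w = b1, A21 th + A22 w = b2."""
    # Eliminate w: Delta th = b1 - A12 A22^{-1} b2, with Delta the Schur complement.
    delta = A11 - A12 @ np.linalg.solve(A22, A21)
    theta_star = np.linalg.solve(delta, b1 - A12 @ np.linalg.solve(A22, b2))
    w_star = np.linalg.solve(A22, b2 - A21 @ theta_star)
    return theta_star, w_star


def linear_ttsa(A11, A12, A21, A22, b1, b2, n,
                a=0.75, b=0.6, c_slow=0.5, c_fast=0.5,
                noise_std=0.1, seed=0):
    """Run n steps of linear TTSA; return last iterates and the averaged slow iterate.

    All hyperparameters are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    theta = np.zeros(len(b1))      # slow iterate
    w = np.zeros(len(b2))          # fast iterate
    theta_bar = np.zeros_like(theta)
    for k in range(1, n + 1):
        beta_k = c_slow / k ** a   # slow step size (decays faster)
        gamma_k = c_fast / k ** b  # fast step size (decays slower, b < a)
        # i.i.d. zero-mean Gaussian noise: a special case of martingale differences.
        v_noise = noise_std * rng.standard_normal(theta.shape)
        w_noise = noise_std * rng.standard_normal(w.shape)
        # Both updates use the iterates from the previous step.
        theta_next = theta + beta_k * (b1 - A11 @ theta - A12 @ w + v_noise)
        w_next = w + gamma_k * (b2 - A21 @ theta - A22 @ w + w_noise)
        theta, w = theta_next, w_next
        # Running mean of theta_1, ..., theta_k (Polyak-Ruppert average).
        theta_bar += (theta - theta_bar) / k
    return theta, w, theta_bar


# Small one-dimensional example with a stable system (hypothetical values).
A11 = np.array([[2.0]])
A12 = np.array([[0.5]])
A21 = np.array([[0.5]])
A22 = np.array([[1.0]])
b1 = np.array([1.0])
b2 = np.array([1.0])

theta_star, w_star = ttsa_solution(A11, A12, A21, A22, b1, b2)
theta_last, w_last, theta_bar = linear_ttsa(A11, A12, A21, A22, b1, b2, n=5000)
print("theta*:", theta_star, "last iterate:", theta_last, "averaged:", theta_bar)
```

The gap between the exponents a and b plays the role of the timescale separation in this sketch. Repeating the run over many seeds and comparing the empirical distribution of the rescaled error with a Gaussian would be one informal way to visualize the regimes the abstract contrasts, although that experiment is not taken from the paper.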
