Non-Expansive Mappings in Two-Time-Scale Stochastic Approximation: Finite-Time Analysis
Siddharth Chandak
TL;DR
This work develops finite-time guarantees for two-time-scale stochastic approximation in which the slower time-scale has a non-expansive mapping, so that it can be viewed as a stochastic inexact Krasnoselskii-Mann iteration. The authors show that the last-iterate mean-square residual error decays at a rate $O(k^{-1/4+\epsilon})$ for arbitrarily small $\epsilon>0$, and prove almost-sure convergence of the iterates to the set of fixed points, under step-size separation and Lipschitz/martingale-noise assumptions. Two variants are also analyzed, one with a projection step in the faster time-scale and one with gradient descent in the slower time-scale, with corresponding rates and convergence results. The framework is applied to minimax optimization, linear stochastic approximation, and Lagrangian-constrained optimization, with supplementary numerical experiments validating the theoretical predictions. Overall, the paper extends finite-time analysis of two-time-scale stochastic algorithms to non-expansive slower dynamics.
Abstract
Two-time-scale stochastic approximation algorithms are iterative methods used in applications such as optimization, reinforcement learning, and control. Finite-time analysis of these algorithms has primarily focused on fixed-point iterations where both time-scales have contractive mappings. In this work, we broaden the scope of such analyses by considering settings where the slower time-scale has a non-expansive mapping. For such algorithms, the slower time-scale can be viewed as a stochastic inexact Krasnoselskii-Mann iteration. We also study a variant where the faster time-scale has a projection step, which leads to non-expansiveness in the slower time-scale. We show that the last-iterate mean-square residual error for such algorithms decays at a rate $O(1/k^{1/4-\epsilon})$, where $\epsilon>0$ is arbitrarily small. We further establish almost sure convergence of the iterates to the set of fixed points. We demonstrate the applicability of our framework by applying our results to minimax optimization, linear stochastic approximation, and Lagrangian optimization.
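To make the setting concrete, the following is a minimal sketch of a two-time-scale iteration whose slower time-scale is a noisy Krasnoselskii-Mann update. Everything specific in it is an assumption chosen for illustration and is not taken from the paper: the function name `run_two_time_scale`, the toy mappings (a contraction $f(x,y)=\tfrac{1}{2}x+\tfrac{1}{2}y$ with fixed point $x^*(y)=y$, and a 90-degree rotation $R$ as the non-expansive map), the Gaussian noise, and the step-size exponents `a` and `b`.

```python
import numpy as np


def run_two_time_scale(num_iters=20000, a=0.5, b=0.75, noise_std=0.1, seed=0):
    """Toy two-time-scale stochastic approximation (illustrative assumptions only).

    Faster iterate x: contraction f(x, y) = 0.5*x + 0.5*y, so x*(y) = y.
    Slower iterate y: g(x, y) = R @ x, so g(x*(y), y) = R @ y, where R is a
    rotation (an isometry, hence non-expansive but not contractive).
    The only fixed point of y -> R @ y is the origin.
    """
    rng = np.random.default_rng(seed)
    R = np.array([[0.0, -1.0], [1.0, 0.0]])  # rotation by 90 degrees

    x = np.array([1.0, -1.0])
    y = np.array([2.0, 1.0])

    for k in range(num_iters):
        # Step sizes with a < b, so beta/alpha -> 0 (y moves on the slower scale).
        alpha = 1.0 / (k + 1) ** a
        beta = 1.0 / (k + 1) ** b

        f_val = 0.5 * x + 0.5 * y
        g_val = R @ x

        # Zero-mean i.i.d. noise stands in for martingale-difference noise.
        x_next = x + alpha * (f_val - x + noise_std * rng.standard_normal(2))
        # Noisy, inexact Krasnoselskii-Mann step: y <- (1 - beta)*y + beta*(g + noise).
        y_next = y + beta * (g_val - y + noise_std * rng.standard_normal(2))
        x, y = x_next, y_next

    # Residual of the slower time-scale map at the final iterate.
    residual = np.linalg.norm(y - R @ y)
    return x, y, residual


if __name__ == "__main__":
    x, y, residual = run_two_time_scale()
    print("final x:", x)
    print("final y:", y)
    print("residual ||y - R y||:", residual)
```

The rotation is used because it is the standard example of a map for which plain fixed-point iteration does not converge while the averaged Krasnoselskii-Mann update does. The sketch only illustrates the structure of the iteration; it is not intended to reproduce the rates stated above.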
