Non-Expansive Mappings in Two-Time-Scale Stochastic Approximation: Finite-Time Analysis

Siddharth Chandak

TL;DR

This work develops finite-time guarantees for two-time-scale stochastic approximation where the slower time-scale uses a non-expansive mapping, modeling stochastic Krasnoselskii-Mann iterations. It derives a last-iterate mean-square residual decay of $O(k^{-1/4+\epsilon})$ and proves almost-sure convergence to the fixed-point set, under step-size separation and Lipschitz/martingale-noise assumptions. A variant with a projection in the faster time-scale and a variant with gradient-descent operators in the slower time-scale are also analyzed, with corresponding rates and convergence results. The framework is demonstrated on three applications: minimax optimization, linear stochastic approximation, and Lagrangian-constrained optimization, with supplementary numerical experiments. Overall, the paper extends finite-time analysis to non-expansive slower dynamics for two-time-scale stochastic algorithms.

Abstract

Two-time-scale stochastic approximation algorithms are iterative methods used in applications such as optimization, reinforcement learning, and control. Finite-time analysis of these algorithms has primarily focused on fixed-point iterations where both time-scales have contractive mappings. In this work, we broaden the scope of such analyses by considering settings where the slower time-scale has a non-expansive mapping. For such algorithms, the slower time-scale can be viewed as a stochastic inexact Krasnoselskii-Mann iteration. We also study a variant where the faster time-scale has a projection step, which leads to non-expansiveness in the slower time-scale. We show that the last-iterate mean square residual error for such algorithms decays at a rate $O(1/k^{1/4-\epsilon})$, where $\epsilon>0$ is arbitrarily small. We further establish almost sure convergence of iterates to the set of fixed points. We demonstrate the applicability of our framework by applying our results to minimax optimization, linear stochastic approximation, and Lagrangian optimization.
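
As a rough illustration of the setting described in the abstract, the following Python sketch runs a two-time-scale iteration on a toy problem. Everything specific here is an assumption made for illustration and is not taken from the paper: the maps `f` and `g`, the rotation matrix `R`, the step-size exponents `a` and `b`, and the Gaussian noise. The faster iterate `x` follows a map that is contractive in `x`, with fixed point $x^*(y) = 2y$. Once `x` tracks $x^*(y)$, the slower iterate `y` approximately follows an averaged (Krasnoselskii-Mann style) update $y_{k+1} = y_k + \beta_k (T(y_k) - y_k + \text{noise})$ with $T(y) = Ry$, a rotation that is non-expansive but not contractive.

```python
import numpy as np


def two_time_scale_km(num_iters=20000, a=0.5, b=0.75, noise_std=0.1, seed=0):
    """Toy two-time-scale stochastic approximation (illustrative assumptions only).

    Faster time-scale: x tracks the fixed point of f(., y), here x*(y) = 2y.
    Slower time-scale: y follows a noisy averaged update for T(y) = R y.
    """
    rng = np.random.default_rng(seed)

    # 90-degree rotation: non-expansive, not contractive, unique fixed point 0.
    R = np.array([[0.0, -1.0], [1.0, 0.0]])

    def f(x, y):
        # Hypothetical faster map: contractive in x with factor 0.5.
        return 0.5 * x + y

    def g(x, y):
        # Hypothetical slower map: g(x*(y), y) = R y is non-expansive in y.
        return R @ (0.5 * x)

    x = np.ones(2)
    y = np.ones(2)
    residuals = np.empty(num_iters)

    for k in range(num_iters):
        # Assumed step sizes: beta_k / alpha_k -> 0, so y moves more slowly than x.
        alpha = 1.0 / (k + 1) ** a
        beta = 1.0 / (k + 1) ** b

        # Squared fixed-point residual ||T(y_k) - y_k||^2 of the slower iterate.
        residuals[k] = np.sum((R @ y - y) ** 2)

        # Zero-mean i.i.d. noise, a simple instance of martingale-difference noise.
        noise_x = noise_std * rng.standard_normal(2)
        noise_y = noise_std * rng.standard_normal(2)

        # Both updates use the iterates from step k.
        x_new = x + alpha * (f(x, y) - x + noise_x)
        y_new = y + beta * (g(x, y) - y + noise_y)
        x, y = x_new, y_new

    return x, y, residuals


if __name__ == "__main__":
    x, y, residuals = two_time_scale_km()
    print("final y:", y)
    print("mean residual over last 1000 steps:", residuals[-1000:].mean())
```

The rotation is chosen because repeatedly applying $T$ on its own would circle the fixed point without approaching it; the averaging in the slower update is what allows the residual to shrink. The sketch is not meant to reproduce the rate stated in the abstract.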

Paper Structure (29 sections, 15 theorems, 127 equations, 3 figures, 1 table)

Introduction
Related Work
Outline and Notation
Non-Expansive Mappings in Slower Time-Scale
Projected Variant
Proof Outline
Gradient Descent Operators in Slower Time-Scale
Applications
Minimax Optimization
Linear Stochastic Approximation
Constrained Optimization with Lagrangian Multipliers
Numerical Experiments
Conclusion
Proofs required for Theorem \ref{thm:main}
Proof for Lemma \ref{lemma:xstar-lip}
...and 14 more sections

Key Result

Theorem 2.6

Suppose that Assumptions \ref{f-contrac}-\ref{assu:stepsize} hold. Let $\{x_k,y_k\}$ be generated by \eqref{iter-main} with $\beta/\alpha\leq \gamma_1$ and $K_1\geq \gamma_2$. Then,

Figures (3)

Figure 1: Linear Stochastic Approximation: effect of stepsize sequence on residual errors
Figure 2: Minimax Optimization: residual errors for the two time-scales
Figure 3: Constrained optimization: (a) effect of stepsize on residual error, and (b) comparison with the case where there is no projection in the faster time-scale

Theorems & Definitions (30)

Theorem 2.6
Theorem 2.8
Lemma 3.1
Lemma 3.2
Lemma 3.3
Lemma 3.4
Lemma 3.5
Theorem 4.3
Corollary 5.2
Corollary 5.4
...and 20 more
