Finite-Time Analysis of Projected Two-Time-Scale Stochastic Approximation

Yitao Bai, Thinh T. Doan, Justin Romberg

Abstract

We study the finite-time convergence of projected linear two-time-scale stochastic approximation with constant step sizes and Polyak--Ruppert averaging. We establish an explicit mean-square error bound that decomposes into two interpretable components: an approximation error determined by the constrained subspace, and a statistical error that decays at a sublinear rate. The constants in the bound are expressed through restricted stability margins and a coupling invertibility condition, and they cleanly separate the effect of the subspace choice (approximation error) from the effect of the averaging horizon (statistical error). We illustrate our theoretical results with numerical experiments on both synthetic and reinforcement learning problems.

Paper Structure

This paper contains 9 sections, 2 theorems, 35 equations, 3 figures.

Key Result

Theorem 1

Suppose that Assumptions as:block_stab--as:md hold. Let $\alpha,\beta>0$ be constant step sizes with $\beta/\alpha\ll 1$, $\alpha < 1/\|A_{ff}\|_2$, and $\beta < 1/\|A_{ss}\|_2$. Then we have ... where $L_x,L_y$ are the statistical constants and the approximation constants defined as ...
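As a rough illustration of the setting in Theorem 1, the following Python sketch checks the quoted step-size conditions and runs a generic linear two-time-scale iteration with constant step sizes and Polyak--Ruppert averaging. Several parts are assumptions rather than the paper's definitions: the update form (the standard coupled linear recursion with blocks $A_{ff}, A_{fs}, A_{sf}, A_{ss}$), the names `step_sizes_ok` and `averaged_ttsa`, the use of `beta < alpha` as a weak stand-in for $\beta/\alpha\ll 1$, and the way the projection is applied (an orthogonal projector after each step). The paper's projected scheme may differ.

```python
import numpy as np


def step_sizes_ok(A_ff, A_ss, alpha, beta):
    """Check the constant step-size conditions quoted in Theorem 1.

    The theorem asks for beta/alpha << 1; `beta < alpha` is only a weak,
    illustrative proxy for that requirement (our assumption).
    """
    return bool(
        0.0 < beta < alpha
        and alpha < 1.0 / np.linalg.norm(A_ff, 2)  # spectral norm of A_ff
        and beta < 1.0 / np.linalg.norm(A_ss, 2)   # spectral norm of A_ss
    )


def averaged_ttsa(A_ff, A_fs, A_sf, A_ss, b_f, b_s, alpha, beta,
                  n_iters, P_x=None, P_y=None, noise_std=0.1, seed=0):
    """Generic linear two-time-scale SA with Polyak--Ruppert averaging.

    Assumed (hypothetical) model: noisy evaluations of
        A_ff x + A_fs y - b_f   (fast variable x, step size alpha)
        A_sf x + A_ss y - b_s   (slow variable y, step size beta)
    P_x, P_y are optional orthogonal projectors applied after each step.
    """
    rng = np.random.default_rng(seed)
    x, y = np.zeros(len(b_f)), np.zeros(len(b_s))
    P_x = np.eye(len(b_f)) if P_x is None else P_x
    P_y = np.eye(len(b_s)) if P_y is None else P_y
    x_bar, y_bar = np.zeros_like(x), np.zeros_like(y)
    for k in range(1, n_iters + 1):
        # Both noisy residuals use the iterates from the previous step.
        g_x = A_ff @ x + A_fs @ y - b_f + noise_std * rng.standard_normal(x.shape)
        g_y = A_sf @ x + A_ss @ y - b_s + noise_std * rng.standard_normal(y.shape)
        x = P_x @ (x - alpha * g_x)  # fast update, then projection
        y = P_y @ (y - beta * g_y)   # slow update, then projection
        # Running (Polyak--Ruppert) averages of the iterates.
        x_bar += (x - x_bar) / k
        y_bar += (y - y_bar) / k
    return x_bar, y_bar


if __name__ == "__main__":
    I2 = np.eye(2)
    A_ff, A_fs, A_sf, A_ss = 2.0 * I2, 0.5 * I2, 0.3 * I2, 1.0 * I2
    b_f, b_s = np.ones(2), np.ones(2)
    print(step_sizes_ok(A_ff, A_ss, alpha=0.1, beta=0.01))
    print(averaged_ttsa(A_ff, A_fs, A_sf, A_ss, b_f, b_s, 0.1, 0.01, 20000))
```

With identity projectors the averaged iterates should approach the solution of the coupled linear system. With a nontrivial projector they stay in the chosen subspace, which is where an approximation error of the kind described in the abstract would arise.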

Figures (3)

  • Figure 3: Error decomposition of projected TTSA.
  • Figure 4: Synthetic coupled system.
  • Figure 5: Performance of GTD under different $\Phi$.

Theorems & Definitions (4)

  • Theorem 1
  • Remark 1
  • Lemma E.1: Projected linear solve
  • proof