A Non-Asymptotic Theory of Seminorm Lyapunov Stability: From Deterministic to Stochastic Iterative Algorithms
Zaiwei Chen, Sheng Zhang, Zhe Zhang, Shaan Ul Haque, Siva Theja Maguluri
TL;DR
This work develops a non-asymptotic theory for fixed-point problems governed by seminorm-contractive operators. It establishes a Seminorm Fixed-Point Theorem and a seminorm Lyapunov stability framework that extend classical results to seminorms whose kernels may contain unstable subspaces. It then builds a Lyapunov-based finite-sample analysis of Markovian stochastic approximation (SA), covering both the general and the linear case, without requiring Hurwitz stability. The theory is applied to average-reward reinforcement learning (RL), yielding finite-sample guarantees for TD($\lambda$) with linear function approximation and for synchronous Q-learning, all within the seminorm contraction paradigm. Overall, the paper provides a unified, non-asymptotic toolkit for analyzing fixed-point problems and RL algorithms whose Bellman or fixed-point operators are contractive only with respect to a seminorm rather than a norm.
Abstract
We study the problem of solving fixed-point equations for seminorm-contractive operators and establish foundational results on the non-asymptotic behavior of iterative algorithms in both deterministic and stochastic settings. Specifically, in the deterministic setting, we prove a fixed-point theorem for seminorm-contractive operators, showing that iterates converge geometrically to the kernel of the seminorm. In the stochastic setting, we analyze the corresponding stochastic approximation (SA) algorithm under seminorm-contractive operators and Markovian noise, providing a finite-sample analysis for various stepsize choices. A benchmark for equation solving is linear systems of equations, where the convergence behavior of fixed-point iteration is closely tied to the stability of linear dynamical systems. In this special case, our results provide a complete characterization of system stability with respect to a seminorm, linking it to the solution of a Lyapunov equation in terms of positive semi-definite matrices. In the stochastic setting, we establish a finite-sample analysis for linear Markovian SA without requiring the Hurwitzness assumption. Our theoretical results offer a unified framework for deriving finite-sample bounds for various reinforcement learning algorithms in the average reward setting, including TD($\lambda$) for policy evaluation (which is a special case of solving a Poisson equation) and Q-learning for control.
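The deterministic claim can be illustrated on a small average-reward policy-evaluation problem. The sketch below is not taken from the paper: the transition matrix `P`, the rewards `r`, and all function names are hypothetical, and it assumes the standard fact that an entrywise-positive stochastic matrix contracts the span seminorm `max(x) - min(x)` by its Dobrushin ergodicity coefficient. The operator `T(v) = r + P v` has no fixed point in the usual sense and its iterates grow without bound, yet their distance to a Poisson-equation solution shrinks geometrically when measured in the span seminorm, whose kernel is the set of constant vectors.

```python
import numpy as np

def span(x):
    """Span seminorm max(x) - min(x); it vanishes exactly on constant vectors."""
    return float(np.max(x) - np.min(x))

def ergodicity_coefficient(P):
    """Dobrushin coefficient: 1 - min over row pairs (i, k) of sum_j min(P[i, j], P[k, j])."""
    overlap = np.minimum(P[:, None, :], P[None, :, :]).sum(axis=2)
    return float(1.0 - overlap.min())

def poisson_solution(P, r):
    """Solve the Poisson equation (I - P) v + rho * 1 = r for (v, rho) by least squares."""
    n = len(r)
    A = np.hstack([np.eye(n) - P, np.ones((n, 1))])
    sol, *_ = np.linalg.lstsq(A, r, rcond=None)
    return sol[:n], float(sol[n])

def seminorm_errors(P, r, num_iters=30):
    """Iterate v <- r + P v and record span(v - v_star) at every step."""
    v_star, _ = poisson_solution(P, r)
    v = np.zeros_like(r)
    errors = [span(v - v_star)]
    for _ in range(num_iters):
        v = r + P @ v
        errors.append(span(v - v_star))
    return v, errors

# Synthetic example (assumption): a 5-state chain with strictly positive transitions.
rng = np.random.default_rng(0)
P = rng.uniform(0.5, 1.0, size=(5, 5))
P /= P.sum(axis=1, keepdims=True)
r = rng.uniform(1.0, 2.0, size=5)

gamma = ergodicity_coefficient(P)      # contraction factor in the span seminorm
v_final, errors = seminorm_errors(P, r)
```

Here `v_final` itself keeps growing, since every reward is at least 1, while `errors` decays at least as fast as `gamma ** k`. This is the sense in which convergence holds only up to the kernel of the seminorm.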
