Fixed Point Theory Analysis of a Lambda Policy Iteration with Randomization for the Ćirić Contraction Operator
Abdelkader Belhenniche, Roman Chertovskih
TL;DR
This paper extends Lambda policy iteration with randomization (λ-PIR) to the broader class of Ćirić contractions within a fixed-point framework, moving beyond standard Banach contractions to weak contractions. It defines the operators $F_\mu$, $F$, and the λ-operator $F_\mu^{\lambda}$, proves existence and uniqueness of the fixed points $V^{*}$ and $V_{\mu}$, and establishes almost-sure convergence of the λ-PIR scheme in infinite-dimensional policy spaces. Key results include $V^{*}=\inf_{\mu} V_{\mu}$ and a contraction-like bound for $F_\mu^{\lambda}$ with factor $\rho$, which ensures convergence under weaker assumptions. The work extends theoretical convergence guarantees for RL-based feedback control to discontinuous or weakly contractive mappings.
Abstract
We apply methods of fixed point theory to the Lambda policy iteration with randomization algorithm for weak contraction mappings. This class of mappings, which includes Ćirić contractions, is broader than the strong contractions typically considered in the literature. Specifically, we explore the properties of reinforcement learning procedures developed for feedback control within the framework of fixed point theory. Under relatively general assumptions, we identify sufficient conditions for convergence with probability one in infinite-dimensional policy spaces.
