Fixed Point Theory Analysis of a Lambda Policy Iteration with Randomization for the Ćirić Contraction Operator

Abdelkader Belhenniche; Roman Chertovskih

TL;DR

This paper extends Lambda policy iteration with randomization (λ-PIR) to the broader class of Ćirić contractions within a fixed-point framework, moving beyond standard Banach contractions to weak contractions. It defines the operators $F_\mu$, $F$, and the λ-operator $F_\mu^{\lambda}$, proves existence and uniqueness of the fixed points $V^{*}$ and $V_{\mu}$, and establishes almost-sure convergence of the λ-PIR scheme in infinite-dimensional policy spaces. Key results include $V^{*}=\inf_{\mu} V_{\mu}$ and a contraction-like bound for $F_\mu^{\lambda}$ with contraction factor $\rho$, which ensures convergence under weaker assumptions. The work broadens RL-based feedback-control guarantees to discontinuous or weakly contractive mappings, offering theoretical convergence guarantees in infinite-dimensional policy settings.
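To make the role of the λ-operator concrete, the following minimal sketch evaluates a fixed policy on a three-state discounted problem. It is an illustration under stated assumptions, not the paper's construction: it assumes the standard multistep form $F_\mu^{\lambda} V = (1-\lambda)\sum_{l \ge 0} \lambda^{l} F_\mu^{l+1} V$, it uses the classical discounted (Banach) case rather than a genuinely Ćirić-type operator, and the reward vector, transition matrix, discount factor `alpha`, and all function names are hypothetical. In this special case the sup-norm modulus of the λ-operator is known to be at most $\alpha(1-\lambda)/(1-\alpha\lambda)$, which the code checks numerically.

```python
# Illustrative sketch only: discounted finite-state policy evaluation,
# i.e. the Banach special case, not the Ćirić setting of the paper.
# All names and numbers below are hypothetical.

def apply_policy_operator(v, reward, transition, alpha):
    """One application of F_mu: (F_mu v)(x) = r(x) + alpha * sum_y P(x, y) v(y)."""
    n = len(v)
    return [
        reward[x] + alpha * sum(transition[x][y] * v[y] for y in range(n))
        for x in range(n)
    ]

def apply_lambda_operator(v, reward, transition, alpha, lam, terms=200):
    """Truncated F_mu^lambda v = (1 - lam) * sum_{l < terms} lam**l * F_mu^(l+1) v."""
    result = [0.0] * len(v)
    power = list(v)
    for l in range(terms):
        power = apply_policy_operator(power, reward, transition, alpha)
        weight = (1.0 - lam) * lam ** l
        result = [r + weight * p for r, p in zip(result, power)]
    return result

def sup_distance(u, v):
    """Sup-norm distance between two value vectors."""
    return max(abs(a - b) for a, b in zip(u, v))

reward = [1.0, 0.0, 2.0]
transition = [
    [0.5, 0.5, 0.0],
    [0.1, 0.6, 0.3],
    [0.0, 0.2, 0.8],
]
alpha, lam = 0.9, 0.5

# Contraction check: compare the observed ratio with the classical bound.
u = [0.0, 0.0, 0.0]
w = [10.0, -4.0, 3.0]
ratio = sup_distance(
    apply_lambda_operator(u, reward, transition, alpha, lam),
    apply_lambda_operator(w, reward, transition, alpha, lam),
) / sup_distance(u, w)
bound = alpha * (1.0 - lam) / (1.0 - alpha * lam)

# Fixed-point check: iterating F_mu^lambda should approach the fixed point of F_mu.
value = [0.0, 0.0, 0.0]
for _ in range(200):
    value = apply_lambda_operator(value, reward, transition, alpha, lam)
residual = sup_distance(value, apply_policy_operator(value, reward, transition, alpha))

print(f"observed ratio {ratio:.4f} <= bound {bound:.4f}")
print(f"residual of F_mu at the limit: {residual:.2e}")
```

The truncation at 200 terms drops a weight of order `lam ** 200`, which is negligible here. Under a Ćirić-type condition the one-step bound takes a different form, so this check should be read only as intuition for why $F_\mu^{\lambda}$ and $F_\mu$ share the fixed point $V_\mu$.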

Abstract

We apply methods of fixed point theory to the Lambda policy iteration with randomization algorithm for weak contraction mappings. This class of mappings, which includes the Ćirić contraction, is broader than the strong contractions typically considered in the literature. Specifically, we study the properties of reinforcement learning procedures developed for feedback control within the framework of fixed point theory. Under relatively general assumptions, we identify sufficient conditions for convergence with probability one in infinite-dimensional policy spaces.

Paper Structure (6 sections, 8 theorems, 37 equations)

Introduction
Preliminaries and Auxiliary Results
Main Results
$\lambda$-Policy Iteration With Randomization
Convergence Of The $\lambda$-PIR Algorithm
Conclusion

Key Result

lemma 1

$B(X)$ is complete with respect to the topology induced by $\| \cdot \|$.

Theorems & Definitions (14)

lemma 1
definition 1
theorem 3
lemma 2
proposition 1
proof
remark 1
proposition 2
proof
theorem 6
...and 4 more
