
Fixed Point Theory Analysis of a Lambda Policy Iteration with Randomization for the Ćirić Contraction Operator

Abdelkader Belhenniche, Roman Chertovskih

TL;DR

This paper extends Lambda policy iteration with randomization (λ-PIR) from standard Banach contractions to the broader class of weak contractions, in particular Ćirić contractions, within a fixed-point framework. It defines the operators $F_\mu$, $F$, and the λ-operator $F_\mu^{\lambda}$, proves existence and uniqueness of the fixed points $V^{*}$ and $V_{\mu}$, and establishes almost-sure convergence of the λ-PIR scheme in infinite-dimensional policy spaces. Key results include the identity $V^{*}=\inf_{\mu} V_{\mu}$ and a contraction-like bound for $F_\mu^{\lambda}$ with contraction factor $\rho$, which together ensure convergence under weaker assumptions than the Banach setting requires. The work thereby extends convergence guarantees for RL-based feedback control to discontinuous or weakly contractive mappings in infinite-dimensional policy spaces.

Abstract

We apply methods of fixed point theory to a Lambda policy iteration with randomization algorithm for weak contraction mappings. This class of mappings, which includes Ćirić contractions, is broader than the strong contractions typically considered in the literature. Specifically, we explore the characteristics of reinforcement learning procedures developed for feedback control within the context of fixed point theory. Under relatively general assumptions, we identify sufficient conditions for convergence with probability one in infinite-dimensional policy spaces.
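
For orientation, the distinction between strong and weak contractions mentioned above can be written out as follows. This is a sketch of the standard textbook conditions, not a quotation from the paper: the symbols $T$, $d$, $\alpha$, and $q$ are assumptions introduced here for illustration, and the paper may work with a different variant of the Ćirić condition. A Banach (strong) contraction $T$ on a metric space $(X,d)$ satisfies, for some $\alpha \in [0,1)$ and all $x,y \in X$,

$$
d(Tx,Ty) \le \alpha\, d(x,y),
$$

whereas a Ćirić quasi-contraction only requires, for some $q \in [0,1)$ and all $x,y \in X$,

$$
d(Tx,Ty) \le q \max\{\, d(x,y),\ d(x,Tx),\ d(y,Ty),\ d(x,Ty),\ d(y,Tx) \,\}.
$$

Every Banach contraction satisfies the second condition with $q=\alpha$, since $d(x,y)$ is one of the terms in the maximum. The converse fails: the second condition does not force $T$ to be continuous, which is why this class can accommodate discontinuous mappings.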

Paper Structure

This paper contains 6 sections, 8 theorems, 37 equations.

Key Result

lemma 1

$B(X)$ is complete with respect to the topology induced by $\| \cdot \|$.

Theorems & Definitions (14)

  • lemma 1
  • definition 1
  • theorem 3
  • lemma 2
  • proposition 1
  • proof
  • remark 1
  • proposition 2
  • proof
  • theorem 6
  • ...and 4 more