Absorbing Markov Decision Processes

François Dufour, Tomás Prieto-Rumeau

TL;DR

This paper analyzes discrete-time absorbing Markov Decision Processes with general state spaces and Borel action spaces, focusing on the relationship between the characteristic equation and occupation measures. It introduces phantom measures and establishes necessary and sufficient conditions to guarantee that every solution to the characteristic equation is an occupation measure, with a refined result under continuity-compactness (Condition (S)) that a measure is an occupation measure precisely when it satisfies the characteristic equation and an absolute-continuity condition with respect to a reference measure $\eta^{\beta}$. The authors define and leverage the notion of a uniformly absorbing model to prove that the set of occupation measures is compact in the weak-strong topology if and only if the model is uniformly absorbing, highlighting that mere absorption does not ensure compactness. The work also connects these structural results to constrained optimization via linear programming and provides concrete examples illustrating phantom measures and the necessity of uniform absorbency, with the discounted case recovering classical compactness results.

Abstract

In this paper, we study discrete-time absorbing Markov Decision Processes (MDP) with measurable state space and Borel action space with a given initial distribution. For such models, solutions to the characteristic equation that are not occupation measures may exist. Several necessary and sufficient conditions are provided to guarantee that any solution to the characteristic equation is an occupation measure. Under the so-called continuity-compactness conditions, it is shown that the set of occupation measures is compact in the weak-strong topology if and only if the model is uniformly absorbing. Finally, it is shown that the occupation measures are characterized by the characteristic equation and an additional condition. Several examples are provided to illustrate our results.
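
The claim that a solution to the characteristic equation need not be an occupation measure can be illustrated on a finite toy model. The sketch below is not taken from the paper: the states, transition probabilities, policy, and the finite-state form of the characteristic equation used here (the state marginal of $\mu$ equals $\eta + \mu Q$ on non-absorbing states) are all assumptions made for illustration. It computes the occupation measure of one stationary policy by summing the pre-absorption state-action laws, then adds mass on a self-looping state that is never reached from the initial distribution.

```python
# Toy finite absorbing MDP; every number here is an illustrative assumption.
# States 0 and 1 lead to absorption, state 2 is the absorbing state, and
# state 3 loops on itself but is unreachable from the initial distribution.
ABSORBING = {2}
STATES = [0, 1, 2, 3]
ACTIONS = [0, 1]
ETA = {0: 1.0, 1: 0.0, 2: 0.0, 3: 0.0}  # initial distribution

# Q[(x, a)] maps next state -> transition probability.
Q = {
    (0, 0): {1: 0.5, 2: 0.5},
    (0, 1): {0: 0.5, 2: 0.5},
    (1, 0): {0: 0.25, 2: 0.75},
    (1, 1): {2: 1.0},
    (3, 0): {3: 1.0},
    (3, 1): {3: 1.0},
}
# Stationary randomized policy: POLICY[x][a] = probability of action a at x.
POLICY = {0: [0.5, 0.5], 1: [1.0, 0.0], 3: [1.0, 0.0]}

NON_ABSORBING = [x for x in STATES if x not in ABSORBING]


def occupation_measure(n_steps=500):
    """Sum the state-action laws before absorption over n_steps stages."""
    nu = {x: ETA[x] for x in NON_ABSORBING}  # law of X_t before absorption
    mu = {(x, a): 0.0 for x in NON_ABSORBING for a in ACTIONS}
    for _ in range(n_steps):
        nxt = {x: 0.0 for x in NON_ABSORBING}
        for x in NON_ABSORBING:
            for a in ACTIONS:
                mass = nu[x] * POLICY[x][a]
                mu[(x, a)] += mass
                for y, p in Q[(x, a)].items():
                    if y in nxt:  # mass entering the absorbing set is dropped
                        nxt[y] += mass * p
        nu = nxt
    return mu


def residual(mu):
    """Largest violation of mu^X(y) = eta(y) + sum_{x,a} mu(x,a) Q(y|x,a)."""
    worst = 0.0
    for y in NON_ABSORBING:
        lhs = sum(mu[(y, a)] for a in ACTIONS)
        rhs = ETA[y] + sum(m * Q[xa].get(y, 0.0) for xa, m in mu.items())
        worst = max(worst, abs(lhs - rhs))
    return worst


mu = occupation_measure()

# Candidate "phantom" measure: same as mu plus unit mass on the self-loop.
phantom = dict(mu)
phantom[(3, 0)] += 1.0

print("residual of occupation measure:", residual(mu))
print("residual of phantom candidate: ", residual(phantom))
```

Both residuals vanish up to floating-point error, yet the second measure charges state 3, which no policy visits when starting from the initial distribution, so it cannot be the occupation measure of any policy in this toy model.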

Paper Structure

This paper contains 6 sections, 18 theorems, 82 equations.

Key Result

Lemma 1.1

Let $(\mathbf{\Omega},\mathcal{F})$ be a measurable space and let $\mathbf{S}$ be a Polish space. Let $\Phi:\mathbf{\Omega}\to 2^{\mathbf{S}}$ be a weakly measurable correspondence with nonempty closed values, and let $\mathbf{K}$ be the graph of the correspondence. For every $\mu\in$ [...] and such that $Q(\Phi(\omega)|\omega)=1$ for each $\omega\in\mathbf{\Omega}$. Moreover, $Q$ is unique [...]

Theorems & Definitions (28)

  • Lemma 1.1: Disintegration lemma
  • Definition 2.1
  • Definition 2.2
  • Definition 2.3
  • Lemma 2.4
  • Definition 3.1
  • Lemma 3.2
  • Proposition 3.3
  • Definition 3.4
  • Example 3.5
  • ...and 18 more