On the Feinberg-Piunovskiy Theorem and its extension to chattering policies

Francois Dufour; Tomas Prieto-Rumeau

TL;DR

This work extends the Feinberg-Piunovskiy theorem to general state spaces by analyzing the geometry of occupation measures in uniformly absorbing Markov decision processes. It shows that constrained optimality can be achieved using chattering stationary policies of order $d+1$, and, when the model is atomless, that deterministic stationary policies suffice. The authors develop two distinct proofs and leverage results on extreme points, disintegration, and Young measures to relate chattering, deterministic Markov, and deterministic stationary policies. The findings provide a robust, general framework for policy sufficiency in constrained and absorbing MDPs, with potential implications for both theory and applications in areas like game theory and stochastic control.

Abstract

The Feinberg-Piunovskiy Theorem, established in [14, Theorem 3.8], asserts that for a discrete-time, uniformly absorbing and atomless Markov Decision Process (MDP) with Borel state space and multiple criteria, the family of deterministic stationary policies is a sufficient class of policies. In this paper, we study some related problems and some extensions. In particular, dropping the atomless hypothesis, we establish that the set of chattering stationary policies is a sufficient class of policies for uniformly absorbing MDPs with measurable state space and multiple criteria. We also prove the Feinberg-Piunovskiy Theorem in the context of a measurable state space in two different ways, both of which differ from the proof in [14]. In particular, we show that the sufficiency of chattering stationary policies directly yields the sufficiency of deterministic stationary policies for atomless models. Our approach is partially based on the analysis of extreme points of certain convex sets of occupation measures satisfying integral-type constraints. We show that, for a uniformly absorbing model, an extreme point of such a set is necessarily an occupation measure induced by a chattering stationary policy of order $d+1$, where $d$ is the dimension of the vector of constraints. When, in addition, the model $\mathsf{M}$ is atomless, the extreme points of this constrained set of occupation measures are precisely the occupation measures generated by deterministic stationary policies satisfying these constraints.
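
For orientation, a chattering stationary policy of order $d+1$ is usually understood as a state-dependent mixture of at most $d+1$ deterministic stationary policies. The display below is a sketch of that standard form, not a quotation from the paper: the symbols $\pi$, $\alpha_i$, $f_i$, the state space $\mathbf{X}$ and the action sets $\mathbf{A}(x)$ are notation assumed here for illustration.

```latex
% Assumed notation: \alpha_i are measurable weight functions on the state space,
% f_i are measurable selectors with f_i(x) \in \mathbf{A}(x), and \delta_a is the Dirac measure at a.
\[
  \pi(da \mid x) \;=\; \sum_{i=1}^{d+1} \alpha_i(x)\, \delta_{f_i(x)}(da),
  \qquad \alpha_i(x) \ge 0, \quad \sum_{i=1}^{d+1} \alpha_i(x) = 1
  \quad \text{for all } x \in \mathbf{X}.
\]
```

Under this reading, a deterministic stationary policy is the special case in which a single weight $\alpha_i$ is identically equal to $1$.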
