Initial Distribution Sensitivity of Constrained Markov Decision Processes

Alperen Tercan; Necmiye Ozay

TL;DR

This work addresses how CMDP performance depends on the initial state distribution by developing three bounds on $V^*(\beta)$ using LP duality, perturbation theory, and concavity of LP values. The bounds enable assessing robustness and deriving inner approximations to $(0,\epsilon)$-regret sets without re-solving the CMDP for every $\beta$, and they are validated on random CMDPs and a water-pendulum example. The key contributions include a practical duality-based bound, a perturbation-based bound with both upper and lower guarantees, and a concavity-based bound, plus demonstrations of robustness analysis and minimal-regret computation over distribution sets. These results offer efficient tools for planning under initial-distribution uncertainty and for designing policies that remain near-optimal as $\beta$ varies, with potential extensions to uncertain transition dynamics.

Abstract

Constrained Markov Decision Processes (CMDPs) are notably more complex to solve than standard MDPs due to the absence of universally optimal policies across all initial state distributions. This necessitates re-solving the CMDP whenever the initial distribution changes. In this work, we analyze how the optimal value of CMDPs varies with different initial distributions, deriving bounds on these variations using duality analysis of CMDPs and perturbation analysis in linear programming. Moreover, we show how such bounds can be used to analyze the regret of a given policy due to unknown variations of the initial distribution.
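
To illustrate how LP duality can bound the optimal value across initial distributions, here is a minimal sketch. It assumes the standard discounted occupancy-measure LP for a reward-maximizing CMDP with a single cost constraint. The symbols $x$, $r$, $d$, $c$, $\gamma$, $P$, $v$, $\lambda$, and $\beta_0$ are introduced here for illustration only, and the paper's own formulation, normalization, and notation may differ.

$$
\begin{aligned}
V^*(\beta) \;=\; \max_{x \ge 0}\;\; & \sum_{s,a} r(s,a)\, x(s,a) \\
\text{s.t.}\;\; & \sum_{a} x(s',a) \;-\; \gamma \sum_{s,a} P(s' \mid s,a)\, x(s,a) \;=\; \beta(s') \quad \forall s', \\
& \sum_{s,a} d(s,a)\, x(s,a) \;\le\; c .
\end{aligned}
$$

Its dual is

$$
\begin{aligned}
\min_{v,\; \lambda \ge 0}\;\; & \sum_{s} \beta(s)\, v(s) \;+\; \lambda c \\
\text{s.t.}\;\; & v(s) \;-\; \gamma \sum_{s'} P(s' \mid s,a)\, v(s') \;+\; \lambda\, d(s,a) \;\ge\; r(s,a) \quad \forall s,a .
\end{aligned}
$$

The dual feasible set does not depend on $\beta$. So if $(v_0, \lambda_0)$ is dual optimal at a nominal distribution $\beta_0$ (assuming the primal at $\beta_0$ is feasible and bounded, so that strong duality holds), it remains dual feasible for every other $\beta$, and weak duality gives

$$
V^*(\beta) \;\le\; \sum_{s} \beta(s)\, v_0(s) + \lambda_0 c \;=\; V^*(\beta_0) + \sum_{s} \bigl(\beta(s) - \beta_0(s)\bigr)\, v_0(s) .
$$

Under these assumptions, a single solve at $\beta_0$ yields an upper bound for all $\beta$. The same dual form writes $V^*$ as a pointwise minimum of functions linear in $\beta$, so $V^*$ is concave wherever the primal is feasible. Consequently, for any convex combination $\beta = \sum_i \theta_i \beta_i$, we get the lower bound $V^*(\beta) \ge \sum_i \theta_i V^*(\beta_i)$.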
