Average-Cost MDPs with Infinite State and Action Sets: New Sufficient Conditions for Optimality Inequalities and Equations

Eugene A. Feinberg, Pavlo O. Kasyanov, Liliia S. Paliichuk

TL;DR

This work advances average-cost MDP theory for infinite state and action spaces by proving that optimality inequalities and equations hold under weaker continuity and compactness assumptions than the classical Assumption B. The authors introduce W* and S* as weak forms of transition continuity, define a weakened version of Assumption B via sequences $\{\alpha_n\}$, and show that the WACOI and ACOE hold with corresponding relative value functions $u$ derived from discounted costs. These results guarantee the existence of deterministic optimal policies even when costs or transitions lack strong continuity, and corollaries connect limits of discounted problems to the average-cost criterion. The findings broaden applicability to problems with noncompact action sets and discontinuous costs, with relevance to inventory control, POMDPs, and related infinite-horizon MDP settings.
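The idea of deriving a relative value function $u$ from discounted costs can be illustrated on a toy problem. The sketch below is not the paper's construction: it uses a hypothetical two-state, two-action finite MDP (far simpler than the Borel setting studied here), and the names `discounted_values`, `vanishing_discount`, and `acoe_residual` are illustrative. It assumes the standard vanishing-discount notation, with $v_\alpha$ the optimal discounted value, $m_\alpha = \min_x v_\alpha(x)$, and $u_\alpha = v_\alpha - m_\alpha$, and it checks numerically that $(1-\alpha_n)m_{\alpha_n}$ and $u_{\alpha_n}$ approximately satisfy the average-cost optimality equation as $\alpha_n \uparrow 1$.

```python
# Toy finite MDP (hypothetical data, not taken from the paper).
# P[x][a] lists next-state probabilities; c[x][a] is the one-step cost.
P = {0: {0: [1.0, 0.0], 1: [0.0, 1.0]},
     1: {0: [1.0, 0.0], 1: [0.0, 1.0]}}
c = {0: {0: 2.0, 1: 1.0},
     1: {0: 0.0, 1: 3.0}}
STATES = [0, 1]


def q_value(x, a, v, alpha):
    """One-step cost plus alpha-discounted expected continuation value."""
    return c[x][a] + alpha * sum(p * v[y] for y, p in enumerate(P[x][a]))


def discounted_values(alpha, tol=1e-9):
    """Value iteration for the alpha-discounted optimal value function."""
    v = [0.0 for _ in STATES]
    while True:
        new_v = [min(q_value(x, a, v, alpha) for a in P[x]) for x in STATES]
        if max(abs(n - o) for n, o in zip(new_v, v)) < tol:
            return new_v
        v = new_v


def vanishing_discount(alphas):
    """For each alpha return (alpha, (1 - alpha) * m_alpha, u_alpha)."""
    rows = []
    for alpha in alphas:
        v = discounted_values(alpha)
        m = min(v)                    # m_alpha = min_x v_alpha(x)
        u = [vx - m for vx in v]      # u_alpha = v_alpha - m_alpha >= 0
        rows.append((alpha, (1.0 - alpha) * m, u))
    return rows


def acoe_residual(w, u):
    """Largest violation of w + u(x) = min_a [c(x,a) + sum_y P(y|x,a) u(y)]."""
    return max(abs(w + u[x] - min(q_value(x, a, u, 1.0) for a in P[x]))
               for x in STATES)


if __name__ == "__main__":
    for alpha, w, u in vanishing_discount([0.9, 0.99, 0.999]):
        print(alpha, round(w, 5), [round(ux, 5) for ux in u],
              round(acoe_residual(w, u), 5))
```

In this toy example the best stationary policy alternates between the two states with costs 1 and 0, so the optimal average cost is 0.5, and the printed values of $(1-\alpha)m_\alpha$ approach it as $\alpha$ increases. In the general setting of the paper such limits need not exist or behave this well, which is why conditions such as Assumption B and its weakened variants matter.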

Abstract

This paper studies discrete-time average-cost infinite-horizon Markov decision processes (MDPs) with Borel state and action sets. It introduces new sufficient conditions for the validity of optimality inequalities and optimality equations for MDPs with weakly and setwise continuous transition probabilities. These inequalities and equations imply the existence of deterministic optimal policies.

Paper Structure

This paper contains 3 sections, 8 theorems, 44 equations.

Key Result

Theorem 3.1

Let Assumption B(i) hold and let $\{ \alpha_n\uparrow 1 \}_{n\in\mathbb{N}^*}$ be an arbitrary fixed sequence. If there exist a measurable function $u:\mathbb{X}\to [0,+\infty)$ and a deterministic policy $\phi$ such that [...] (eq7111), then $\phi$ is average-cost optimal, [...] for each $x\in \mathbb{X},$ and the WACOI (eqn:ACOIOver) holds for the same policy $\phi$ and function $u$ as in (eq7111).

Theorems & Definitions (13)

  • Definition 2.1: Feinberg et al. FKZ13, Feinberg Ftut
  • Theorem 3.1
  • Theorem 3.2
  • Theorem 3.3
  • Corollary 3.4
  • Definition 3.5: Semi-equicontinuity FKL18a
  • Definition 3.6: Equicontinuity
  • Corollary 3.7
  • Theorem 3.8
  • Example 3.9
  • ...and 3 more