Average-Cost MDPs with Infinite State and Action Sets: New Sufficient Conditions for Optimality Inequalities and Equations
Eugene A. Feinberg, Pavlo O. Kasyanov, Liliia S. Paliichuk
TL;DR
This work advances average-cost MDP theory for infinite state and action spaces by proving that optimality inequalities and equations hold under weaker continuity and compactness assumptions than the classical Assumption B. The authors use W* and S*, assumptions built on weak and setwise continuity of transition probabilities respectively, define a weakened form of Assumption B along sequences $\{\alpha_n\}$ of discount factors, and show that WACOI and ACOE hold with relative value functions $u$ derived from discounted costs. These results guarantee the existence of deterministic optimal policies even when costs or transitions lack strong continuity, and corollaries connect limits of discounted problems to the average-cost criterion. The findings broaden applicability to problems with noncompact action sets and discontinuous costs, with relevance to inventory control, POMDPs, and related infinite-horizon MDP settings.
Abstract
This paper studies discrete-time average-cost infinite-horizon Markov decision processes (MDPs) with Borel state and action sets. It introduces new sufficient conditions for the validity of optimality inequalities and optimality equations for MDPs with weakly and setwise continuous transition probabilities. These inequalities and equations imply the existence of deterministic optimal policies.
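For orientation, the optimality inequality and equation mentioned above usually take the following form in the average-cost MDP literature. This is a sketch in standard notation, not a restatement of the paper's theorems: the symbols $X$ (state space), $A(x)$ (actions available at $x$), $c$ (one-step cost), $q$ (transition probability), $w$ (a constant) and $u$ (a relative value function) are assumed here, and the paper's exact statements and hypotheses may differ.

$$
w + u(x) \;\ge\; \min_{a \in A(x)} \Big[ c(x,a) + \int_X u(y)\, q(dy \mid x,a) \Big], \qquad x \in X \quad \text{(optimality inequality)},
$$

$$
w + u(x) \;=\; \min_{a \in A(x)} \Big[ c(x,a) + \int_X u(y)\, q(dy \mid x,a) \Big], \qquad x \in X \quad \text{(optimality equation)}.
$$

In results of this type, a deterministic policy that selects at each state $x$ an action attaining the minimum on the right-hand side is average-cost optimal, provided the conditions of the corresponding theorem hold.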
