Markov Decision Process Design: A Framework for Integrating Strategic and Operational Decisions
Seth Brown, Saumya Sinha, Andrew J Schaefer
TL;DR
The paper addresses the optimal design of systems intended for repeated use under uncertainty by integrating a one-time (static) design decision with a dynamic operational policy. It develops a bilevel mixed-integer linear programming (MIP) framework in which the leader selects design variables $\mathbf{x}$ and the followers solve $|\mathcal{K}|$ linear programs (LPs), each representing an infinite-horizon discounted-cost MDP whose costs are affine in $\mathbf{x}$. By replacing the follower value functions with variables and applying dualization together with big-M constraints, the authors obtain a single-level MIP reformulation that can be solved with off-the-shelf solvers, and they outline the conditions under which the bilevel formulation is preferable. The approach is demonstrated on reliability, inventory management, and queue design problems. A numerical study shows that realistically sized instances can be solved and highlights how complexity scales with MDP size, the number of scenarios, and the number of leader variables. The framework enables integrated strategic and operational decision making, with potential efficiency and profitability gains in practice.
Abstract
We consider the problem of optimally designing a system for repeated use under uncertainty. We develop a modeling framework that integrates design and operational phases, which are represented by a mixed-integer program and discounted-cost infinite-horizon Markov decision processes, respectively. We seek to simultaneously minimize the design costs and the subsequent expected operational costs. This problem setting arises naturally in several application areas, as we illustrate through examples. We derive a bilevel mixed-integer linear programming formulation for the problem and perform a computational study to demonstrate that realistic instances can be solved numerically.
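To make the operational phase concrete, the following is a minimal sketch of the standard linear programming form of a discounted-cost MDP for a fixed design $\mathbf{x}$. The notation is assumed for illustration and is not taken from the paper: $\mathcal{S}$ and $\mathcal{A}$ are the state and action sets, $\gamma \in [0,1)$ is the discount factor, $P_k(s' \mid s, a)$ are the transition probabilities of follower $k \in \mathcal{K}$, $\alpha_k(s) > 0$ are initial-state weights, and the one-step cost is written as $c_k(s,a) + \mathbf{d}_k(s,a)^{\top}\mathbf{x}$ to reflect costs that are affine in the design.

$$
\begin{aligned}
z_k(\mathbf{x}) \;=\; \max_{v_k}\quad & \sum_{s \in \mathcal{S}} \alpha_k(s)\, v_k(s) \\
\text{s.t.}\quad & v_k(s) - \gamma \sum_{s' \in \mathcal{S}} P_k(s' \mid s, a)\, v_k(s') \;\le\; c_k(s,a) + \mathbf{d}_k(s,a)^{\top}\mathbf{x} \qquad \forall\, s \in \mathcal{S},\ a \in \mathcal{A}.
\end{aligned}
$$

When every $\alpha_k(s)$ is positive, the optimal $v_k$ is the optimal value function of the MDP, and $z_k(\mathbf{x})$ is the minimum expected discounted operational cost under the initial weights $\alpha_k$. Under the assumptions of this sketch, $\mathbf{x}$ enters only the right-hand side of the constraints, so each follower problem is an ordinary LP once the design is fixed.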
