Markov Decision Process Design: A Framework for Integrating Strategic and Operational Decisions

Seth Brown; Saumya Sinha; Andrew J Schaefer

TL;DR

The paper addresses optimal design of systems intended for repeated use under uncertainty by integrating a static design decision with a dynamic operational policy. It develops a bilevel linear MIP framework where the leader selects design variables $\mathbf{x}$ and the followers solve $|\mathcal{K}|$ LPs representing infinite-horizon discounted-cost MDPs, with costs affine in $\mathbf{x}$. By replacing follower value functions with variables and using dualization plus big-M, the authors provide a single-level MIP reformulation that can be solved with off-the-shelf solvers, while also outlining the conditions under which the bilevel formulation is preferable. The approach is demonstrated across reliability, inventory management, and queue design problems, with a numerical study showing feasibility for realistically sized instances and highlighting how complexity scales with MDP size, scenarios, and leader variables. This framework enables integrated strategic and operational decision making with potential efficiency and profitability gains in practice.
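As a rough sketch of the structure described above, a bilevel program of this kind can be written as follows. Only $\mathbf{x}$ and $\mathcal{K}$ come from the summary; the symbols $\mathbf{d}$, $A$, $\mathbf{b}$, $\mathcal{X}$, $p_k$, $\alpha_k$, $\gamma$, $P_k$, and $c_k$ are assumed here for illustration, and the paper's exact formulation may differ. The lower level is the standard linear program for a discounted-cost MDP, whose optimal solution is the value function when every $\alpha_k(s) > 0$.

$$
\begin{aligned}
\min_{\mathbf{x},\,\mathbf{v}} \quad & \mathbf{d}^\top \mathbf{x} + \sum_{k \in \mathcal{K}} p_k \sum_{s \in \mathcal{S}} \alpha_k(s)\, v_k(s) \\
\text{s.t.} \quad & A\mathbf{x} \ge \mathbf{b}, \quad \mathbf{x} \in \mathcal{X}, \\
& \mathbf{v}_k \in \arg\max_{\mathbf{v}} \Big\{ \sum_{s \in \mathcal{S}} \alpha_k(s)\, v(s) \;:\; v(s) \le c_k(s,a;\mathbf{x}) + \gamma \sum_{s' \in \mathcal{S}} P_k(s' \mid s,a)\, v(s') \;\; \forall s, a \Big\} \quad \forall k \in \mathcal{K},
\end{aligned}
$$

where each $c_k(s,a;\mathbf{x})$ is affine in $\mathbf{x}$, $p_k$ is the probability of scenario $k$, and $\gamma \in (0,1)$ is the discount factor.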

Abstract

We consider the problem of optimally designing a system for repeated use under uncertainty. We develop a modeling framework that integrates design and operational phases, which are represented by a mixed-integer program and discounted-cost infinite-horizon Markov decision processes, respectively. We seek to simultaneously minimize the design costs and the subsequent expected operational costs. This problem setting arises naturally in several application areas, as we illustrate through examples. We derive a bilevel mixed-integer linear programming formulation for the problem and perform a computational study to demonstrate that realistic instances can be solved numerically.
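To illustrate the integrated objective (design cost plus expected discounted operational cost), here is a minimal Python sketch on a toy reliability-style instance. It is not the authors' method: it enumerates binary designs and solves each MDP by value iteration rather than using a MIP reformulation, so it only scales to a handful of design variables. All names, the instance data, and the discount factor are hypothetical.

```python
from itertools import product


def value_iteration(costs, trans, gamma, tol=1e-10, max_iter=100_000):
    """Optimal value function of a finite discounted-cost MDP.

    costs[s][a] is the one-step cost of action a in state s;
    trans[s][a][t] is the probability of moving from s to t under a.
    """
    v = [0.0] * len(costs)
    for _ in range(max_iter):
        new_v = [
            min(
                costs[s][a]
                + gamma * sum(p * v[t] for t, p in enumerate(trans[s][a]))
                for a in range(len(costs[s]))
            )
            for s in range(len(costs))
        ]
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


def affine_costs(base, slope, x):
    """Costs affine in the design: base[s][a] + sum_j slope[s][a][j] * x[j]."""
    return [
        [b + sum(w * xj for w, xj in zip(ws, x)) for b, ws in zip(b_row, w_row)]
        for b_row, w_row in zip(base, slope)
    ]


def solve_by_enumeration(design_cost, scenarios, gamma):
    """Try every binary design and return (best total cost, best design)."""
    best = None
    for x in product((0, 1), repeat=len(design_cost)):
        total = sum(d * xj for d, xj in zip(design_cost, x))
        for sc in scenarios:
            v = value_iteration(affine_costs(sc["base"], sc["slope"], x),
                                sc["trans"], gamma)
            # Weight each scenario's expected cost by its probability.
            total += sc["prob"] * sum(a * vs for a, vs in zip(sc["init"], v))
        if best is None or total < best[0]:
            best = (total, x)
    return best


# Hypothetical instance: state 0 = working, state 1 = failed.
# The single design variable lowers the repair cost from 10 to 4.
scenario = {
    "prob": 1.0,
    "init": [1.0, 0.0],                    # start in the working state
    "base": [[0.0], [10.0, 7.0]],          # failed: repair (10) or replace (7)
    "slope": [[[0.0]], [[-6.0], [0.0]]],   # design only affects repair cost
    "trans": [[[0.9, 0.1]], [[1.0, 0.0], [1.0, 0.0]]],
}
best_cost, best_x = solve_by_enumeration([2.0], [scenario], gamma=0.9)
print(best_x, round(best_cost, 4))
```

In this toy instance the design investment pays off only because it changes which operational action is optimal in the failed state, which is the kind of interaction an integrated model is meant to capture.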

Paper Structure (12 sections, 11 equations, 1 figure, 2 tables)

Introduction
MDP Design: An Integrated Framework for Design and Operations
Design Problem
Operational Problem
Bilevel Programming Formulation
MIP Formulation
Applications
Reliability
Inventory management
Queue Design and Control
Numerical Results
Conclusion

Figures (1)

Figure 1: Trends in average solve time (in seconds) over 100 instances upon varying the number of leader variables, leader constraints, scenarios, MDP states and MDP actions per state.
