Transition Uncertainties in Constrained Markov Decision Models: A Robust Optimization Approach

V Varagapriya

TL;DR

This work addresses CMDPs under uncertain transition probabilities by introducing a robust optimization framework with a generalized uncertainty set that blends polyhedral and second-order cone constraints. It proves that inner robust optimization problems can be reformulated as SOCPs, and, via strong duality, the overall robust CMDP can be represented as an SOCP with bilinear constraints, enabling solution with standard solvers. The approach is demonstrated on a machine replacement problem, showing how optimal values and policies respond to the uncertainty region and problem size, while also revealing computational challenges as the state space expands. Overall, the paper broadens robust CMDP modeling beyond rank-one uncertainty and provides a practical pathway to compute policies under complex transition-uncertainty structures.

Abstract

We examine a constrained Markov decision process under uncertain transition probabilities, with the uncertainty modeled as deviations from observed transition probabilities. We construct the uncertainty set associated with the deviations using polyhedral and second-order cone constraints and employ a robust optimization framework. We demonstrate that each inner optimization problem of the robust model can be equivalently transformed into a second-order cone programming problem. Using strong duality arguments, we show that the resulting robust problem can be equivalently reformulated into a second-order cone programming problem with bilinear constraints. In the numerical experiments, we study a machine replacement problem and explore potential sources of uncertainty in the transition probabilities. We examine how the optimal values and solutions differ as we vary the feasible region of the uncertainty set, considering only polyhedral constraints and a combination of polyhedral and second-order cone constraints. Furthermore, we analyze the impact of the number of states, the discount factor, and variations in the feasible region of the uncertainty set on the optimal values.

Paper Structure

This paper contains 7 sections, 7 theorems, 27 equations, 1 figure.

Key Result

Lemma 1

Let assumption_positive_gamma hold true. For a fixed $f \in F_S$, Inner_opt_with_c_combined is equivalent to the following SOCP problem. where $\bar{P}_f$ is a $\vert S \vert \times \vert S \vert$-dimensional matrix whose component associated with a transition from a state $s$ to $s'$ is defined by $\sum_{a \in A(s) } f(s, a) \bar{p} (s' \vert s,a)$. Additionally, $\mathfrak{z}_c = (\mathfrak{z}_
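The definition of $\bar{P}_f$ in the lemma can be illustrated with a minimal NumPy sketch. It assumes, for simplicity, that every action is available in every state (so $A(s)$ is the same for all $s$). The function name `policy_transition_matrix`, the array layout, and the two-state numbers are hypothetical choices made for illustration; they are not taken from the paper.

```python
import numpy as np


def policy_transition_matrix(f, p_bar):
    """Return the |S| x |S| matrix with entries sum_a f(s, a) * p_bar(s' | s, a).

    f     : array of shape (S, A); f[s, a] is the probability of action a in state s.
    p_bar : array of shape (S, A, S); p_bar[s, a, t] is the nominal (observed)
            probability of moving from state s to state t under action a.
    Assumption: all A actions are available in every state.
    """
    f = np.asarray(f, dtype=float)
    p_bar = np.asarray(p_bar, dtype=float)
    # Sum over the action index a for each (s, s') pair.
    return np.einsum("sa,sat->st", f, p_bar)


# Hypothetical two-state, two-action example (illustrative numbers only).
p_bar = np.array([
    [[0.8, 0.2], [1.0, 0.0]],  # state 0: action 0, action 1
    [[0.0, 1.0], [1.0, 0.0]],  # state 1: action 0, action 1
])
f = np.array([
    [1.0, 0.0],  # state 0: always action 0
    [0.5, 0.5],  # state 1: randomize evenly
])

P_f = policy_transition_matrix(f, p_bar)
print(P_f)
```

Because each row of `f` and each `p_bar[s, a, :]` is a probability distribution, every row of the resulting matrix also sums to one.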

Figures (1)

  • Figure 1: Lower and upper bounds of optimal values.

Theorems & Definitions (13)

  • Lemma 1
  • proof
  • Lemma 2
  • Theorem 1
  • proof
  • Remark 1
  • Lemma 3
  • proof
  • Lemma 4
  • Lemma 5
  • ...and 3 more