A dynamical neural network approach for distributionally robust chance constrained Markov decision process

Tian Xia; Jia Liu; Zhiping Chen

TL;DR

The paper tackles distributionally robust joint chance constrained MDPs under moment-based ambiguity by deriving a deterministic reformulation via a logarithmic transformation, resulting in a bi-convex problem in the variables $\tau$ and $h$. It then proposes a dynamical neural network (DNN) approach whose equilibrium corresponds to a KKT point, and proves global stability via a Lyapunov analysis. Compared with a sequential convex approximation (SCA) method, the DNN attains comparable optimality while offering continuous-time convergence and stronger out-of-sample robustness on a machine replacement problem. The work provides a blueprint for applying DNN solvers to nonconvex DRO-CCMDPs, and the approach could be extended to broader ambiguity sets and joint distributions beyond the current moment-based framework.

Abstract

In this paper, we study the distributionally robust joint chance constrained Markov decision process. Utilizing the logarithmic transformation technique, we derive its deterministic reformulation with bi-convex terms under the moment-based uncertainty set. To cope with the non-convexity and improve the robustness of the solution, we propose a dynamical neural network approach to solve the reformulated optimization problem. Numerical results on a machine replacement problem demonstrate the efficiency of the proposed dynamical neural network approach when compared with the sequential convex approximation approach.
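
To illustrate the general idea of a dynamical neural network whose equilibrium is a KKT point, here is a minimal sketch on a toy convex problem. It is not the paper's network or its J-DRCCMDP reformulation: the objective, the constraint, the function names (`grad_f`, `g`, `jac_g`, `dnn_rhs`, `integrate`), and the forward-Euler integration are all assumptions chosen for illustration. The toy problem is: minimize $(x_1-2)^2 + (x_2-1)^2$ subject to $x_1 + x_2 \le 2$.

```python
import numpy as np

# Toy problem (an assumption for illustration, not taken from the paper):
#   minimize (x1 - 2)^2 + (x2 - 1)^2   subject to   x1 + x2 - 2 <= 0

def grad_f(x):
    """Gradient of the toy objective."""
    return 2.0 * (x - np.array([2.0, 1.0]))

def g(x):
    """Inequality constraint values; feasible when g(x) <= 0."""
    return np.array([x[0] + x[1] - 2.0])

def jac_g(x):
    """Jacobian of g, shape (number of constraints, number of variables)."""
    return np.array([[1.0, 1.0]])

def dnn_rhs(x, lam):
    """Right-hand side of a projection-type primal-dual ODE.

    At an equilibrium, lam = max(lam + g(x), 0) and
    grad_f(x) + jac_g(x)^T lam = 0, which are the KKT conditions.
    """
    mu = np.maximum(lam + g(x), 0.0)
    dx = -(grad_f(x) + jac_g(x).T @ mu)
    dlam = -lam + mu
    return dx, dlam

def integrate(x0, lam0, dt=0.01, steps=10000):
    """Forward-Euler integration of the ODE (a simple stand-in for an ODE solver)."""
    x = np.array(x0, dtype=float)
    lam = np.array(lam0, dtype=float)
    for _ in range(steps):
        dx, dlam = dnn_rhs(x, lam)
        x, lam = x + dt * dx, lam + dt * dlam
    return x, lam

x_star, lam_star = integrate([0.0, 0.0], [0.0])
print("state:", x_star, "multiplier:", lam_star)
```

For this toy problem the KKT point is $x = (1.5, 0.5)$ with multiplier $\lambda = 1$, so the trajectory should settle there. The bi-convex problem treated in the paper is nonconvex, and its network and stability proof are more involved than this sketch.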

Paper Structure (19 sections, 10 theorems, 51 equations, 12 figures, 2 tables, 1 algorithm)

Introduction
Distributionally robust chance constrained MDP
MDP
Distributionally robust chance constrained MDP
Moment-based J-DRCCMDP
Reformulation of moment-based J-DRCCMDP
Sequential convex approximation algorithm for J-DRCCMDP
Dynamical neural network approach for J-DRCCMDP
KKT conditions
Neural network model
Stability analysis
Numerical experiments
Experimental setup
Optimal policy
Convergence quality
...and 4 more sections

Key Result

Lemma

The set of occupation measures corresponding to history-dependent policies is equal to the set where $\delta(s',s)$ is the Kronecker delta, such that the expected discounted value function defined by the MDP optimization problem remains invariant to time $t$.
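
For orientation, the characterization of discounted occupation measures that is standard in the constrained-MDP literature can be written as below. The notation is an assumption, since the set definition is not present in the text above: $\alpha \in (0,1)$ is the discount factor, $q$ the initial state distribution, $p(s' \mid s,a)$ the transition probability, $S$ the state space, and $\mathcal{K}$ the set of admissible state-action pairs. The paper's own statement may differ in normalization or symbols.

$$
\Delta_{\alpha,q} = \Big\{ \tau \in \mathbb{R}^{|\mathcal{K}|} \;:\; \sum_{(s,a)\in\mathcal{K}} \tau(s,a)\big(\delta(s',s) - \alpha\, p(s' \mid s,a)\big) = (1-\alpha)\, q(s') \;\; \forall s' \in S, \quad \tau(s,a) \ge 0 \;\; \forall (s,a) \in \mathcal{K} \Big\}
$$

Summing the equality over $s'$ gives $\sum_{(s,a)} \tau(s,a) = 1$ under this normalization, so each $\tau$ in the set is a probability distribution over state-action pairs.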

Figures (12)

Figure 1: Flowchart of the DNN approach for solving J-DRCCMDP
Figure 2: The structure of DNN model
Figure 3: The transition probabilities for the MDP
Figure 4: $\tau(1,a_1),\tau(1,a_2)$
Figure 5: $\tau(2,a_1),\tau(2,a_2)$
...and 7 more figures

Theorems & Definitions (21)

Lemma (varagapriya2022constrained, altman1999constrained)
Remark
Proposition
Proof
Remark
Lemma
Proof
Definition
Lemma (jiang2021partial)
Lemma (jiang2021partial)
...and 11 more
