Drift Control of High-Dimensional RBM: A Computational Method Based on Neural Networks

Baris Ata, J. Michael Harrison, Nian Si

TL;DR

This work develops a simulation-based deep learning method for drift control of high-dimensional reflected Brownian motion (RBM) in the positive orthant, motivated by heavy-traffic queueing network control problems. It derives two key stochastic identities that link the control problem to pathwise SDE representations, enabling a double parameterization in which neural networks approximate the value function and its gradient. The method handles both discounted and ergodic formulations: it trains the networks on discretized Skorokhod-problem paths and then extracts bang-bang or affine-rate policies. Across three problem families, including linear and quadratic costs and parallel-server networks, it is accurate to within a fraction of one percent in dimensions up to 30 (and up to 100 in some tests). The results show that high-dimensional RBM control is computationally feasible, with performance comparable to tailored benchmarks, and they point to future work on singular control and more general state spaces.

Abstract

Motivated by applications in queueing theory, we consider a stochastic control problem whose state space is the $d$-dimensional positive orthant. The controlled process $Z$ evolves as a reflected Brownian motion whose covariance matrix is exogenously specified, as are its directions of reflection from the orthant's boundary surfaces. A system manager chooses a drift vector $\theta(t)$ at each time $t$ based on the history of $Z$, and the cost rate at time $t$ depends on both $Z(t)$ and $\theta(t)$. In our initial problem formulation, the objective is to minimize expected discounted cost over an infinite planning horizon, after which we treat the corresponding ergodic control problem. Extending earlier work by Han et al. (Proceedings of the National Academy of Sciences, 2018, 8505-8510), we develop and illustrate a simulation-based computational method that relies heavily on deep neural network technology. For test problems studied thus far, our method is accurate to within a fraction of one percent, and is computationally feasible in dimensions up to at least $d=30$.
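To make the setting concrete, the following is a minimal sketch of how a drift-controlled RBM path and its discounted cost might be simulated on a time grid. It is not the authors' implementation. It assumes independent coordinates, an identity reflection matrix (so the one-step Skorokhod map reduces to clamping at zero), and a sign convention in which the chosen drift pushes the state toward the origin. The names `simulate_rbm_path`, `discounted_cost`, `drift_policy`, and `cost_rate` are hypothetical.

```python
import math
import random


def simulate_rbm_path(z0, drift_policy, sigma, h, n_steps, rng):
    """Projected Euler scheme for an RBM in the positive orthant.

    Assumptions (not from the source): independent coordinates, reflection
    matrix R = I, and the drift theta enters with a negative sign.
    """
    z = list(z0)
    path = [tuple(z)]
    sqrt_h = math.sqrt(h)
    for _ in range(n_steps):
        theta = drift_policy(z)
        for i in range(len(z)):
            increment = -theta[i] * h + sigma[i] * sqrt_h * rng.gauss(0.0, 1.0)
            # With R = I, reflection at the boundary is a componentwise clamp.
            z[i] = max(z[i] + increment, 0.0)
        path.append(tuple(z))
    return path


def discounted_cost(path, drift_policy, cost_rate, r, h):
    """Left-endpoint Riemann sum of the discounted running cost on one path."""
    total = 0.0
    for k, z in enumerate(path[:-1]):
        theta = drift_policy(z)
        total += math.exp(-r * k * h) * cost_rate(z, theta) * h
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical threshold policy: push at rate 1 when a coordinate exceeds 0.5.
    policy = lambda z: [1.0 if zi > 0.5 else 0.0 for zi in z]
    path = simulate_rbm_path([1.0, 1.0], policy, [1.0, 1.0], 0.01, 1000, rng)
    cost = discounted_cost(path, policy, lambda z, th: sum(z) + sum(th), 0.1, 0.01)
    print(len(path), cost)
```

A general reflection matrix would require solving a discrete Skorokhod problem at each step rather than clamping, and averaging `discounted_cost` over many independent paths would give a Monte Carlo estimate of a policy's expected discounted cost over the simulated horizon.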

Paper Structure (35 sections, 9 theorems, 204 equations, 8 figures, 13 tables, 4 algorithms)

Key Result

Proposition 1

Under any policy $u$ and for any integer $n=1,2,\ldots$ the function has polynomial growth in $t$ for each fixed $z \in \mathbb{R}_+^d$.

Figures (8)

  • Figure 1: A feedforward queueing network with thin arrival streams.
  • Figure 2: A decomposable parallel-server queueing network.
  • Figure 3: Comparison between the derivative $G_w(\cdot)$ learned from neural networks and the derivative of the optimal value function for the case of $d=1$ and $r=0.1$. The dotted line indicates the cost $c_0=1$. When the value function gradient is above this dotted line, the optimal control is $\theta = b$, and otherwise it is $\theta = 0$.
  • Figure 4: Graphical representation of the policy learned from neural networks and the benchmark policy for the case $b=2$, $d=2$, and $r=0.1$.
  • Figure 5: Graphical representation of the policy learned from neural networks and the benchmark policy for the case $b=10$, $d=2$, and $r=0.1$.
  • ...and 3 more figures
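
The threshold rule described in the Figure 3 caption can be sketched in a few lines. The caption states the rule for $d=1$; applying it coordinate by coordinate, as below, is an assumption, and the names `bang_bang_drift` and `toy_gradient` are hypothetical stand-ins (the latter for a learned gradient such as $G_w$).

```python
def bang_bang_drift(grad, c0, b):
    """Return drift b in each coordinate whose gradient exceeds c0, else 0.

    Assumption (not from the source): the one-dimensional rule in the
    Figure 3 caption is applied independently to every coordinate.
    """
    return [b if g > c0 else 0.0 for g in grad]


def toy_gradient(z):
    """Hypothetical stand-in for a learned value-function gradient."""
    return [2.0 * zi for zi in z]


if __name__ == "__main__":
    # Gradient is [0.5, 1.5]; only the second coordinate exceeds c0 = 1.
    print(bang_bang_drift(toy_gradient([0.25, 0.75]), c0=1.0, b=2.0))
```

A gradient exactly equal to `c0` maps to zero drift here, matching the caption's "above this dotted line" wording.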

Theorems & Definitions (20)

  • Proposition 1
  • Proposition 2
  • proof
  • Proposition 3
  • Remark 1
  • proof : Proof Sketch for Proposition \ref{thm:doublepara:dis}
  • Proposition 4
  • proof
  • Proposition 5
  • proof : Proof Sketch
  • ...and 10 more