Statistical Learning for Probability-Constrained Stochastic Optimal Control

Alessandro Balata, Michael Ludkovski, Aditya Maheshwari, Jan Palczewski

TL;DR

The results indicate that using logistic or Gaussian process regression to estimate the admissibility probability outperforms the other options and offers an efficient and reliable extension of RMC to probability-constrained control.

Abstract

We investigate Monte Carlo-based algorithms for solving stochastic control problems with probabilistic constraints. Our motivation comes from microgrid management, where the controller tries to optimally dispatch a diesel generator while maintaining a low probability of blackouts. The key question we investigate is how to construct empirical simulation procedures for learning the admissible control set, which is specified implicitly through a probability constraint on the system state. We propose a variety of relevant statistical tools, including logistic regression, Gaussian process regression, quantile regression and support vector machines, which we then incorporate into an overall Regression Monte Carlo (RMC) framework for approximate dynamic programming. Our results indicate that using logistic or Gaussian process regression to estimate the admissibility probability outperforms the other options. Our algorithms offer an efficient and reliable extension of RMC to probability-constrained control. We illustrate our findings with two case studies for the microgrid problem.
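To make the idea of learning an admissible control set from simulations concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical one-step toy model (the function `simulate_violation`, the noise level `sigma`, the sampling ranges and the 5% threshold are all illustrative choices), fits a logistic regression by Newton iterations in plain NumPy to estimate the violation probability as a function of net demand `L`, inventory `I` and control `u`, and then reads off the smallest control on a grid whose estimated violation probability is at most `p`.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_violation(L, I, u, sigma=1.0):
    # Hypothetical one-step toy dynamics (an assumption, not the paper's model):
    # a violation (label 1) occurs when the shocked net demand exceeds
    # diesel output plus stored energy.
    shock = rng.normal(0.0, sigma, size=np.shape(L))
    return (L + shock - u - I > 0).astype(float)

def design(X):
    # Prepend an intercept column to the feature matrix.
    return np.column_stack([np.ones(len(X)), X])

def predict_proba(beta, X):
    # Logistic model for the violation probability; clip to avoid overflow.
    z = np.clip(design(X) @ beta, -30.0, 30.0)
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, n_iter=25, ridge=1e-6):
    # Newton (IRLS) iterations for the logistic log-likelihood,
    # with a tiny ridge term for numerical stability.
    Xb = design(X)
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = predict_proba(beta, X)
        w = p * (1.0 - p)
        H = Xb.T @ (Xb * w[:, None]) + ridge * np.eye(Xb.shape[1])
        g = Xb.T @ (y - p) - ridge * beta
        beta = beta + np.linalg.solve(H, g)
    return beta

def u_min(beta, L, I, p=0.05, u_grid=np.linspace(0.0, 6.0, 121)):
    # Smallest grid control whose estimated violation probability is <= p;
    # returns NaN if no control on the grid is estimated to be admissible.
    X = np.column_stack([np.full_like(u_grid, L), np.full_like(u_grid, I), u_grid])
    ok = predict_proba(beta, X) <= p
    return float(u_grid[np.argmax(ok)]) if ok.any() else float("nan")

# Training design: sample states and controls, simulate one binary label each.
M = 5000
L = rng.uniform(-2.0, 6.0, M)
I = rng.uniform(0.0, 4.0, M)
u = rng.uniform(0.0, 6.0, M)
y = simulate_violation(L, I, u)

beta = fit_logistic(np.column_stack([L, I, u]), y)
print("estimated minimum admissible control at L=3, I=1:", u_min(beta, 3.0, 1.0))
```

In this sketch the estimated admissible set at a state `(L, I)` is every grid control at or above `u_min(beta, L, I)`. Within an RMC scheme, the cost minimisation at each time step would then be restricted to that set; the other estimators named in the abstract (Gaussian process, quantile regression, support vector machines) would replace `fit_logistic` and `predict_proba` here.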

Paper Structure

This paper contains 28 sections, 66 equations, 6 figures, 7 tables, 1 algorithm.

Figures (6)

  • Figure 1: Left panel: Microgrid topology: the load, the diesel generator, the battery and the renewables. Right panel: Contour plot for minimum admissible diesel output $(L,I,C) \mapsto u^{\min}_n(L,I,C)$ (see Remark \ref{def:umin}). For $L < 0$, the constraint is not binding and $u^{\min}_n(L,I,C) = 0$. As demand increases, the constraint becomes more stringent, i.e. $u^{\min}_n(L,I,C)$ increases in $L$. The red curve represents a path of the controlled demand-inventory-regime triple $(L^{u^*}_n, I^{u^*}_n, C^{u^*}_n)$ following a myopic strategy that chooses the minimum admissible control $u_n(L_n,I_n,C_n) = u^{\min}_n(L_n,I_n,C_n)$. The regime $C$ can be visualised by observing when the red line crosses to the right-hand side of the first contour line, indicating that the diesel generator should be turned on.
  • Figure 1: Training data and fitted models for the methods of Section \ref{sec:admissibleSetEstimation} at $u=0$. Top row: probability estimation schemes, bottom row: quantile estimation schemes. Top/left panel: Training set $\{L^i,I^i,y^i \}_{i=1}^{M_a}$ for the LR model, color-coded according to the value of $y^i \in \{0,1\}$, along with the estimated contours for $\hat{p}_{LR}(L,I)$ at levels $\{1\%, 5\%, 10\% \}$. Top/center: Training set $\{L^i, I^i, \bar{p}^i \}_{i=1}^{M_a}$ color-coded according to $\bar{p}^i$ for GPR along with the contour $\{\hat{p}_{GPR}(L,I) = 5\%\}$. Top/right: parametric density fitting at $L_0 = 5.5, I_0 = 1.48$ and $u \in \{0,1\}$. We show the empirical and fitted inverse cdf $\mathbb{P}(G' > g)$ based on a truncated Gaussian distribution. Bottom/left: Training set $\{L^i,I^i,y^i \}_{i=1}^{M_a}$ for SVM (color-coded according to $y^i \in \{-1,1\}$) and the decision boundary in red. Bottom/center: Training set $\{L^i,I^i,\bar{q}^i \}_{i=1}^{M_a}$ color-coded according to $\bar{q}^i$ for EP and the contour $\{\hat{q} = 0\}$. Bottom/right: Training set $\{L^i,I^i,g^i \}_{i=1}^{M_a}$ color-coded according to $g^i$ for QR along with the contour $\{\hat{q}_{QR}(L,I) = 0\}$. All models share the same ground truth, so the red contours are identical up to model-specific estimation errors.
  • Figure 1: Left panel: Trade-off between cost $\hat{V}_0(0,5)$ and frequency of inadmissible decisions $w_{freq}$ for the stationary model. Dark blue points correspond to the $p=5\%$ probability constraint threshold and light grey ones to $p=1\%$. Center: Total cost $\hat{V}_0(0,5)$ (left axis, blue stars) and realized frequency of violations $w_{rlzd}$ (right axis, red circles) as functions of $p$, employing the LR model. Right: Locations $(L,I)$ of realized violations ${\sup_{s \in [t_n,t_{n+1}) }S^{m'}(s)>0}$ (red triangles) and inadmissible decisions ${ \hat{u}(n,\mathbf{x}_n^{\hat{u},m'}) - u^{\min}_n(\mathbf{x}_n^{\hat{u},m'}) <0 }$ (circles with color representing the inadmissibility margin) on 5000 out-of-sample simulations using the LR model. The constraint is binding in the white region and is not binding in the grey region.
  • Figure 2: Impact of the margin of error $\xi(\cdot, \cdot)$ on minimum admissible control $\hat{u}^{\min}$. We plot the difference between minimum admissible control for scenario 2 ($\hat{u}^{\min}(\cdot,I; \xi^{(0.95)}(L,I) )$) and scenario 3 ($\hat{u}^{\min}(\cdot,I; \xi = 4\% )$) with respect to scenario 1 ($\hat{u}^{\min}(\cdot,I; \xi = 0\% )$) using LR (left panel) and QR (right panel) models.
  • Figure 3: Model parameters, average trajectory of the state variables, control and their variance. Left panel: Average values of net demand $\frac{1}{M'}\sum_{m'=1}^{M'}L_n^{\hat{u},m'}$, inventory $\frac{1}{M'}\sum_{m'=1}^{M'}I_n^{\hat{u},m'}$ and optimal control (diesel) $\frac{1}{M'}\sum_{m'=1}^{M'}\hat{u}_n^{m'}$ processes using the gold standard strategy. Right panel: 95% confidence bands for net demand $L_n^{\hat{u}}$ and realized optimal diesel control $\hat{u}_n$. Net demand and diesel output are measured in kW and inventory in kWh.
  • ...and 1 more figure

Theorems & Definitions (15)

  • Remark 2.1
  • Remark 2.2
  • Remark 2.3
  • Remark 2.4
  • Remark 3.1
  • Remark 3.2
  • Remark 3.3
  • Remark 3.4
  • Remark 3.5
  • Remark 3.6
  • ...and 5 more