Computing Optimal Joint Chance Constrained Control Policies

Niklas Schmid; Marta Fochesato; Sarah H. Q. Li; Tobias Sutter; John Lygeros

TL;DR

This work augments the dynamics with a binary state, which makes it possible to characterize the optimal policies and to develop a dynamic-programming-based solution method for optimally controlling stochastic Markovian systems subject to joint chance constraints over a finite time horizon.

Abstract

We consider the problem of optimally controlling stochastic, Markovian systems subject to joint chance constraints over a finite-time horizon. For such problems, standard Dynamic Programming is inapplicable due to the time correlation of the joint chance constraints, which calls for non-Markovian, and possibly stochastic, policies. Hence, despite the popularity of this problem, solution approaches capable of providing provably-optimal and easy-to-compute policies are still missing. We fill this gap by augmenting the dynamics via a binary state, allowing us to characterize the optimal policies and develop a Dynamic Programming based solution method.
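To make the binary-state idea concrete, here is a minimal Python sketch. It is not the authors' implementation: it assumes a finite-state Markov chain under a fixed policy, and the names `P`, `safe`, `joint_safety_augmented`, and `joint_safety_bruteforce` are hypothetical. It also assumes the binary state is updated as $b_{k+1} = b_k \cdot \mathbb{1}[x_{k+1} \text{ safe}]$, so that $b_N = 1$ exactly when the whole trajectory stayed safe. Under that assumption the joint safety probability becomes a terminal expectation that a Markovian backward recursion can compute, which the snippet checks against brute-force enumeration of trajectories.

```python
import itertools
import numpy as np

def joint_safety_augmented(P, safe, x0, N):
    """P(x_0, ..., x_N all safe) via backward recursion on the augmented state (x, b)."""
    n = P.shape[0]
    # V[x, b]: probability that b_N = 1, given augmented state (x, b) at the current step.
    V = np.zeros((n, 2))
    V[:, 1] = 1.0  # terminal condition V_N(x, b) = b
    for _ in range(N):
        V_next = V
        V = np.zeros((n, 2))
        for x in range(n):
            for b in (0, 1):
                # Assumed binary-state update: b' = b * 1[x' is safe].
                V[x, b] = sum(P[x, xn] * V_next[xn, b * int(safe[xn])]
                              for xn in range(n))
    return V[x0, int(safe[x0])]

def joint_safety_bruteforce(P, safe, x0, N):
    """Same probability, obtained by summing over every length-N trajectory."""
    total = 0.0
    for tail in itertools.product(range(P.shape[0]), repeat=N):
        traj = (x0,) + tail
        if all(safe[x] for x in traj):
            prob = 1.0
            for a, c in zip(traj[:-1], traj[1:]):
                prob *= P[a, c]
            total += prob
    return total

# Toy chain (illustrative numbers only): state 2 is unsafe and absorbing.
P = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.6, 0.2],
              [0.0, 0.0, 1.0]])
safe = np.array([True, True, False])
p_aug = joint_safety_augmented(P, safe, x0=0, N=3)
p_brute = joint_safety_bruteforce(P, safe, x0=0, N=3)
```

The recursion touches each augmented state once per time step, whereas the enumeration grows exponentially in the horizon.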

Paper Structure (15 sections, 10 theorems, 44 equations, 9 figures, 1 algorithm)

Introduction
Preliminaries and Problem Formulation
Joint Chance Constr. Dynamic Programming
Lagrangian Dual Framework
Bilevel Framework
Attainability of Optimal Deterministic Policies
Attainability of Optimal Mixed Policies
Equivalence of Problems
Algorithmic Solution
Feasibility Check and Boundary Solutions
Bisection Algorithm
Numerical Example
Conclusions and future works
Appendix
Duality theory

Key Result

Theorem III.1

(Attainability of Deterministic Markov Policies) Given a fixed $\lambda \in \mathbb{R}_{\geq 0}$, there exists a measurable deterministic Markov policy that attains the infimum in \ref{eq_our_dp_recursion_innerdual} at every time step $k\in[N]$ and is also an optimal solution to Problem \ref{eq_inner_problem_d}.
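The following Python sketch illustrates what such a recursion could look like on a finite MDP. The recursion referenced in the theorem is not reproduced on this page, so the form used here is an assumption: the inner problem is taken to be $\min_\pi C^\pi + \lambda(\alpha - V^\pi)$ (consistent with the term $\lambda(\alpha - V_0^\pi)$ in the caption of Figure 4), the constant $\lambda\alpha$ is dropped, the terminal cost is zero, and the binary state evolves as $b' = b \cdot \mathbb{1}[x' \text{ safe}]$. All names (`lagrangian_dp`, `P`, `cost`, `safe`) are hypothetical.

```python
import numpy as np

def lagrangian_dp(P, cost, safe, N, lam):
    """Backward recursion for a fixed multiplier lam on the augmented state (x, b).

    P[u, x, x'] are transition probabilities, cost[x, u] is the stage cost,
    safe[x] marks safe states. Returns the greedy deterministic Markov policy
    policy[k, x, b] together with its cost-to-go C[x, b] and safety-to-go V[x, b]
    at time 0.
    """
    n_u, n_x, _ = P.shape
    s = np.asarray(safe).astype(int)
    idx = np.arange(n_x)
    C = np.zeros((n_x, 2))        # assumed zero terminal cost
    V = np.zeros((n_x, 2))
    V[:, 1] = 1.0                 # terminal safety V_N(x, b) = b
    policy = np.zeros((N, n_x, 2), dtype=int)
    for k in reversed(range(N)):
        C_new, V_new = np.empty_like(C), np.empty_like(V)
        for x in range(n_x):
            for b in (0, 1):
                b_next = b * s                            # next binary state per x'
                qC = cost[x] + P[:, x, :] @ C[idx, b_next]  # one entry per action
                qV = P[:, x, :] @ V[idx, b_next]
                u = int(np.argmin(qC - lam * qV))         # minimize cost minus lam * safety
                policy[k, x, b] = u
                C_new[x, b], V_new[x, b] = qC[u], qV[u]
        C, V = C_new, V_new
    return policy, C, V

# Toy MDP (illustrative numbers only): state 0 is safe, state 1 is unsafe and absorbing.
# Action 0 is free but leaves the safe state with probability 0.5; action 1 costs 1 and is safe.
P = np.array([[[0.5, 0.5], [0.0, 1.0]],
              [[1.0, 0.0], [0.0, 1.0]]])
cost = np.array([[0.0, 1.0], [0.0, 1.0]])
safe = np.array([True, False])
_, C_cheap, V_cheap = lagrangian_dp(P, cost, safe, N=2, lam=0.0)
_, C_safe, V_safe = lagrangian_dp(P, cost, safe, N=2, lam=100.0)
```

With `lam=0.0` the recursion ignores safety and picks the free action; with a large `lam` it pays for the safe action. Tracking `C` and `V` separately, rather than only their weighted sum, keeps both performance values available for an outer search over `lam`.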

Figures (9)

Figure 1: Graphical representation of the paper structure.
Figure 2: An illustration of an objective function for the outer maximization in Problem \ref{eq_lagrange_dual_over_stochasticCausal} (inspired by Ono_2). The policy $\pi_{\lambda}$ denotes an optimal argument of the inner minimization of Problem \ref{eq_lagrange_dual_over_stochasticCausal} for a given $\lambda$ (assuming it exists), and $C_0^{\pi_{\lambda}}(\Tilde{x}_0),V_0^{\pi_{\lambda}}(\Tilde{x}_0)$ denote the cost and safety associated with that policy.
Figure 3: A Performance Set $P_{\Pi_{\text{mix}},\Tilde{x}_0}$ and Pareto front $P_{\Pi_{\text{mix}},\Tilde{x}_0}^{\star}$.
Figure 4: In the left plot, we can choose from policies which have a safety arbitrarily close to $\alpha$ and a control cost of $C$ or a policy that attains $\alpha$ but at cost $C+\delta$, $\delta>0$. Then, for any $\lambda$, there always exists a policy $\pi$ with safety close enough to $\alpha$ such that it is not optimal to incur the additional cost $\delta$, i.e., $\lambda(\alpha-V^{\pi}_0)<\delta$. The right plot depicts a similar border case.
Figure 5: Performance set $P_{\Pi_{\text{d}},\Tilde{x}_0}$ (bordered set) and its convex hull $P_{\Pi_{\text{mix}},\Tilde{x}_0}$ (grey set), as well as the performance of the respective policies $\pi_{\underline{\lambda}},\pi_{\overline{\lambda}}$ (black stars), the performance of all mixed policies that can be constructed from $\pi_{\underline{\lambda}}$ and $\pi_{\overline{\lambda}}$ (red line), and the optimal interpolation $\pi_{\text{mix}}$ according to equation \ref{eq_stochastic_policies_ratios} (red star). The variable $\lambda$ sets the optimization direction, and the DP recursion returns the policy in the performance set that is optimal in this direction. Over the outer-loop iterations, $\underline{\lambda}$ approaches $\overline{\lambda}$.
...and 4 more figures
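
The outer loop described in the caption of Figure 5 can be sketched in Python as follows. This is a hedged illustration, not the paper's algorithm: `solve` is a hypothetical stand-in for the DP recursion that returns the (cost, safety) pair of a policy optimal for a given $\lambda$, the toy performance set `points` is invented, and the mixing probability uses plain linear interpolation of the two safety values, which is an assumption because the referenced mixing-ratio equation is not reproduced on this page.

```python
def bisect_and_mix(solve, alpha, lam_max=1e3, tol=1e-9):
    """Bisection on lam, then mixing of the two bracketing policies.

    solve(lam) -> (cost, safety) of a policy minimizing cost - lam * safety.
    Returns (lam_lo, lam_hi, p_hi, mixed_cost, mixed_safety), where p_hi is
    the probability of playing the policy found at lam_hi.
    """
    lo, hi = 0.0, lam_max
    c_lo, v_lo = solve(lo)
    if v_lo >= alpha:
        # The unconstrained optimum already satisfies the safety level.
        return lo, lo, 1.0, c_lo, v_lo
    c_hi, v_hi = solve(hi)
    if v_hi < alpha:
        raise ValueError("no policy with safety >= alpha found for lam <= lam_max")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        c, v = solve(mid)
        if v >= alpha:
            hi, c_hi, v_hi = mid, c, v
        else:
            lo, c_lo, v_lo = mid, c, v
    # v_hi >= alpha > v_lo, so the denominator is strictly positive.
    p_hi = (alpha - v_lo) / (v_hi - v_lo)
    mixed_cost = p_hi * c_hi + (1.0 - p_hi) * c_lo
    mixed_safety = p_hi * v_hi + (1.0 - p_hi) * v_lo
    return lo, hi, p_hi, mixed_cost, mixed_safety

# Toy performance set of deterministic policies as (cost, safety) pairs (invented numbers).
points = [(0.0, 0.2), (1.0, 0.6), (3.0, 0.9), (2.5, 0.5)]

def solve(lam):
    # Stand-in for the DP recursion: best point in the direction set by lam.
    return min(points, key=lambda cv: cv[0] - lam * cv[1])

lam_lo, lam_hi, p_hi, mixed_cost, mixed_safety = bisect_and_mix(solve, alpha=0.75)
```

In this toy instance no deterministic policy attains safety 0.75 exactly, so the two multipliers close in on the slope of the hull edge between (1.0, 0.6) and (3.0, 0.9), and the mixture lands on that edge at the required safety level.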

Theorems & Definitions (21)

Theorem III.1
proof
Definition III.2
Proposition III.3
proof
Corollary III.4
Lemma III.5: Attainability of Mixed Policies
proof
Corollary III.6
proof
...and 11 more
