Dual Ensemble Kalman Filter for Stochastic Optimal Control

Anant A. Joshi; Amirhossein Taghvaei; Prashant G. Mehta; Sean P. Meyn

TL;DR

The main contribution is a simulation-based algorithm, the dual ensemble Kalman filter (EnKF), that numerically approximates the solution of stochastic optimal control problems in continuous time and space.

Abstract

In this paper, stochastic optimal control problems in continuous time and space are considered. In recent years, such problems have received renewed attention through the lens of reinforcement learning (RL), which is also one of our motivations. The main contribution is a simulation-based algorithm, the dual ensemble Kalman filter (EnKF), to numerically approximate the solution of these problems. The paper extends our previous work, in which the dual EnKF was applied in deterministic settings of the problem. The theoretical results and algorithms are illustrated with numerical experiments.

Paper Structure (18 sections, 2 theorems, 35 equations, 2 figures, 6 tables)

Introduction
Contribution of this paper
Relationship to literature
Organisation of paper
Problem Formulation
Deterministic Optimal Control (DOC)
Stochastic Optimal Control (SOC)
Risk Sensitive Control (RSC)
Linear Quadratic (LQ) Control
Literature survey
Proposed methodology
Transforming value function to probability density
Mean-field process
LQ setting
Gaussian Approximation
...and 3 more sections

Key Result

Proposition 1

Consider the mean-field process eq:Ybar. Suppose $\text{Cov}(\eta)$ is selected according to Table tb:soln, $\mathcal{I}$ and $\mathcal{C}$ satisfy the PDEs where $\mathcal{V}_t(\cdot)$, $h$ also appear in Table tb:soln, and $\hat{h}_t \coloneqq \int \bar{p}_t(z) h_t(z) \mathrm{d} z$. Then, where $\bar{p}_t$ is the probability density function of ${Y}_t$ and $p_t$ is defined in eq:pt in terms of
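
The quantity $\hat{h}_t$ in the proposition is an expectation of $h_t$ under the density $\bar{p}_t$. In a simulation-based method, such an integral would typically be replaced by an average over an ensemble of particles. The sketch below illustrates only that averaging step. The scalar Gaussian ensemble, the quadratic `h`, and the helper name `ensemble_mean_of` are hypothetical stand-ins chosen for illustration; they are not taken from the paper, whose actual `h` is specified in Table tb:soln.

```python
import random


def ensemble_mean_of(h, particles):
    """Empirical estimate of the integral of p(z) h(z) dz, given samples from p."""
    return sum(h(z) for z in particles) / len(particles)


# Hypothetical toy setup (an assumption, not the paper's model):
# particles drawn from N(mean=1.0, std=0.5) and a quadratic function h.
rng = random.Random(0)
particles = [rng.gauss(1.0, 0.5) for _ in range(20000)]


def h(z):
    return 0.5 * z * z


h_hat = ensemble_mean_of(h, particles)

# For this toy case the integral has a closed form:
# E[0.5 * Z^2] = 0.5 * (mean^2 + variance) = 0.5 * (1.0 + 0.25).
exact = 0.5 * (1.0 + 0.25)
print(f"ensemble estimate: {h_hat:.4f}, closed form: {exact:.4f}")
```

The ensemble estimate approaches the closed-form value as the number of particles grows, with an error that shrinks roughly like the inverse square root of the ensemble size.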

Figures (2)

Figure 1: Performance of all three algorithms on inverted pendulum on cart. The task is to stabilise the system state at $x=0$ and $\theta = \pi$.
Figure 2: Performance of all three algorithms on spring mass damper. The dashed lines represent the solutions of the respective AREs, and the solid lines are the solutions obtained by running the EnKF algorithm.

Theorems & Definitions (5)

Proposition 1
proof
Proposition 2
proof
Remark 1
