Dual Ensemble Kalman Filter for Stochastic Optimal Control

Anant A. Joshi, Amirhossein Taghvaei, Prashant G. Mehta, Sean P. Meyn

TL;DR

The main contribution is a simulation-based algorithm – dual ensemble Kalman filter (EnKF) – to numerically approximate the solution of stochastic optimal control problems in continuous time and space.

Abstract

This paper considers stochastic optimal control problems in continuous time and space. In recent years, such problems have received renewed attention through the lens of reinforcement learning (RL), which is also one of our motivations. The main contribution is a simulation-based algorithm, the dual ensemble Kalman filter (EnKF), that numerically approximates the solution of these problems. The paper extends our previous work, in which the dual EnKF was applied to deterministic settings of the problem. The theoretical results and algorithms are illustrated with numerical experiments.

Paper Structure

This paper contains 18 sections, 2 theorems, 35 equations, 2 figures, and 6 tables.

Key Result

Proposition 1

Consider the mean-field process eq:Ybar. Suppose $\text{Cov}(\eta)$ is selected according to Table tb:soln, $\mathcal{I}$ and $\mathcal{C}$ satisfy the PDEs where $\mathcal{V}_t(\cdot)$, $h$ also appear in Table tb:soln, and $\hat{h}_t \coloneqq \int \bar{p}_t(z) h_t(z) \mathrm{d} z$. Then, where $\bar{p}_t$ is the probability density function of ${Y}_t$ and $p_t$ is defined in eq:pt in terms of

Figures (2)

  • Figure 1: Performance of all three algorithms on the inverted pendulum on a cart. The task is to stabilise the system state at $x=0$ and $\theta = \pi$.
  • Figure 2: Performance of all three algorithms on the spring-mass-damper system. The dashed lines represent the solutions of the respective AREs, and the solid lines are the solutions obtained by running the EnKF algorithm.
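
For readers unfamiliar with the ARE baseline mentioned in the Figure 2 caption, the following is a minimal sketch of how an algebraic Riccati equation can be solved for a spring-mass-damper model. It is not the paper's code: the parameter values `m`, `k`, `c`, the cost weights `Q` and `R`, and the helper name `solve_care` are all assumptions chosen for illustration, and the sketch uses the standard LQR form of the ARE, which may differ from the specific AREs used in the paper's experiments.

```python
import numpy as np

def solve_care(A, B, Q, R):
    """Solve A'P + PA - P B R^{-1} B' P + Q = 0 (standard LQR ARE).

    Uses the stable invariant subspace of the Hamiltonian matrix.
    Hypothetical helper for illustration, not taken from the paper.
    """
    n = A.shape[0]
    G = B @ np.linalg.solve(R, B.T)          # B R^{-1} B'
    H = np.block([[A, -G], [-Q, -A.T]])      # Hamiltonian matrix
    eigvals, eigvecs = np.linalg.eig(H)
    stable = eigvecs[:, eigvals.real < 0]    # eigenvectors of stable eigenvalues
    X1, X2 = stable[:n, :], stable[n:, :]
    P = np.real(X2 @ np.linalg.inv(X1))      # P = X2 X1^{-1}
    return 0.5 * (P + P.T)                   # symmetrise against round-off

# Assumed spring-mass-damper parameters (illustrative values only).
m, k, c = 1.0, 1.0, 0.5
A = np.array([[0.0, 1.0], [-k / m, -c / m]])  # state: (position, velocity)
B = np.array([[0.0], [1.0 / m]])
Q = np.eye(2)                                 # assumed state cost
R = np.array([[1.0]])                         # assumed control cost

P = solve_care(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)               # LQR gain, u = -K x
residual = A.T @ P + P @ A - P @ B @ K + Q    # should be numerically zero
```

Under these assumptions, `P` plays the role of a dashed-line reference of the kind the caption describes: a simulation-based method can be judged by how closely its estimate approaches such a solution.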

Theorems & Definitions (5)

  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • Remark 1