Extended Kalman filter -- Koopman operator for tractable stochastic optimal control

Mohammad S. Ramadan; Mihai Anitescu

TL;DR

This work reframes stochastic optimal control (SOC) under partial observability by using the Koopman operator to transform the uncertainty propagation, as given by the extended Kalman filter, into a lifted linear-quadratic problem solvable as an LQR. A certainty-equivalence surrogate yields deterministic dynamics for the lifted state, which can be learned from data using extended dynamic mode decomposition (eDMD), enabling a tractable SOC-LQR controller. The method is demonstrated on a nonlinear system with varying observability, where it improves on certainty-equivalence control in both cost and estimation accuracy. The results highlight the practical potential of Koopman-based control for complex SOC settings, while noting sensitivity to the choice of uncertainty model and dictionary functions.
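As a rough illustration of the certainty-equivalence surrogate described above, the sketch below propagates an extended Kalman filter mean and variance with the innovation set to zero, so the pair (estimate, variance) evolves deterministically given the control. The scalar system, the noise variances `W` and `V`, and the function names are all hypothetical choices made for this example; they are not taken from the paper, and the paper's own surrogate may differ in detail.

```python
import numpy as np

# Hypothetical scalar system, chosen only for illustration (not from the paper):
#   x_{k+1} = 0.9 * x_k + u_k + w_k,    y_k = x_k**2 + v_k
# The measurement Jacobian 2*x vanishes at x = 0, so observability varies with the state.
W, V = 0.1, 0.1  # assumed process and measurement noise variances


def f(x, u):
    """State transition without noise."""
    return 0.9 * x + u


def f_jac(x, u):
    """Derivative of f with respect to x."""
    return 0.9


def g_jac(x):
    """Derivative of the measurement map x**2 with respect to x."""
    return 2.0 * x


def ce_information_step(x_hat, P, u):
    """One EKF step with the innovation set to zero.

    With zero innovation the mean update reduces to the prediction, and the
    variance update depends only on (x_hat, P, u), not on the measurement.
    """
    x_pred = f(x_hat, u)
    F = f_jac(x_hat, u)
    P_pred = F * P * F + W                  # time update of the variance
    H = g_jac(x_pred)                       # linearize the measurement at the prediction
    K = P_pred * H / (H * P_pred * H + V)   # Kalman gain
    P_new = (1.0 - K * H) * P_pred          # measurement update of the variance
    return x_pred, P_new


# Staying at the origin gives no information; moving away from it shrinks the variance.
_, P_stay = ce_information_step(0.0, 1.0, 0.0)
_, P_move = ce_information_step(0.0, 1.0, 1.0)
print(P_stay, P_move)
```

In this toy setting the control influences the future variance through the state-dependent Jacobian, which is the kind of coupling a controller acting on the (estimate, variance) pair can exploit.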

Abstract

The theory of dual control was introduced more than seven decades ago. Although it has provided rich insights to the fields of control, estimation, and system identification, dual control is generally computationally prohibitive. In recent years, however, the use of Koopman operator theory for control applications has been emerging. This paper presents a new reformulation of the stochastic optimal control problem that, employing the Koopman operator, yields a standard LQR problem with the dual control as its solution. We provide a numerical example that demonstrates the effectiveness of the proposed approach compared with certainty equivalence control, when applied to systems with varying observability.

Paper Structure (11 sections, 6 theorems, 21 equations, 4 figures, 1 table)

Introduction
Problem Formulation
Methodology
Equivalent description to the cost
Evolution of the central moments
Certainty equivalence of the information state
Structure of $T_\pi$
Toward the standard LQR form
Koopman and eDMD for control
Numerical Example
Conclusion

Key Result

Lemma 1

The term $\mathbb{E}\, x_k^\top Q x_k$ can be expressed by … (The expectation on the left can be expressed in its integral form as $\mathbb{E}\{\cdot\} = \int \cdot\, p(x_k)\, dx_k$ with respect to the density function $p(x_k)$. By the Markov property, $p(x_k)=p(x_0)\,p(x_1\mid x_0)\cdots p(x_k\mid x_{k-1})$, where the initial density $p(x_0)=p_0(x_0)$ is given.) …, while the … (The notation $()_{k \mid j}$ strictly …
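For orientation, a standard fact about expected quadratic forms is that $\mathbb{E}\, x^\top Q x = \mu^\top Q \mu + \mathrm{tr}(Q\Sigma)$, where $\mu$ and $\Sigma$ are the mean and covariance of $x$. The statement of Lemma 1 is cut off in this text, so this identity is offered only as background and is not claimed to be the lemma's exact form. The sketch below checks the identity exactly on a small, made-up discrete distribution; the points, probabilities, and $Q$ are hypothetical.

```python
import numpy as np


def quadratic_cost_from_moments(mean, cov, Q):
    """Evaluate E[x^T Q x] from the first two moments: mean^T Q mean + tr(Q cov)."""
    return float(mean @ Q @ mean + np.trace(Q @ cov))


# Hypothetical discrete distribution over three points in R^2 (illustration only).
points = np.array([[1.0, 0.0], [0.0, 2.0], [-1.0, 1.0]])
probs = np.array([0.5, 0.3, 0.2])
Q = np.array([[2.0, 0.5], [0.5, 1.0]])

# Exact mean and covariance of the discrete distribution.
mean = probs @ points
centered = points - mean
cov = (centered * probs[:, None]).T @ centered

# Direct evaluation of the expectation, point by point.
direct = float(sum(p * (x @ Q @ x) for p, x in zip(probs, points)))
via_moments = quadratic_cost_from_moments(mean, cov, Q)

print(direct, via_moments)
```

The two printed values agree up to floating-point rounding, which shows that a quadratic stage cost can be evaluated from the mean and covariance alone.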

Figures (4)

Figure 1: (Training stage): Block diagram illustrating the steps of the simulation-data collection stage: starting from a randomized initial condition $\eta_0$ and an injected, persistently exciting control sequence $\{u_k\}_{k \geq 0}$, then creating the data matrices $U_{data}$ and $\Psi_{data}$, then applying eDMD, and finally solving the LQR problem of the lifted system.
Figure 2: (Implementation stage): Block diagram illustrating the online implementation of the SOC-LQR control of the lifted system in closed loop with the original system (the "plant") \ref{eq:stateSpace}. The primed variables are functions of $y_k$, since the CE step \ref{eq:CE_step} is omitted in the implementation phase, according to Remark \ref{remark1}.
Figure 3: The SOC-LQR pushes the state to $\mathcal{H}_2$ (top), resulting in significantly lower estimation covariance (bottom).
Figure 4: SOC-LQR accurately controls and estimates the true state, whereas CE-LQR assumes the state is near zero while its estimation error is large and the true state wanders.
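The training-stage pipeline described in the Figure 1 caption (randomized initial conditions, exciting inputs, data matrices, eDMD, then LQR on the lifted system) can be sketched on a toy problem. Everything below is an assumption made for illustration: the two-state system, its parameters `a`, `b`, `c`, the dictionary in `lift`, and the cost weights are hypothetical, and the system is a generic deterministic example rather than the information-state dynamics used in the paper. The dictionary is chosen so that the lifted dynamics are exactly linear, which makes the fit easy to check.

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, c = 0.9, 0.5, 0.3  # hypothetical toy parameters


def step(x, u):
    """Toy nonlinear system: x1+ = a*x1, x2+ = b*x2 + c*x1**2 + u."""
    return np.array([a * x[0], b * x[1] + c * x[0] ** 2 + u])


def lift(x):
    """Hypothetical dictionary: in these coordinates the toy dynamics are linear."""
    return np.array([x[0], x[1], x[0] ** 2])


# 1) Data collection: random initial conditions and random (exciting) inputs.
Psi, Psi_next, U = [], [], []
for _ in range(20):
    x = rng.uniform(-1.0, 1.0, size=2)
    for _ in range(10):
        u = rng.uniform(-1.0, 1.0)
        x_next = step(x, u)
        Psi.append(lift(x))
        Psi_next.append(lift(x_next))
        U.append([u])
        x = x_next
Psi, Psi_next, U = np.array(Psi).T, np.array(Psi_next).T, np.array(U).T

# 2) eDMD with control: least-squares fit of Psi_next ~ A @ Psi + B @ U.
AB = Psi_next @ np.linalg.pinv(np.vstack([Psi, U]))
A, B = AB[:, :3], AB[:, 3:]


# 3) Discrete-time LQR on the lifted model via fixed-point Riccati iteration.
def dlqr(A, B, Q, R, iters=500):
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    # Gain computed from the final P.
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)


K = dlqr(A, B, np.eye(3), np.eye(1))


def control(x):
    """Lifted-state feedback u = -K lift(x)."""
    return float(-(K @ lift(x))[0])


print(np.round(A, 3), np.round(B, 3), np.round(K, 3))
```

A plain Riccati iteration is used here to keep the example dependent on NumPy only; a dedicated algebraic Riccati solver would be a reasonable substitute. For a system where no finite dictionary gives exactly linear lifted dynamics, the fitted `A` and `B` are only an approximation, and the quality of the resulting controller depends on the dictionary.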

Theorems & Definitions (12)

Lemma 1
proof
Proposition 1
proof
Remark 1
Proposition 2
Lemma 2
proof
Corollary 1
proof
...and 2 more
