Extended Kalman Filtering for Recursive Online Discrete-Time Inverse Optimal Control

Tian Zhao; Timothy L. Molloy

TL;DR

This work formulates the discrete-time inverse optimal control problem, that is, inferring unknown parameters in the objective function of an optimal control problem from measurements of optimal states and controls, as a nonlinear filtering problem. It then proposes a novel extended Kalman filter (EKF) that solves inverse optimal control problems in a computationally efficient, recursive, online manner.

Abstract

We formulate the discrete-time inverse optimal control problem of inferring unknown parameters in the objective function of an optimal control problem from measurements of optimal states and controls as a nonlinear filtering problem. This formulation enables us to propose a novel extended Kalman filter (EKF) for solving inverse optimal control problems in a computationally efficient recursive online manner that requires only a single pass through the measurement data. Importantly, we show that the Jacobians required to implement our EKF can be computed efficiently by exploiting recent Pontryagin differentiable programming results, and that our consideration of an EKF enables the development of first-of-their-kind theoretical error guarantees for online inverse optimal control with noisy incomplete measurements. Our proposed EKF is shown to be significantly faster than an alternative unscented Kalman filter-based approach.
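The filtering view described in the abstract can be illustrated with a minimal sketch, assuming the unknown parameters are treated as a constant (or slowly drifting) state and each measurement is a nonlinear function of them plus noise. Everything named here is hypothetical and for illustration only: `measurement_model` is a toy stand-in for the map from parameters to optimal states and controls, and `numerical_jacobian` uses finite differences, whereas the paper computes its Jacobians via Pontryagin differentiable programming.

```python
import numpy as np


def measurement_model(theta, t):
    """Toy nonlinear measurement (hypothetical; not the paper's optimal control map)."""
    return np.array([theta[0] * np.sin(0.3 * t)
                     + theta[1] * np.cos(0.3 * t)
                     + 0.1 * theta[0] * theta[1]])


def numerical_jacobian(h, theta, eps=1e-6):
    """Central-difference Jacobian of h at theta (assumed stand-in for an analytic Jacobian)."""
    theta = np.asarray(theta, dtype=float)
    y0 = np.atleast_1d(h(theta))
    J = np.zeros((y0.size, theta.size))
    for i in range(theta.size):
        d = np.zeros_like(theta)
        d[i] = eps
        J[:, i] = (np.atleast_1d(h(theta + d)) - np.atleast_1d(h(theta - d))) / (2 * eps)
    return J


def ekf_parameter_step(theta_hat, P, y, h, Q, R):
    """One EKF step for a parameter modelled as theta_{t} = theta_{t-1} + process noise."""
    # Predict: the estimate is unchanged, the covariance grows by Q.
    P_pred = P + Q
    # Linearise the measurement function around the current estimate.
    H = numerical_jacobian(h, theta_hat)
    S = H @ P_pred @ H.T + R                    # innovation covariance
    K = np.linalg.solve(S, H @ P_pred).T        # gain P_pred H^T S^{-1} (S, P_pred symmetric)
    theta_new = theta_hat + K @ (y - h(theta_hat))
    P_new = (np.eye(theta_hat.size) - K @ H) @ P_pred
    return theta_new, P_new


def run_demo(steps=200, seed=0):
    """Single pass over simulated noisy measurements; returns the final estimate and covariance."""
    rng = np.random.default_rng(seed)
    theta_true = np.array([1.0, 2.0])           # assumed ground truth for the toy problem
    theta_hat = np.array([0.5, 1.0])            # deliberately wrong initial guess
    P = np.eye(2)
    Q = 1e-6 * np.eye(2)
    noise_std = 0.05
    R = noise_std ** 2 * np.eye(1)
    for t in range(steps):
        y = measurement_model(theta_true, t) + noise_std * rng.standard_normal(1)
        h = lambda th, t=t: measurement_model(th, t)
        theta_hat, P = ekf_parameter_step(theta_hat, P, y, h, Q, R)
    return theta_hat, P


if __name__ == "__main__":
    estimate, covariance = run_demo()
    print("final estimate:", estimate)
```

Each measurement is processed once and then discarded, which is what makes the scheme recursive and online. Swapping the finite-difference Jacobian for an analytic one changes only `numerical_jacobian`; the update equations stay the same.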

Paper Structure (14 sections, 2 theorems, 17 equations, 2 figures, 1 table, 2 algorithms)

Introduction
Problem Formulation
Proposed Extended Kalman Filter for Online Inverse Optimal Control
Online Inverse Optimal Control as Nonlinear Filtering
EKF Algorithm
Jacobian via Pontryagin Differentiable Programming
Proposed Algorithm Summary
Error Analysis and Guarantee
Simulation Results
Benchmark Problems
Noise Simulations
Computational Efficiency
Complete versus Incomplete Measurement Simulation
Conclusion

Key Result

Proposition III.1

Suppose that $H_k^{uu}$ is invertible for all $0 \leq k \leq T-1$. Then $X_{0:T}^{\hat{\theta}_{t-1}} \triangleq \{X_k(\hat{\theta}_{t-1}) : 0 \leq k \leq T \}$ and $U_{0:T-1}^{\hat{\theta}_{t-1}} \triangleq \{ U_k(\hat{\theta}_{t-1}) : 0 \leq k \leq T-1 \}$ can be obtained via the recursions [...] for $0 \leq k \leq T-1$, where $\mathcal{P}_T = H_T^{xx}$ and $\mathcal{W}_T = H_T^{xe}$, together with [...]

Figures (2)

Figure 1: Performance of our proposed EKF on benchmark problems with ground truth (GT) parameters: (a) Single pendulum with ground truth parameters $[\theta_1 = 1, \theta_2 = 10]$; (b) Cart pole with ground truth parameters $[\theta_1 = 2, \theta_2 = 4, \theta_3 = 1.5, \theta_4 = 1]$; (c) Quadrotor with ground truth parameters $[\theta_1 = 1.0, \theta_2 = 1.5, \theta_3 = 2, \theta_4 = 0.5]$; (d) Robot arm with ground truth parameters $[\theta_1 = 1.0, \theta_2 = 1.5, \theta_3 = 2, \theta_4 = 0.5]$; (e) Rocket powered landing with ground truth parameters $[\theta_1 = 1.0, \theta_2 = 1.5, \theta_3 = 2, \theta_4 = 2.5, \theta_5 = 5]$; (f) Execution time per time step for the EKF and UKF.
Figure 2: Estimated parameters from our proposed EKF with complete and incomplete measurements for solving the single pendulum benchmark problem.

Theorems & Definitions (5)

Proposition III.1
proof
Definition IV.1: Mean-Squared Boundedness [reif1999stochastic]
Proposition IV.1
proof
