Coordinate ascent neural Kalman-MLE for state estimation

Bettina Hanlon, Angel Garcia Fernandez

TL;DR

CAN-Kalman-MLE addresses state estimation when both the dynamic and measurement models are unknown. It jointly learns neural approximations $f_{\theta_g}$ and $h_{\theta_l}$ of the dynamic and measurement functions, together with the noise covariances $Q$ and $R$, via coordinate ascent maximum likelihood estimation (MLE). Transitions are modelled as $g_{\theta_g}(x_k|x_{k-1}) = \mathcal{N}(x_k; f_{\theta_g}(x_{k-1}), Q)$ and measurements as $l_{\theta_l}(z_k|x_k) = \mathcal{N}(z_k; h_{\theta_l}(x_k), R)$. During training, $Q$ and $R$ are updated in closed form and the neural weights are optimized with SGD/Adam. After training, a non-linear Kalman filter (e.g., the UKF) runs with the learned models to estimate the state at test time. Experiments on bilateration-based tracking and a Lorenz attractor indicate that CAN-Kalman-MLE can match or exceed Kalman-MLE and often outperforms KalmanNet, approaching UKF performance when the models are well estimated. These results highlight the method's potential for principled, supervised learning in uncertain state-space models.
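To make the coordinate ascent structure concrete, the supervised objective implied by the two Gaussian models can be written out. This is a sketch under assumptions, not the paper's own derivation: it considers a single training trajectory with ground-truth states $x_{0:T}$ and measurements $z_{1:T}$, and the indexing is chosen here for illustration.

$$
\log p(x_{1:T}, z_{1:T} \mid x_0) = \sum_{k=1}^{T} \Big[ \log \mathcal{N}\big(x_k; f_{\theta_g}(x_{k-1}), Q\big) + \log \mathcal{N}\big(z_k; h_{\theta_l}(x_k), R\big) \Big]
$$

With $\theta_g$ and $\theta_l$ held fixed, the standard Gaussian result gives the maximizers over the covariances as mean outer products of the residuals:

$$
\hat{Q} = \frac{1}{T} \sum_{k=1}^{T} \big(x_k - f_{\theta_g}(x_{k-1})\big)\big(x_k - f_{\theta_g}(x_{k-1})\big)^{\top}, \qquad
\hat{R} = \frac{1}{T} \sum_{k=1}^{T} \big(z_k - h_{\theta_l}(x_k)\big)\big(z_k - h_{\theta_l}(x_k)\big)^{\top}
$$

With $Q$ and $R$ held fixed, no closed form exists for the network weights in general, which is consistent with the weights being updated by gradient steps while the covariances are updated analytically.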

Abstract

This paper presents a coordinate ascent algorithm to learn dynamic and measurement models in dynamic state estimation using maximum likelihood estimation in a supervised manner. In particular, the dynamic and measurement models are assumed to be Gaussian and the algorithm learns the neural network parameters that model the dynamic and measurement functions, and also the noise covariance matrices. The trained dynamic and measurement models are then used with a non-linear Kalman filter algorithm to estimate the state during the testing phase.
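The covariance step of such a scheme can be illustrated numerically. The sketch below assumes the closed-form update is the standard Gaussian maximum likelihood estimate (the mean outer product of the residuals between ground-truth states and model predictions); the paper's exact update is not reproduced in this section. The function name `closed_form_covariance`, the linear stand-in for the learned dynamic function, and all numerical values are hypothetical.

```python
import numpy as np

def closed_form_covariance(targets, predictions):
    """Gaussian MLE of a noise covariance from stacked residuals.

    targets, predictions: arrays of shape (N, d). The divisor is N
    (the maximum likelihood estimate), not N - 1.
    """
    residuals = np.asarray(targets) - np.asarray(predictions)
    return residuals.T @ residuals / residuals.shape[0]

# Hypothetical stand-in for a trained dynamic function f(x_{k-1}).
def f_stand_in(x_prev):
    return 0.9 * x_prev

rng = np.random.default_rng(0)
Q_true = np.array([[0.04, 0.0], [0.0, 0.09]])   # assumed process noise covariance
N = 20000                                        # number of (x_{k-1}, x_k) training pairs

x_prev = rng.normal(size=(N, 2))
noise = rng.multivariate_normal(np.zeros(2), Q_true, size=N)
x_next = f_stand_in(x_prev) + noise

# With the model fixed, the covariance estimate is available in closed form.
Q_hat = closed_form_covariance(x_next, f_stand_in(x_prev))
print(np.round(Q_hat, 3))
```

The same function would apply to the measurement noise covariance by passing measurements and predicted measurements instead of states. In a coordinate ascent loop, this step would alternate with gradient updates of the network weights.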

Paper Structure

This paper contains 12 sections, 15 equations, 3 figures, 4 tables, 1 algorithm.

Figures (3)

  • Figure 1: Block diagram showing the architecture of the multi-layer perceptron used to learn the dynamic function $f(\cdot)$. It consists of three fully connected layers, two ReLU activation functions, and a dropout layer after the first ReLU. The exact inputs and outputs depend on the scenario.
  • Figure 2: RMSE at every time step for Scenario 1 with $T=50$, $\sigma_u^2 = 0.001$, and $\sigma_r^2 = 0.001$.
  • Figure 3: RMSE at every time step for Scenario 1 with $T=50$, $\sigma_u^2 = 0.1$, and $\sigma_r^2 = 1$.
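The architecture described in the Figure 1 caption can be sketched as a forward pass: three fully connected layers, a ReLU after each of the first two, and dropout after the first ReLU. This is an illustrative NumPy version only. The layer widths, dropout rate, initialisation, and the names `init_mlp` and `mlp_forward` are assumptions, since the caption states that the inputs and outputs depend on the scenario.

```python
import numpy as np

def init_mlp(rng, n_in, n_hidden, n_out):
    """Random weights and zero biases for three fully connected layers (sizes assumed)."""
    sizes = [(n_in, n_hidden), (n_hidden, n_hidden), (n_hidden, n_out)]
    return [(rng.normal(0.0, np.sqrt(2.0 / a), size=(a, b)), np.zeros(b))
            for a, b in sizes]

def mlp_forward(params, x, rng=None, p_drop=0.1, training=False):
    """FC -> ReLU -> Dropout -> FC -> ReLU -> FC, following the Figure 1 caption."""
    (W1, b1), (W2, b2), (W3, b3) = params
    h = np.maximum(x @ W1 + b1, 0.0)            # first fully connected layer + ReLU
    if training:
        # Inverted dropout: zero units with probability p_drop, rescale the rest.
        mask = rng.random(h.shape) >= p_drop
        h = h * mask / (1.0 - p_drop)
    h = np.maximum(h @ W2 + b2, 0.0)            # second fully connected layer + ReLU
    return h @ W3 + b3                          # linear output layer

rng = np.random.default_rng(0)
params = init_mlp(rng, n_in=4, n_hidden=32, n_out=4)   # hypothetical dimensions
x_prev = rng.normal(size=(5, 4))                        # batch of 5 previous states
x_pred = mlp_forward(params, x_prev)                    # evaluation mode, no dropout
print(x_pred.shape)
```

Dropout is applied only when `training=True`, so predictions used inside a filter at test time are deterministic.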