Nonlinear Kalman Filtering based on Self-Attention Mechanism and Lattice Trajectory Piecewise Linear Approximation

Jiaming Wang; Xinyu Geng; Jun Xu

TL;DR

The paper tackles the sensitivity of Kalman filtering to model and noise inaccuracies in nonlinear systems. It introduces the attention Kalman filter (AtKF), which embeds a simplified self-attention network to learn an adaptive Kalman gain from historical data, and pairs it with a batch-based pre-training scheme that uses the lattice trajectory piecewise linear (LTPWL) expression to avoid unstable recursive training. Key contributions include a concrete AtKF architecture, a batch data generation method via LTPWL, and experiments on a 2D nonlinear system showing improved robustness to noise and model mismatch. The approach augments traditional Kalman filtering with data-driven learning while preserving interpretability and allowing parallelizable training.

Abstract

The traditional Kalman filter (KF) is widely applied in control systems, but it relies heavily on the accuracy of the system model and noise parameters, so its performance can degrade when these are inaccurate. Introducing neural networks into the KF framework offers a data-driven way to compensate for such inaccuracies, improving the filter's performance while maintaining interpretability. Nevertheless, existing studies mostly employ recurrent neural networks (RNNs), which fail to fully capture the dependencies among state sequences and lead to an unstable training process. In this paper, we propose a novel Kalman filtering algorithm named the attention Kalman filter (AtKF), which incorporates a self-attention network to capture the dependencies among state sequences. To address the instability of the recursive training process, a parallel pre-training strategy is devised. Specifically, this strategy piecewise linearizes the system via the lattice trajectory piecewise linear (LTPWL) expression and generates pre-training data through a batch estimation algorithm, which exploits the parallel processing ability of the self-attention mechanism. Experimental results on a two-dimensional nonlinear system demonstrate that AtKF outperforms other filters under noise disturbances and model mismatches.
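To make the idea of an attention-derived gain concrete, the following is a minimal sketch, not the AtKF architecture itself. It applies standard scaled dot-product self-attention to a short history window and maps the latest output to a gain matrix used in the usual Kalman correction step. The input features (past innovations), the linear read-out `Wo`, the dimensions, and the toy measurement function `h` are all assumptions made for illustration; the weights are random and untrained.

```python
import numpy as np

def softmax(scores):
    # Row-wise softmax, shifted by the row maximum for numerical stability.
    shifted = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=-1, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Scaled dot-product self-attention over a (T, d_in) history window.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    weights = softmax(Q @ K.T / np.sqrt(Q.shape[-1]))
    return weights @ V, weights

def gain_from_history(X, params, n, m):
    # Hypothetical read-out: the attention output at the latest time step
    # is mapped linearly to an n-by-m gain matrix.
    out, _ = self_attention(X, params["Wq"], params["Wk"], params["Wv"])
    return (out[-1] @ params["Wo"]).reshape(n, m)

def kf_update(x_prior, y, h, K):
    # Kalman-style correction: prior plus gain times innovation.
    return x_prior + K @ (y - h(x_prior))

rng = np.random.default_rng(0)
n, m, T, d = 2, 1, 5, 4  # state dim, measurement dim, window length, width
params = {
    "Wq": rng.normal(size=(m, d)),
    "Wk": rng.normal(size=(m, d)),
    "Wv": rng.normal(size=(m, d)),
    "Wo": rng.normal(size=(d, n * m)),
}
innovations = rng.normal(size=(T, m))  # placeholder history features

def h(x):
    # Toy measurement function (assumption): observe the first state component.
    return x[:1]

K = gain_from_history(innovations, params, n, m)
x_post = kf_update(np.array([0.5, -0.2]), np.array([0.7]), h, K)
```

Because every row of the attention computation is independent of the others, a whole window can be processed in one matrix product, which is the property the pre-training strategy relies on.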

Paper Structure (18 sections, 1 theorem, 17 equations, 4 figures, 3 tables)

This paper contains 18 sections, 1 theorem, 17 equations, 4 figures, 3 tables.

INTRODUCTION
PRELIMINARIES
Self-attention Mechanism
Lattice Trajectory Piecewise Linear Expression
Kalman Filtering Algorithm with Attention Mechanism
System Model
Overall Architecture
Network Structure
Network Training
Training
PreTraining
EXPERIMENTS
System Function and Parameters
Experimental Setup
Results and Analysis
...and 3 more sections

Key Result

Lemma 1

Define vectors $z$ and $x$ as in (xz_definition), where $u_{k+1} = f(x_k) - \left.\frac{\partial f}{\partial x}\right|_{x_k} x_k$, $\bar{y}_k = y_k - \left( h(x_k) - \left.\frac{\partial h}{\partial x}\right|_{x_k} x_k \right)$, $A_k = \left.\frac{\partial f}{\partial x}\right|_{x_k}$, and $C_k = \left.\frac{\partial h}{\partial x}\right|_{x_k}$. $Q_k$ and $R_k$ are the noise covariance matrices. Here, $\check{x}_1$ represents the prior estimate ...
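The linearization terms in Lemma 1 can be computed directly once $f$ and $h$ are given. The sketch below is illustrative only: the two-dimensional functions `f` and `h`, the linearization point `x_k`, and the measurement `y_k` are hypothetical, and the Jacobians are approximated by central finite differences rather than by the LTPWL expression used in the paper.

```python
import numpy as np

def jacobian(func, x, eps=1e-6):
    # Central finite-difference Jacobian of func at x, shape (n_out, n_in).
    x = np.asarray(x, dtype=float)
    cols = []
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        cols.append((func(x + step) - func(x - step)) / (2 * eps))
    return np.stack(cols, axis=1)

def linearize(f, h, x_k, y_k):
    # Terms from Lemma 1, evaluated at the linearization point x_k.
    A_k = jacobian(f, x_k)              # A_k = df/dx at x_k
    C_k = jacobian(h, x_k)              # C_k = dh/dx at x_k
    u_next = f(x_k) - A_k @ x_k         # u_{k+1} = f(x_k) - A_k x_k
    y_bar = y_k - (h(x_k) - C_k @ x_k)  # y_bar_k = y_k - (h(x_k) - C_k x_k)
    return A_k, C_k, u_next, y_bar

def f(x):
    # Hypothetical 2D state transition (not the system from the paper).
    return np.array([0.9 * x[0] + 0.1 * np.sin(x[1]),
                     0.8 * x[1] + 0.05 * x[0] ** 2])

def h(x):
    # Hypothetical scalar measurement function.
    return np.array([x[0] ** 2 + x[1]])

x_k = np.array([0.4, -0.3])
y_k = np.array([0.1])
A_k, C_k, u_next, y_bar = linearize(f, h, x_k, y_k)
```

With these definitions the affine model $A_k x + u_{k+1}$ reproduces $f(x_k)$ exactly at $x = x_k$, and $\bar{y}_k - C_k x_k$ equals the nonlinear residual $y_k - h(x_k)$, which is what lets the linear batch estimator stand in for the nonlinear system near the linearization point.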

Figures (4)

Figure 1: (a) Self-attention Mechanism, (b) Simplified Attention Network.
Figure 2: An example of lattice trajectory piecewise linear expression.
Figure 3: (a) Overall Architecture, (b) Self-attention Mechanism Network.
Figure 4: The true state and the estimated state (dimension 1) from different filters for data selected in the test dataset.

Theorems & Definitions (2)

Lemma 1
proof
