Computation-Aware Kalman Filtering and Smoothing

Marvin Pförtner; Jonathan Wenger; Jon Cockayne; Philipp Hennig

TL;DR

The paper tackles the prohibitive cost of inference in high-dimensional linear-Gaussian state-space models by introducing computation-aware Kalman filtering and smoothing (CAKF/CAKS). These methods use matrix-free, iterative updates with low-dimensional observation projections and downdate truncation to dramatically reduce time and memory while embedding the approximation error into the posterior uncertainty. Theoretical results provide complexity bounds and a pointwise error bound showing the CAKS uncertainty upper-bounds prediction error, and experiments on synthetic data and a large ERA5 climate dataset demonstrate superior performance and scalability relative to ensemble methods and standard KF/RTS. This work offers a practical route to scalable, uncertainty-aware spatiotemporal regression and GP-style inference in very large state spaces, leveraging GPU acceleration and probabilistic numerics principles.
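To make the idea of a low-dimensional observation projection concrete, here is a minimal dense NumPy sketch. It is not the paper's matrix-free, iterative CAKF: the function name `projected_kalman_update`, the projection matrix `S`, and the toy dimensions are all illustrative assumptions. It only shows the underlying principle that conditioning on a projected observation `S.T @ y` instead of the full `y` gives a cheaper update whose posterior covariance is never smaller than the full Kalman posterior covariance.

```python
import numpy as np

def projected_kalman_update(m, P, H, R, y, S):
    """Condition N(m, P) on the projected observation S.T @ y (hypothetical helper).

    m: (D,) prior mean, P: (D, D) prior covariance,
    H: (N, D) observation matrix, R: (N, N) noise covariance,
    y: (N,) observation, S: (N, k) projection matrix with k <= N.
    """
    PHtS = P @ H.T @ S                   # (D, k) cross-covariance with projected data
    G = S.T @ (H @ P @ H.T + R) @ S      # (k, k) projected innovation covariance
    K = np.linalg.solve(G, PHtS.T).T     # gain PHtS @ inv(G); G is symmetric
    m_new = m + K @ (S.T @ (y - H @ m))  # mean update from the projected residual
    P_new = P - K @ PHtS.T               # downdate form: prior minus a rank-k term
    return m_new, P_new

# Toy problem (all sizes and values are assumptions for illustration).
rng = np.random.default_rng(0)
D, N, k = 6, 4, 2
A = rng.standard_normal((D, D))
P = A @ A.T + np.eye(D)                  # symmetric positive definite prior covariance
m = np.zeros(D)
H = rng.standard_normal((N, D))
R = 0.1 * np.eye(N)
y = rng.standard_normal(N)
S = rng.standard_normal((N, k))

m_proj, P_proj = projected_kalman_update(m, P, H, R, y, S)
m_full, P_full = projected_kalman_update(m, P, H, R, y, np.eye(N))  # S = I: full update

# The projected posterior is at least as uncertain as the full posterior.
gap = P_proj - P_full
gap_eigs = np.linalg.eigvalsh(0.5 * (gap + gap.T))
```

With `S` equal to the identity the function reduces to the standard Kalman update, so `k` acts as a budget: a smaller `k` means a smaller linear solve, and the information that was not used shows up as extra posterior variance.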

Abstract

Kalman filtering and smoothing are the foundational mechanisms for efficient inference in Gauss-Markov models. However, their time and memory complexities scale prohibitively with the size of the state space. This is particularly problematic in spatiotemporal regression problems, where the state dimension scales with the number of spatial observations. Existing approximate frameworks leverage low-rank approximations of the covariance matrix. But since they do not model the error introduced by the computational approximation, their predictive uncertainty estimates can be overly optimistic. In this work, we propose a probabilistic numerical method for inference in high-dimensional Gauss-Markov models which mitigates these scaling issues. Our matrix-free iterative algorithm leverages GPU acceleration and crucially enables a tunable trade-off between computational cost and predictive uncertainty. Finally, we demonstrate the scalability of our method on a large-scale climate dataset.

Paper Structure (47 sections, 16 theorems, 84 equations, 11 figures, 1 table, 7 algorithms)

INTRODUCTION
Challenges of a Large State-Space Dimension
Computation-Aware Filtering and Smoothing
BACKGROUND
Bayesian Inference in Linear-Gaussian State-Space Models
Spatiotemporal Regression
COMPUTATION-AWARE KALMAN FILTERING
From Matrix-y to Matrix-Free
Downdate Truncation
Choice of Policy
COMPUTATION-AWARE RTS SMOOTHING
THEORETICAL ANALYSIS
Computational Complexity
Error Bound for Spatiotemporal Regression
RELATED WORK
...and 32 more sections

Key Result

theorem 1

Let $\mathbb{Z} = [t_0, T] \times \mathbb{X}$ and define a space-time separable Gauss-Markov process $\bm{\mathrm{f}} \sim \mathcal{GP}(\bm{\mu}, \bm{\Sigma})$ such that its first component $\mathrm{f} \coloneqq \bm{\mathrm{f}}_0 \sim \ldots$ [...] If $\sigma^2 = 0$, this also holds for training inputs $\bm{z} \in \bm{Z}^{\textnormal{train}}$.

Figures (11)

Figure 1: $D = 14640$
Figure 2: $D = 231360$
Figure 4: Comparison of the CAKF, the EnKF, and two variants of the ETKF on on-model data with state-space dimension $D = 20000$ while varying the rank parameters that govern the computational budget of the algorithms. The CAKF significantly outperforms the other filter variants for high-dimensional state spaces.
Figure 5: Predictive mean, predictive standard deviation, and pointwise absolute error for an increasing maximal number of iterations $\check{N}^{\textnormal{max}} \ge \check{N}_k$ per time step on a synthetic spatiotemporal regression problem.
Figure 6: Work-precision diagrams for the CAKF and CAKS on the ERA5 climate dataset. The plot shows the mean squared error (MSE) and average negative log density (NLD) of the computation-aware filter and smoother for different problem sizes (i.e., state-space dimension) and number of iterations on the train and test set. The predictive error measured by test MSE decreases with larger problem sizes, while the test NLD increases. This is because we limit the computational budget and thus run fewer iterations for larger problems, i.e., we trade reduced computation cost for increased uncertainty.
...and 6 more figures

Theorems & Definitions (34)

theorem 1: Pointwise Worst-Case Prediction Error
definition B.1: Linear-Gaussian State-Space Model
theorem B.2: Kalman Filter
proposition B.3: Downdate-Form Kalman Filter
proof
theorem B.4: RTS Smoother
proposition B.5: Inverse-Free RTS Smoother
proof
lemma B.6: Matheron's Rule
proof
...and 24 more
