Numerically Robust Fixed-Point Smoothing Without State Augmentation
Nicholas Krämer
TL;DR
This work targets fixed-point smoothing for Gaussian state-space models with unknown initial conditions and highlights the memory and numerical-robustness gaps in existing methods. It proposes a novel recursion for $p(x_0 \mid x_K, y_{1:K})$ and implements it in a Cholesky-based form that avoids state augmentation, achieving $\mathcal{O}(1)$ memory with $\mathcal{O}(K D^3)$ runtime while remaining flexible across Gaussian parametrisations. The two concrete implementations (covariance-based and Cholesky-based) match the speed of the fastest fixed-point approaches and are more numerically robust in challenging settings, with the Cholesky-based version excelling in tests on probabilistic numerics and boundary-value problems. Three experiments (efficiency, robustness against ill-conditioning, and a parameter-estimation case study for a tracking problem) validate the memory savings, runtime competitiveness, and stability, suggesting practical impact for differential-equation solvers and parameter inference in dynamical systems.
Abstract
Practical implementations of Gaussian smoothing algorithms have received a great deal of attention over the last 60 years. However, almost all work focuses on estimating complete time series (``fixed-interval smoothing'', $\mathcal{O}(K)$ memory) through variations of the Rauch--Tung--Striebel smoother, and rarely on estimating the initial states (``fixed-point smoothing'', $\mathcal{O}(1)$ memory). Since fixed-point smoothing is a crucial component of algorithms for dynamical systems with unknown initial conditions, we close this gap by introducing a new formulation of a Gaussian fixed-point smoother. In contrast to prior approaches, our perspective admits a numerically robust Cholesky-based form (without downdates) and avoids state augmentation, which would needlessly inflate the state-space model and reduce the numerical practicality of any fixed-point smoother code. The experiments demonstrate that a JAX implementation of our algorithm matches the runtime of the fastest methods and the robustness of the most robust techniques, whereas existing implementations must always sacrifice one for the other.
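To illustrate what a Cholesky-based form without downdates can look like, the following is a minimal JAX sketch of the standard QR-based trick used in square-root filtering: given Cholesky factors $L_1$ and $L_2$, it computes a triangular factor of $A L_1 L_1^\top A^\top + L_2 L_2^\top$ without ever forming a covariance matrix. The function name `sum_of_cholesky_factors` and the example matrices are assumptions made for illustration; this is not necessarily how the paper's implementation is organised.

```python
import jax.numpy as jnp


def sum_of_cholesky_factors(A, L1, L2):
    """Return a lower-triangular L with L @ L.T == A @ L1 @ L1.T @ A.T + L2 @ L2.T.

    Hypothetical helper: the covariance matrices are never formed explicitly.
    """
    # Stack the transposed factors: stacked.T @ stacked equals the target covariance.
    stacked = jnp.concatenate([(A @ L1).T, L2.T], axis=0)  # shape (2D, D)
    # Reduced QR: stacked = Q @ R with orthonormal Q, so stacked.T @ stacked = R.T @ R.
    _, R = jnp.linalg.qr(stacked)
    # R is upper triangular, hence R.T is a lower-triangular factor.
    # Its diagonal may contain negative entries; L @ L.T is unaffected by the signs.
    return R.T


# Small illustrative example (values are arbitrary assumptions).
A = jnp.array([[1.0, 0.5], [0.0, 1.0]])
L1 = jnp.array([[1.0, 0.0], [0.2, 0.5]])
L2 = jnp.array([[0.3, 0.0], [0.1, 0.4]])
L = sum_of_cholesky_factors(A, L1, L2)
```

Because the factor is obtained from an orthogonal decomposition rather than from subtracting matrices, no Cholesky downdate is needed, which is the usual motivation for working with factors in ill-conditioned problems.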
