Learning Optimal Filters Using Variational Inference
Eviatar Bach, Ricardo Baptista, Enoch Luk, Andrew Stuart
TL;DR
This paper introduces a variational inference framework for learning parameterized analysis maps in data assimilation, with the aim of better approximating the filtering distribution in high-dimensional nonlinear systems. By formulating an objective that minimizes the KL divergence between the approximate and forecast posteriors while incorporating a likelihood term, the authors derive offline, online, and sample-based formulations for learning an analysis map $A_\theta$ (or an equivalent transport map $T_\theta$) that updates forecasts with observations. The framework is validated on linear and nonlinear models, including a stable linear system, Lorenz '96, and Kuramoto–Sivashinsky, demonstrating the learning of a fixed Kalman gain as well as inflation and localization parameters in an EnKF, and illustrating the potential to design new learned filtering algorithms. Overall, the method provides a probabilistically principled route to tailoring data assimilation operators to data, reducing bias and enabling flexible, learned filters with practical impact for weather, climate, and related applications.
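As a concrete illustration of learning a fixed gain in an affine analysis map, here is a minimal sketch for a scalar linear-Gaussian system. All parameter values are illustrative assumptions, and a simple grid search stands in for the paper's gradient-based variational optimization; for this linear case, the mean-square-optimal fixed gain should approximately recover the steady-state Kalman gain obtained from the Riccati recursion.

```python
import numpy as np

rng = np.random.default_rng(0)

# Scalar linear-Gaussian system: x_{k+1} = a x_k + w_k,  y_k = x_k + v_k
# (illustrative parameter choices, not taken from the paper)
a, Q, R = 0.9, 0.1, 0.1
T = 10000

# Simulate a truth trajectory and noisy observations.
x = np.zeros(T)
for k in range(1, T):
    x[k] = a * x[k - 1] + rng.normal(scale=np.sqrt(Q))
y = x + rng.normal(scale=np.sqrt(R), size=T)

def filter_mse(K):
    """Run the affine analysis map m_a = m_f + K (y - m_f) along the
    trajectory and return the mean-squared analysis error vs. the truth."""
    m, err = 0.0, 0.0
    for k in range(T):
        m_f = a * m                      # forecast mean
        m = m_f + K * (y[k] - m_f)       # analysis: affine map with gain K
        err += (m - x[k]) ** 2
    return err / T

# "Learn" the fixed gain by grid search (a stand-in for the paper's
# variational, gradient-based optimization).
gains = np.linspace(0.0, 0.99, 100)
K_learned = gains[np.argmin([filter_mse(K) for K in gains])]

# Reference: steady-state Kalman gain from iterating the Riccati recursion.
P = 1.0
for _ in range(200):
    P_f = a ** 2 * P + Q
    K_star = P_f / (P_f + R)
    P = (1 - K_star) * P_f

print(K_learned, K_star)
```

The learned gain should land near the steady-state Kalman gain; in the paper's setting, the same idea is carried out with a variational objective rather than mean-squared error, and extends to nonlinear dynamics where no closed-form gain exists.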
Abstract
Filtering, the task of estimating the conditional distribution of the states of a dynamical system given partial and noisy observations, is important in many areas of science and engineering, including weather and climate prediction. However, the filtering distribution is generally intractable to obtain for high-dimensional, nonlinear systems. Filters used in practice, such as the ensemble Kalman filter (EnKF), provide biased probabilistic estimates for nonlinear systems and have numerous tuning parameters. Here, we present a framework for learning a parameterized analysis map, the transformation that takes samples from a forecast distribution and combines them with an observation to update the approximate filtering distribution, using variational inference. In principle, this can lead to a better approximation of the filtering distribution, and hence to smaller bias. We show that this methodology can be used to learn the gain matrix, in an affine analysis map, for filtering linear and nonlinear dynamical systems; we also study the learning of inflation and localization parameters for an EnKF. The framework developed here can also be used to learn new filtering algorithms with more general forms for the analysis map.
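To show where the inflation and localization parameters enter, here is a hedged sketch of a standard perturbed-observation EnKF analysis step; the function name, the Gaussian localization kernel, and all parameter values are illustrative assumptions, not taken from the paper. In the paper's framework, learning would amount to tuning parameters such as `alpha` and `loc_radius` against the variational objective.

```python
import numpy as np

def enkf_analysis(X, y, H, R, alpha=1.05, loc_radius=2.0, rng=None):
    """One perturbed-observation EnKF analysis step with multiplicative
    inflation `alpha` and a Gaussian-shaped covariance localization.
    X: (d, N) forecast ensemble; y: (p,) observation; H: (p, d); R: (p, p).
    """
    rng = np.random.default_rng() if rng is None else rng
    d, N = X.shape
    m = X.mean(axis=1, keepdims=True)
    X = m + alpha * (X - m)                        # multiplicative inflation
    A = (X - X.mean(axis=1, keepdims=True)) / np.sqrt(N - 1)
    C = A @ A.T                                    # sample forecast covariance
    i = np.arange(d)
    L = np.exp(-((i[:, None] - i[None, :]) ** 2) / (2 * loc_radius ** 2))
    C = L * C                                      # localization (Schur product)
    K = C @ H.T @ np.linalg.inv(H @ C @ H.T + R)   # Kalman gain
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, size=N).T
    return X + K @ (Y - H @ X)                     # affine analysis map per member

# Hypothetical demo: 4-state system, fully observed.
rng = np.random.default_rng(1)
X_f = rng.normal(size=(4, 50)) + 5.0               # forecast ensemble far from truth
y = np.zeros(4)                                    # observation near the origin
H, R = np.eye(4), 0.1 * np.eye(4)
X_a = enkf_analysis(X_f, y, H, R, rng=rng)
```

In this fully observed demo, the analysis ensemble mean is pulled from the forecast mean toward the observation, as expected of the affine analysis map discussed above.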
