Scalable Data Assimilation with Message Passing
Oscar Key, So Takao, Daniel Giles, Marc Peter Deisenroth
TL;DR
This work reframes data assimilation (DA) for large-scale numerical weather prediction as Bayesian inference on a Gaussian Markov random field (GMRF). The GMRF is obtained by discretising the SPDE representation of a Matérn Gaussian process (GP) prior, and inference is carried out by message passing on the resulting factor graph. The approach enables distributed, asynchronous computation with minimal inter-domain communication. A GPU-accelerated implementation achieves accuracy and efficiency competitive with a GPU-accelerated 3D-Var baseline, both on simulated data and on a large-scale global surface temperature example. Key innovations are the incorporation of observations into the factor graph, together with damping, early stopping, and a multigrid strategy that accelerate convergence; with these, the method performs well on grids with up to millions of points. Limitations include unreliable posterior uncertainties on loopy graphs and a restriction to Gaussian priors. The authors discuss extensions to spatio-temporal DA and non-Gaussian priors. Overall, the paper presents message passing as a scalable alternative to variational methods that suits modern heterogeneous computing architectures.
Abstract
Data assimilation is a core component of numerical weather prediction systems. The large quantity of data processed during assimilation requires the computation to be distributed across increasingly many compute nodes, yet existing approaches suffer from synchronisation overhead in this setting. In this paper, we exploit the formulation of data assimilation as a Bayesian inference problem and apply a message-passing algorithm to solve the spatial inference problem. Since message passing is inherently based on local computations, this approach lends itself to parallel and distributed computation. In combination with a GPU-accelerated implementation, we can scale the algorithm to very large grid sizes while retaining good accuracy and reasonable compute and memory requirements.
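
To make the "local computations" concrete, the sketch below runs Gaussian belief propagation on a one-dimensional chain GMRF with noisy point observations. It is an illustration only, not the paper's implementation: the tridiagonal prior precision `kappa**2 * I + L` (with `L` the chain graph Laplacian) is a simplified stand-in for an SPDE-derived prior, the observations are synthetic, and all names (`chain_gabp`, `damping`, `n_iters`) are hypothetical. Each message update uses only quantities from a node and its immediate neighbour.

```python
import numpy as np


def chain_gabp(diag, off, b, n_iters, damping=0.0):
    """Gaussian belief propagation on a chain (illustrative sketch).

    The posterior is p(x) ∝ exp(-0.5 x^T A x + b^T x), where A is tridiagonal
    with main diagonal `diag` (length n) and off-diagonal `off` (length n-1).
    Messages are kept in information form (precision P, shift h).
    Returns the marginal means and variances.
    """
    m = len(diag) - 1                           # number of edges
    fwd_P, fwd_h = np.zeros(m), np.zeros(m)     # message i -> i+1
    bwd_P, bwd_h = np.zeros(m), np.zeros(m)     # message i+1 -> i
    zero = np.zeros(1)

    for _ in range(n_iters):
        # Cavity at node i for the message to i+1: local terms plus the
        # message arriving from i-1 (nothing arrives at node 0 from the left).
        cav_P = diag[:-1] + np.concatenate([zero, fwd_P[:-1]])
        cav_h = b[:-1] + np.concatenate([zero, fwd_h[:-1]])
        new_fwd_P = -off**2 / cav_P
        new_fwd_h = -off * cav_h / cav_P

        # Cavity at node i+1 for the message to i: local terms plus the
        # message arriving from i+2 (nothing arrives at the last node).
        cav_P = diag[1:] + np.concatenate([bwd_P[1:], zero])
        cav_h = b[1:] + np.concatenate([bwd_h[1:], zero])
        new_bwd_P = -off**2 / cav_P
        new_bwd_h = -off * cav_h / cav_P

        # Synchronous update; damping blends old and new messages.
        fwd_P = damping * fwd_P + (1 - damping) * new_fwd_P
        fwd_h = damping * fwd_h + (1 - damping) * new_fwd_h
        bwd_P = damping * bwd_P + (1 - damping) * new_bwd_P
        bwd_h = damping * bwd_h + (1 - damping) * new_bwd_h

    # Marginal at each node: local terms plus all incoming messages.
    post_P, post_h = diag.copy(), b.copy()
    post_P[1:] += fwd_P
    post_h[1:] += fwd_h
    post_P[:-1] += bwd_P
    post_h[:-1] += bwd_h
    return post_h / post_P, 1.0 / post_P


# Assumed toy problem: zero-mean prior with precision kappa^2 I + L.
n, kappa, noise_var = 40, 0.5, 0.1
degree = np.full(n, 2.0)
degree[[0, -1]] = 1.0
diag = kappa**2 + degree
off = -np.ones(n - 1)
b = np.zeros(n)

# Synthetic observations at every fourth grid point (assumed, not from the paper).
rng = np.random.default_rng(0)
obs_idx = np.arange(0, n, 4)
y = np.sin(2 * np.pi * obs_idx / n) + np.sqrt(noise_var) * rng.standard_normal(len(obs_idx))
diag[obs_idx] += 1.0 / noise_var    # each observation adds precision locally
b[obs_idx] += y / noise_var         # and shifts the local information vector

mean, var = chain_gabp(diag, off, b, n_iters=n)

# Dense reference solution, feasible only because the toy problem is small.
A = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
mean_ref = np.linalg.solve(A, b)
var_ref = np.diag(np.linalg.inv(A))
print(np.max(np.abs(mean - mean_ref)), np.max(np.abs(var - var_ref)))
```

Because a chain has no loops, the messages reach their fixed point after at most `n - 1` undamped sweeps, and the resulting means and variances match the dense solve. On the loopy two-dimensional grids that arise from a spatial discretisation, this exactness no longer holds for the variances, which is where damping and the other convergence aids mentioned in the summary become relevant.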
