4D-Var using Hessian approximation and backpropagation applied to automatically-differentiable numerical and machine learning models
Kylen Solvik, Stephen G. Penny, Stephan Hoyer
TL;DR
The paper addresses a practical bottleneck of 4D-Var data assimilation, which traditionally requires developing and maintaining tangent linear and adjoint models. It introduces Backprop-4DVar, which combines a Gauss-Newton-like Hessian approximation with backpropagation inside an automatic differentiation framework. The method can be applied to any differentiable forecast model, including ML surrogates, and uses JAX for the gradient and Hessian computations, which simplifies implementation and reduces compute cost. In experiments on Lorenz-96 and two-layer quasi-geostrophic dynamics (including a reservoir computing surrogate), it achieves RMSE comparable to standard 4D-Var with substantial speedups and near-linear scaling as system size increases. The work also offers practical guidelines for tuning the learning rate via Bayesian optimization and points to deeper integration of differentiable modeling and data assimilation in next-generation weather forecasting systems.
Abstract
Constraining a numerical weather prediction (NWP) model with observations via 4D variational (4D-Var) data assimilation is often difficult to implement in practice due to the need to develop and maintain a software-based tangent linear model and adjoint model. One of the most common 4D-Var algorithms uses an incremental update procedure, which has been shown to be an approximation of the Gauss-Newton method. Here we demonstrate that when using a forecast model that supports automatic differentiation, an efficient and in some cases more accurate alternative approximation of the Gauss-Newton method can be applied by combining backpropagation of errors with Hessian approximation. This approach can be used with either a conventional numerical model implemented within a software framework that supports automatic differentiation, or a machine learning (ML) based surrogate model. We test the new approach on a variety of Lorenz-96 and quasi-geostrophic models. The results indicate potential for a deeper integration of modeling, data assimilation, and new technologies in a next generation of operational forecast systems that leverage weather models designed to support automatic differentiation.
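
To make the idea concrete, the following is a minimal JAX sketch of one way a backpropagated gradient can be combined with a Gauss-Newton-style Hessian approximation on a small Lorenz-96 system. It is an illustration under stated assumptions, not the authors' implementation: the function names, the identity observation operator, the diagonal covariances, the window length, and the damped update with learning rate `lr` are all hypothetical choices made for this example.

```python
import jax
import jax.numpy as jnp


def l96_tendency(x, forcing=8.0):
    # Lorenz-96: dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F
    return (jnp.roll(x, -1) - jnp.roll(x, 2)) * jnp.roll(x, 1) - x + forcing


def step(x, dt=0.01):
    # One fourth-order Runge-Kutta step of the forecast model.
    k1 = l96_tendency(x)
    k2 = l96_tendency(x + 0.5 * dt * k1)
    k3 = l96_tendency(x + 0.5 * dt * k2)
    k4 = l96_tendency(x + dt * k3)
    return x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def forecast(x0, n_steps):
    # Returns the state after each step, shape (n_steps, n).
    def body(x, _):
        x_next = step(x)
        return x_next, x_next

    _, traj = jax.lax.scan(body, x0, None, length=n_steps)
    return traj


def cost(x0, xb, obs, b_inv, r_inv):
    # 4D-Var cost: background term plus observation term over the window.
    # Assumption: every state variable is observed at every step (H = I).
    db = x0 - xb
    d = forecast(x0, obs.shape[0]) - obs
    return 0.5 * db @ b_inv @ db + 0.5 * jnp.einsum("ti,ij,tj->", d, r_inv, d)


def gauss_newton_hessian(x0, n_steps, b_inv, r_inv):
    # Jacobian of each forecast state w.r.t. x0, shape (n_steps, n, n),
    # obtained by automatic differentiation instead of a hand-coded
    # tangent linear model.
    jac = jax.jacfwd(forecast)(x0, n_steps)
    # B^{-1} + sum_t M_t^T R^{-1} M_t (second-derivative terms are dropped).
    return b_inv + jnp.einsum("tki,kl,tlj->ij", jac, r_inv, jac)


def update(x0, xb, obs, b_inv, r_inv, lr=0.5):
    # Gradient via reverse-mode AD (backpropagation through the forecast).
    grad = jax.grad(cost)(x0, xb, obs, b_inv, r_inv)
    hess = gauss_newton_hessian(x0, obs.shape[0], b_inv, r_inv)
    # Damped Gauss-Newton-like step; lr is a tunable learning rate.
    return x0 - lr * jnp.linalg.solve(hess, grad)


# Small synthetic experiment (all values are illustrative).
n, n_steps = 8, 5
truth = 8.0 + jnp.sin(jnp.arange(n, dtype=jnp.float32))
obs = forecast(truth, n_steps)                 # noise-free observations
xb = truth + 0.5 * jnp.cos(jnp.arange(n, dtype=jnp.float32))  # background
b_inv = jnp.eye(n)                             # B = I
r_inv = jnp.eye(n) / 0.01                      # R = 0.01 I

cost_before = float(cost(xb, xb, obs, b_inv, r_inv))
x_analysis = xb
for _ in range(3):
    x_analysis = update(x_analysis, xb, obs, b_inv, r_inv)
cost_after = float(cost(x_analysis, xb, obs, b_inv, r_inv))
```

In this sketch the only model-specific code is `l96_tendency`; the gradient and the Jacobians come from `jax.grad` and `jax.jacfwd`, so the same pattern would in principle apply to any forecast step written in JAX, including an ML surrogate. For larger systems, forming the full Jacobian explicitly as done here would likely be too expensive, and the choice of `lr` would need tuning.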
