An Introduction to Solving the Least-Squares Problem in Variational Data Assimilation
I. Daužickaitė, M. A. Freitag, S. Gürol, A. S. Lawless, A. Ramage, J. A. Scott, J. M. Tabeart
TL;DR
This paper frames variational data assimilation (VarDA) as a sequence of large-scale generalized nonlinear least-squares problems and offers a unified numerical linear algebra perspective on it. It clarifies the weak-constraint and strong-constraint 4D-Var formulations, analyzes direct and iterative solution strategies, and sets out a preconditioning framework (first-level, second-level, and augmented-system approaches) that lets Krylov solvers remain effective under tight computational budgets. It also discusses preconditioning for sequences of systems, the time-parallelization that augmented-system formulations allow, and practical challenges such as early stopping, solver stability, and software availability. Future directions highlighted include randomized, multi-precision, and ML-informed methods. The practical value lies in guiding the design of scalable, robust solvers for large-scale geophysical data assimilation, where model evaluations dominate the cost and rapid turnaround is essential. By connecting the VarDA formulations rigorously to their linear-algebra subproblems, the paper supports more targeted numerical methods that transfer across applications.
Abstract
Variational data assimilation is a technique for combining measured data with dynamical models. It is a key component of Earth system state estimation and is commonly used in weather and ocean forecasting. The approach involves a large-scale generalized nonlinear least-squares problem. Solving the resulting sequence of sparse linear subproblems requires the use of sophisticated numerical linear algebra methods. In practical applications, the computational demands severely limit the number of iterations of a Krylov subspace solver that can be performed and so high-quality preconditioners are vital. In this paper, we present a numerical linear algebra perspective on variational data assimilation and discuss contemporary solution methods for the challenges posed by large-scale geophysical applications. The principal contribution is a focused treatment of the underlying linear algebraic subproblems, accompanied by a concise and clear introduction to the essential concepts of variational data assimilation and an extensive bibliography.
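To make the role of preconditioning under a fixed iteration budget concrete, the following is a minimal toy sketch and not the paper's implementation. It assumes a single linearised subproblem with a linear observation operator H, a diagonal background covariance B = U U^T, and observation covariance R = I. All names (`cg`, `u`, `H`, `d`, `dx_ref`) and sizes are hypothetical choices for illustration. The sketch compares plain conjugate gradients on (B^{-1} + H^T H) dx = H^T d with the same solver applied after the change of variable dx = U dv, which is what is commonly called first-level preconditioning.

```python
import numpy as np

def cg(apply_A, b, max_iter, tol=1e-14):
    """Conjugate gradients from a zero initial guess, capped at max_iter steps."""
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rs = r @ r
    rs0 = rs
    for _ in range(max_iter):
        if rs <= (tol ** 2) * rs0:      # residual already negligible
            break
        Ap = apply_A(p)
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

rng = np.random.default_rng(0)
n, m = 50, 5                        # toy state and observation dimensions (assumed)

u = np.logspace(-1, 1, n)           # diagonal of U, so B = U U^T = diag(u**2)
H = rng.standard_normal((m, n))     # toy linear observation operator
d = rng.standard_normal(m)          # toy innovation vector; R = I is assumed

# Unpreconditioned system: (B^{-1} + H^T H) dx = H^T d
A = np.diag(1.0 / u**2) + H.T @ H
b = H.T @ d

# First-level preconditioned system: dx = U dv, (I + U^T H^T H U) dv = U^T H^T d
HU = H * u                          # same as H @ diag(u)
A_pc = np.eye(n) + HU.T @ HU
b_pc = HU.T @ d

dx_ref = np.linalg.solve(A, b)      # direct solve, feasible only at toy size

k = m + 1                           # deliberately small iteration budget
dx_plain = cg(lambda v: A @ v, b, k)
dx_pc = u * cg(lambda v: A_pc @ v, b_pc, k)   # map dv back to dx = U dv

err_plain = np.linalg.norm(dx_plain - dx_ref) / np.linalg.norm(dx_ref)
err_pc = np.linalg.norm(dx_pc - dx_ref) / np.linalg.norm(dx_ref)
eigs_pc = np.linalg.eigvalsh(A_pc)  # spectrum of the preconditioned Hessian

print("relative error, plain CG:         ", err_plain)
print("relative error, preconditioned CG:", err_pc)
print("eigenvalues away from 1:", int(np.sum(np.abs(eigs_pc - 1.0) > 1e-8)))
```

In this toy setting the preconditioned Hessian is the identity plus a matrix of rank at most m, so all its eigenvalues are at least 1 and at most m of them differ from 1. That clustering is why the small budget suffices after the change of variable. Operational systems never form these matrices explicitly; the dense arrays here are only for illustration.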
