Weakly-Constrained 4D Var for Downscaling with Uncertainty using Data-Driven Surrogate Models

Philip Dinenis, Vishwas Rao, Mihai Anitescu

TL;DR

This work presents a framework that stabilizes fast data-driven downscaling with FourCastNet by embedding it in a weakly constrained 4DVar data assimilation scheme. It introduces a Gaussian model-error term, estimates a diagonal model-error covariance $\mathbf{Q}$, and uses L-BFGS with automatic differentiation to recover high-resolution trajectories, with posterior uncertainty provided by a Laplace approximation. The approach is tested on ERA5 data for Hurricane Michael, where it improves forecast accuracy and uncertainty quantification over both EnKF and unassimilated FourCastNet, recovers fine-scale features, and tracks extremes more reliably. The results suggest substantial potential for real-time, high-resolution weather forecasting and risk assessment, with avenues to extend to longer horizons, other surrogates, and regionally varying resolutions.

Abstract

Dynamic downscaling typically involves using numerical weather prediction (NWP) solvers to refine coarse data to higher spatial resolutions. Data-driven models such as FourCastNet have emerged as a promising alternative to traditional NWP models for forecasting. Once trained, these models can deliver forecasts in a few seconds, thousands of times faster than classical NWP models. However, as lead times, and therefore forecast windows, increase, these models become unstable and tend to diverge from reality. In this paper, we propose to use data assimilation approaches to stabilize them when they are used for downscaling tasks. Data assimilation combines information from three sources: an imperfect computational model based on partial differential equations (PDEs), noisy observations, and an uncertainty-reflecting prior. In this work, when carrying out dynamic downscaling, we replace the computationally expensive PDE-based NWP models with FourCastNet in a "weakly constrained 4DVar" framework that accounts for the implied model errors. We demonstrate the efficacy of this approach for a hurricane-tracking problem; moreover, the 4DVar framework naturally allows the expression and quantification of uncertainty. We demonstrate, using ERA5 data, that our approach performs better than the ensemble Kalman filter (EnKF) and the unstabilized FourCastNet model, in terms of both forecast accuracy and forecast uncertainty.
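To make the weakly constrained 4DVar idea concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It uses the textbook weak-constraint cost (background term, observation term, and a model-error term weighted by a diagonal $\mathbf{Q}$). The linear map `A` is a hypothetical stand-in for the FourCastNet surrogate, `H` is a hypothetical coarsening operator, and all sizes and variances are illustrative. The gradient is written by hand rather than obtained by automatic differentiation, and SciPy's `L-BFGS-B` stands in for the L-BFGS optimizer.

```python
import numpy as np
from scipy.optimize import minimize


def wc4dvar_cost_and_grad(flat_x, A, H, y, x_b, b_var, r_var, q_diag):
    """Weak-constraint 4D-Var cost and gradient for a linear toy model.

    The control variable is the whole trajectory x_0..x_K, not only x_0.
    """
    n_steps = y.shape[0]
    n = A.shape[0]
    x = flat_x.reshape(n_steps, n)
    grad = np.zeros_like(x)

    # Background (prior) term on the initial state.
    d_b = x[0] - x_b
    cost = 0.5 * np.sum(d_b**2) / b_var
    grad[0] += d_b / b_var

    # Observation term at every time step: H x_k - y_k.
    d_o = x @ H.T - y
    cost += 0.5 * np.sum(d_o**2) / r_var
    grad += (d_o @ H) / r_var

    # Model-error term with diagonal Q: eta_k = x_{k+1} - A x_k.
    eta = x[1:] - x[:-1] @ A.T
    w = eta / q_diag
    cost += 0.5 * np.sum(eta * w)
    grad[1:] += w
    grad[:-1] -= w @ A

    return cost, grad.ravel()


# Illustrative setup (all values below are assumptions).
rng = np.random.default_rng(0)
n, n_steps = 4, 6
c, s = np.cos(0.3), np.sin(0.3)
A = np.array([[c, -s, 0, 0], [s, c, 0, 0], [0, 0, c, -s], [0, 0, s, c]])
H = 0.5 * np.array([[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]])  # pairwise averaging

# Synthetic "truth" trajectory and coarse, noisy observations of it.
x_true = np.empty((n_steps, n))
x_true[0] = rng.normal(size=n)
for k in range(n_steps - 1):
    x_true[k + 1] = A @ x_true[k]
r_var, b_var = 0.01, 0.01
y = x_true @ H.T + np.sqrt(r_var) * rng.normal(size=(n_steps, H.shape[0]))
x_b = x_true[0] + np.sqrt(b_var) * rng.normal(size=n)
q_diag = np.full(n, 0.05)

# First guess: the background state repeated at every time step.
x_init = np.tile(x_b, (n_steps, 1))
args = (A, H, y, x_b, b_var, r_var, q_diag)
result = minimize(wc4dvar_cost_and_grad, x_init.ravel(), args=args,
                  jac=True, method="L-BFGS-B")
x_star = result.x.reshape(n_steps, n)

print("initial cost:", wc4dvar_cost_and_grad(x_init.ravel(), *args)[0])
print("final cost:  ", result.fun)
```

Because this toy problem is linear and Gaussian, the cost is quadratic and the inverse of its Hessian is exactly the posterior covariance; with a nonlinear surrogate, the same inverse Hessian evaluated at the minimizer is what a Laplace approximation would use.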

Paper Structure

This paper contains 28 sections, 40 equations, 7 figures, 1 table.

Figures (7)

  • Figure 1: Graphical model denoting conditional dependence of random variables used for estimating the posterior in Section \ref{sec:UQ}. Shaded circles represent observed variables, and unshaded circles represent unobserved variables whose posterior we seek to estimate.
  • Figure 2: Normalized $U_{10}$ zonal wind variable during Hurricane Michael at 06:00 UTC on October 9th, on a subregion from 70$^\circ$W to 110$^\circ$W and 10$^\circ$N to 50$^\circ$N. The left panel is the ground-truth ERA5 fine-scale field $\mathbf{x}_k$. The right panel is the coarse and noisy observation $\mathbf{y}_k$.
  • Figure 3: Comparison of RMSE (as defined by Eq. \ref{eqn:rmse}) for the WC4DVAR, EnKF, and unassimilated forecasts.
  • Figure 4: Plot of zonal wind (normalized) at 06:00 UTC, October 9th, 2018. Left to right are the observation ($\mathbf{y}$), the ground truth ($\mathbf{x}^{\rm true}$), the WC4DVAR result ($\mathbf{x}^*$), and the residual ($\mathbf{x}^*-\mathbf{x}^{\rm true}$). Each panel uses the same color scale and depicts the same sub-region from 70$^\circ$W to 110$^\circ$W and 10$^\circ$N to 50$^\circ$N.
  • Figure 5: Plot of zonal wind (normalized) at 06:00 UTC, October 9th, 2018. Left to right are the observation ($\mathbf{y}$), the EnKF result ($\mathbf{x}^{\rm EnKF}$), and the residual ($\mathbf{x}^{\rm EnKF}-\mathbf{x}^{\rm true}$). Each panel uses the same color scale and depicts the same sub-region from 70$^\circ$W to 110$^\circ$W and 10$^\circ$N to 50$^\circ$N.
  • ...and 2 more figures