Observation-driven correction of numerical weather prediction for marine winds
Matteo Peduto, Qidong Yang, Jonathan Giezendanner, Devis Tuia, Sherrie Wang
TL;DR
Open-ocean wind forecasts are hard to improve because marine observations are sparse and heterogeneous, which limits NWP accuracy. The authors propose an observation-informed correction framework: rather than predicting winds from scratch, a transformer ingests recent in-situ observations and adjusts GFS output. The model uses masking to handle irregular, time-varying observation sets, cross-attention to condition predictions on recent observation-forecast pairs, and cyclical time and coordinate-aware location embeddings to predict at arbitrary coordinates. Evaluated over the Atlantic against ICOADS observations, it reduces GFS 10-meter wind RMSE by 45% at 1 h and 13% at 48 h lead time, with the most persistent gains along coastlines and shipping lanes, where observations are most abundant. The approach offers a practical, low-latency post-processing tool that complements NWP for maritime safety, routing, and offshore operations, and it produces both basin-scale gridded and site-specific forecasts in a single forward pass.
Abstract
Accurate marine wind forecasts are essential for safe navigation, ship routing, and energy operations, yet they remain challenging because observations over the ocean are sparse, heterogeneous, and temporally variable. We reformulate wind forecasting as observation-informed correction of a global numerical weather prediction (NWP) model. Rather than forecasting winds directly, we learn local correction patterns by assimilating the latest in-situ observations to adjust the Global Forecast System (GFS) output. We propose a transformer-based deep learning architecture that (i) handles irregular and time-varying observation sets through masking and set-based attention mechanisms, (ii) conditions predictions on recent observation-forecast pairs via cross-attention, and (iii) employs cyclical time embeddings and coordinate-aware location representations to enable single-pass inference at arbitrary spatial coordinates. We evaluate our model over the Atlantic Ocean using observations from the International Comprehensive Ocean-Atmosphere Data Set (ICOADS) as reference. The model reduces GFS 10-meter wind RMSE at all lead times up to 48 hours, achieving 45% improvement at 1-hour lead time and 13% improvement at 48-hour lead time. Spatial analyses reveal the most persistent improvements along coastlines and shipping routes, where observations are most abundant. The tokenized architecture naturally accommodates heterogeneous observing platforms (ships, buoys, tide gauges, and coastal stations) and produces both site-specific predictions and basin-scale gridded products in a single forward pass. These results demonstrate a practical, low-latency post-processing approach that complements NWP by learning to correct systematic forecast errors.
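To make two of the ingredients named above concrete, the following is a minimal NumPy sketch of a cyclical time embedding and of single-head cross-attention with a validity mask over a padded observation set. It is an illustration under stated assumptions, not the authors' implementation: the function names, the token dimension, the random tokens, and the additive form of the final correction (GFS wind plus a learned residual) are all hypothetical choices made for this example.

```python
import numpy as np

def cyclical_time_embedding(hour_of_day, day_of_year):
    """Encode time as sin/cos pairs so that 23:00 and 00:00 map to nearby points."""
    h = 2 * np.pi * np.asarray(hour_of_day, dtype=float) / 24.0
    d = 2 * np.pi * np.asarray(day_of_year, dtype=float) / 365.25
    return np.stack([np.sin(h), np.cos(h), np.sin(d), np.cos(d)], axis=-1)

def masked_cross_attention(queries, keys, values, valid):
    """Single-head scaled dot-product cross-attention.

    queries: (n_targets, d) tokens for the locations to predict
    keys, values: (n_obs, d) tokens for observation-forecast pairs (padded)
    valid: (n_obs,) boolean mask, False for padded or missing observations
    Assumes at least one valid observation and finite values in padded slots.
    """
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    # Masked slots get -inf so they receive exactly zero attention weight.
    scores = np.where(valid[None, :], scores, -np.inf)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ values, weights

# Toy example: 5 observation slots (2 of them padding), 3 target points.
rng = np.random.default_rng(0)
n_obs, n_targets, d = 5, 3, 8
obs_tokens = rng.normal(size=(n_obs, d))
target_tokens = rng.normal(size=(n_targets, d))
valid = np.array([True, True, True, False, False])

context, weights = masked_cross_attention(target_tokens, obs_tokens, obs_tokens, valid)

# Hypothetical additive correction: GFS (u, v) wind plus a projected residual.
gfs_uv = rng.normal(size=(n_targets, 2))   # placeholder GFS winds at the targets
w_out = 0.1 * rng.normal(size=(d, 2))      # placeholder output projection
corrected_uv = gfs_uv + context @ w_out

time_emb = cyclical_time_embedding([0, 23, 12], [1, 1, 1])
```

Because the mask is applied before the softmax, the number of real observations can change from one forecast cycle to the next without changing the array shapes, which is one simple way to accommodate irregular and time-varying observation sets.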
