Predicting Forecast Error for the HRRR Using LSTM Neural Networks: A Comparative Study Using New York and Oklahoma State Mesonets
David Aaron Evans, Kara J. Sulia, Nick P. Bassill, Chris D. Thorncroft, Jay C. Rothenberger, Lauriana C. Gaudet
TL;DR
This study develops LSTM-based encoders and decoders to predict HRRR forecast error at NYSM and OKSM mesonet sites, training on 2018–2023 data and testing on 2024 data. It leverages co-located HRRR forecasts and mesonet observations, topographic and land-use features, time-encoded inputs, and an outlier-focused loss to enhance detection of large errors. Across precipitation, wind, and temperature, the LSTM shows the strongest skill for precipitation, with domain-dependent improvements and distinctive diurnal and seasonal patterns tied to terrain and atmospheric dynamics; OKSM generally yields better predictive performance than NYSM, which is attributed to its more homogeneous terrain. These results demonstrate the potential for location-specific ML-based forecast-error predictions to support real-time, point-of-use uncertainty assessment in high-resolution NWP systems, and point to future work integrating higher-altitude information and further regional optimization. The outlier-focused loss is central to emphasising rare, impactful errors, and the approach can be extended to other NWP systems and mesoscale networks to aid forecasters and risk assessments.
Abstract
Long Short-Term Memory (LSTM) models are trained to predict forecast error for the High-Resolution Rapid Refresh (HRRR) model using the New York State Mesonet and Oklahoma State Mesonet near-surface weather observations as ground truth. Physical and dynamical mechanisms tied to LSTM performance are evaluated by comparing the New York domain to the Oklahoma domain. The contrasting geography and atmospheric dynamics of the two domains provide a compelling scientific foil. Evaluating them side by side highlights variations in LSTM prediction of forecast error that are closely linked to region-specific phenomena driven by both dynamics and geography. Measured by mean absolute error and percent improvement relative to HRRR, LSTMs predict precipitation error most accurately, followed by wind error and then temperature error. Precipitation error predictions exhibit an asymmetry, with overforecast precipitation detected more accurately than underforecast precipitation, while wind error predictions are consistent across overforecast and underforecast cases. Temperature error predictions are relatively accurate but smoother, with respect to variance, than the observed errors. This paper presents an overview of LSTM performance with the express intent of providing forecasters with real-time predictions of forecast error at the point of use within the New York State and Oklahoma State Mesonets. This research demonstrates the potential of LSTM-based machine learning models to provide actionable, location-specific predictions of forecast error for high-resolution operational numerical weather prediction (NWP) systems.
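The two evaluation metrics named above can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the sign convention (forecast minus observation, so positive values mean overforecast), the choice of a zero-error baseline to represent trusting HRRR as is, the helper names, and the toy values, which are invented and are not results from the study.

```python
def forecast_error(forecast, observed):
    """Signed error per time step.

    Assumed convention: forecast minus observation, so positive = overforecast.
    """
    return [f - o for f, o in zip(forecast, observed)]


def mean_absolute_error(predicted, actual):
    """Average absolute difference between two equal-length sequences."""
    diffs = [abs(p - a) for p, a in zip(predicted, actual)]
    return sum(diffs) / len(diffs)


def percent_improvement(baseline_mae, model_mae):
    """Percent reduction in MAE relative to a baseline (hypothetical definition)."""
    return 100.0 * (baseline_mae - model_mae) / baseline_mae


# Toy temperature values in degrees C, invented for illustration only.
hrrr = [10.0, 12.0, 15.0, 11.0]   # stand-in for HRRR forecasts at one site
obs = [9.0, 12.5, 13.0, 11.0]     # stand-in for mesonet observations
true_err = forecast_error(hrrr, obs)

# Stand-in for the LSTM's predicted forecast error at the same time steps.
pred_err = [0.5, -0.5, 1.5, 0.5]

# Assumed baseline: predict zero error everywhere, i.e. trust HRRR unchanged.
baseline_mae = mean_absolute_error([0.0] * len(true_err), true_err)
model_mae = mean_absolute_error(pred_err, true_err)
improvement = percent_improvement(baseline_mae, model_mae)

print(f"baseline MAE = {baseline_mae:.3f}, model MAE = {model_mae:.3f}, "
      f"improvement = {improvement:.1f}%")
```

Under this assumed definition, a positive percent improvement means the predicted error is closer to the true HRRR error than the zero-error baseline is; the study's own definition may differ.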
