Predicting Forecast Error for the HRRR Using LSTM Neural Networks: A Comparative Study Using New York and Oklahoma State Mesonets

David Aaron Evans, Kara J. Sulia, Nick P. Bassill, Chris D. Thorncroft, Jay C. Rothenberger, Lauriana C. Gaudet

TL;DR

This study develops LSTM encoder-decoder models to predict HRRR forecast error at New York State Mesonet (NYSM) and Oklahoma State Mesonet (OKSM) sites, training on 2018–2023 data and testing on 2024. It leverages HRRR forecasts co-located with mesonet observations, topographic and land-use features, time-encoded inputs, and an outlier-focused loss to enhance detection of large errors. Across precipitation, wind, and temperature, the LSTM shows its strongest skill for precipitation, with domain-dependent improvements and distinctive diurnal and seasonal patterns tied to terrain and atmospheric dynamics; OKSM generally yields better predictive performance than NYSM due to more homogeneous terrain. These results demonstrate the potential for location-specific, ML-based forecast-error predictions to support real-time, point-of-use uncertainty assessment in high-resolution NWP systems, and point to future work integrating higher-altitude information and further regional optimization. The outlier-focused loss is central to emphasizing rare, impactful errors, and the approach can be extended to other NWP systems and mesoscale networks to aid forecasters and risk assessments.
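The summary above mentions time-encoded inputs without saying how the encoding is done. A minimal sketch of one common choice, sine/cosine encoding of hour of day and day of year, is given below; the encoding scheme, the periods, and the function names `encode_cyclical` and `time_features` are assumptions for illustration, not the authors' confirmed method.

```python
import math
from datetime import datetime


def encode_cyclical(value, period):
    """Map a periodic quantity onto the unit circle as (sin, cos)."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def time_features(ts):
    """Return (hour_sin, hour_cos, day_sin, day_cos) for a timestamp.

    Assumed encoding: hour of day with a 24 h period and day of year
    with a 365.25 d period. The source does not specify these choices.
    """
    hour = ts.hour + ts.minute / 60.0
    day_of_year = ts.timetuple().tm_yday - 1  # 0-based day index
    return (*encode_cyclical(hour, 24.0), *encode_cyclical(day_of_year, 365.25))


if __name__ == "__main__":
    print(time_features(datetime(2024, 7, 15, 18, 0)))
```

With this kind of encoding, 23:00 and 00:00 land close together in feature space, which a raw hour value (23 versus 0) would not achieve.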

Abstract

Long Short-Term Memory (LSTM) models are trained to predict forecast error for the High-Resolution Rapid Refresh (HRRR) model using the New York State Mesonet and Oklahoma State Mesonet near-surface weather observations as ground truth. Physical and dynamical mechanisms tied to LSTM performance are evaluated by comparing the New York domain to the Oklahoma domain. The contrasting geography and atmospheric dynamics of the two domains provide a compelling scientific foil. Evaluating them side by side highlights variations in LSTM prediction of forecast error that are closely linked to region-specific phenomena driven by both dynamics and geography. Using mean absolute error (MAE) and percent improvement relative to HRRR, LSTMs predict precipitation error most accurately, followed by wind error and then temperature error. Precipitation errors exhibit an asymmetry, with overforecast precipitation detected more accurately than underforecast precipitation, while wind error predictions are consistent across overforecast and underforecast cases. Temperature error predictions are relatively accurate but have lower variance than the observed errors. This paper presents an overview of LSTM performance with the express intent of providing forecasters with real-time predictions of forecast error at the point of use within the New York State and Oklahoma State Mesonets. This research demonstrates the potential of LSTM-based machine learning models to provide actionable, location-specific predictions of forecast error for high-resolution operational numerical weather prediction (NWP) systems.
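The two evaluation metrics named in the abstract can be illustrated with a short sketch. The abstract does not define them precisely, so the following assumes that forecast error is forecast minus observation (positive means overforecast) and that percent improvement compares the MAE of the HRRR forecast after subtracting the predicted error against the MAE of the raw HRRR forecast. The function names and sample numbers are hypothetical.

```python
from statistics import fmean


def forecast_error(forecast, observed):
    """Signed error per time step (assumed convention: forecast - observation)."""
    return [f - o for f, o in zip(forecast, observed)]


def mae(predicted, target):
    """Mean absolute error between two equal-length sequences."""
    return fmean([abs(p - t) for p, t in zip(predicted, target)])


def percent_improvement(mae_model, mae_baseline):
    """Percent reduction in MAE relative to a baseline (assumed definition)."""
    return 100.0 * (mae_baseline - mae_model) / mae_baseline


# Hypothetical values for one station (arbitrary units).
hrrr = [2.0, 5.0, 1.0, 0.0]            # raw HRRR forecast
obs = [1.0, 3.0, 1.5, 0.5]             # mesonet observation (ground truth)
lstm_error = [0.5, 1.5, 0.0, -0.5]     # error predicted by the model

true_error = forecast_error(hrrr, obs)              # [1.0, 2.0, -0.5, -0.5]
corrected = [f - e for f, e in zip(hrrr, lstm_error)]

mae_raw = mae(hrrr, obs)                # 1.0
mae_corrected = mae(corrected, obs)     # 0.375
print(percent_improvement(mae_corrected, mae_raw))  # 62.5
```

Under these assumptions, a positive percent improvement means that subtracting the predicted error brings the forecast closer to the observation than the raw HRRR value.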

Paper Structure

This paper contains 36 sections, 2 equations, 29 figures, 1 table.

Figures (29)

  • Figure 1: The diagram illustrates the persistence method applied to an LSTM for HRRR forecast error prediction, using the NYSM, and analogously for the OKSM.
  • Figure 2: The diagram illustrates a high-level representation of the LSTM encoder-decoder workflow.
  • Figure 3: Confusion matrix summarizing the precision of LSTM predictions for precipitation across the entire NYSM network and all forecast hours. Rows indicate the true condition, and columns indicate the LSTM’s prediction. "More (less) precipitation" means that more (less) precipitation occurred than was forecast by the HRRR.
  • Figure 4: Scatterplot of the precipitation error across the NYSM network and all forecast hours, with the x-axis representing the true target error and the y-axis showing the corresponding LSTM-predicted error. The red diagonal line indicates the 1:1 line, where perfect predictions would lie.
  • Figure 5: NYSM MAE grouped by NCEI climate division (NCEI 2015). Each point represents the LSTM performance (MAE) for one NYSM station, averaged over all forecast lead times. Point size is proportional to the MAE, so larger points indicate higher MAE.
  • ...and 24 more figures