Local wind speed forecasting at short time horizons based on Numerical Weather Prediction and observations from surrounding stations

Roberta Baggio, Killian Pujol, Florian Pantillon, Dominique Lambert, Jean-Baptiste Filippi, Jean-François Muzy

TL;DR

The paper addresses the challenge of accurate short-term wind speed forecasting in complex terrain by blending Numerical Weather Prediction outputs (ARPEGE and AROME) with nearby ground-station observations through a hybrid neural network. It introduces deterministic and probabilistic forecasting via a three-branch architecture and an M-Rice distribution to capture uncertainty and extreme events, achieving up to ~30% RMSE improvement over raw NWP baselines. The study demonstrates that the hybrid approach outperforms baselines across 278 stations and that probabilistic forecasts further enhance extreme-event prediction, with notable gains from fine-tuning at Corsican sites. Operational feasibility is highlighted through a low-latency inference pipeline, supporting real-time wind energy and safety applications, and the work points to future extensions including ensemble-NWP, hub-height extrapolation, and transformer-based spatiotemporal models.

Abstract

This study presents a hybrid neural network model for short-term (1-6 hours ahead) surface wind speed forecasting, combining Numerical Weather Prediction (NWP) with observational data from ground weather stations. It relies on the MeteoNet dataset, which includes data from the global (ARPEGE) and regional (AROME) NWP models of the French weather service and meteorological observations from ground stations in the French Mediterranean. The proposed neural network architecture integrates recent past station observations (over the last few hours) and AROME and ARPEGE predictions on a small subgrid around the target location. The model is designed to provide both deterministic and probabilistic forecasts, with the latter predicting the parameters of a suitable probability distribution that notably allows us to capture extreme wind events. Our results demonstrate that the hybrid model significantly outperforms baseline methods, including raw NWP predictions, persistence models, and linear regression, across all forecast horizons. For instance, the model reduces RMSE by up to 30% compared to AROME predictions. Probabilistic forecasting further enhances performance, particularly for extreme quantiles, by estimating conditional quantiles rather than relying solely on the conditional mean. Fine-tuning the model for specific stations, such as those on the Mediterranean island of Corsica, further improves forecasting accuracy. Our study highlights the importance of integrating multiple data sources and probabilistic approaches to improve short-term wind speed forecasting. It defines an effective approach, even in complex terrain such as Corsica, where localized wind variations are significant.
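The remark that extreme quantiles are better served by conditional quantiles than by the conditional mean can be illustrated with a small sketch. It uses the pinball (quantile) loss, a standard score for quantile forecasts; whether the paper evaluates quantiles with this exact score is an assumption, and the function names and wind speed values below are hypothetical and purely synthetic.

```python
import math


def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss at level q, with 0 < q < 1."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-forecasts are weighted by q, over-forecasts by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


def empirical_quantile(values, q):
    """Order statistic of rank ceil(q * n), a simple empirical quantile."""
    ordered = sorted(values)
    index = max(0, math.ceil(q * len(ordered)) - 1)
    return ordered[index]


# Synthetic, right-skewed wind speeds in m/s (made up for illustration only).
speeds = [2.0, 2.5, 3.0, 3.0, 3.5, 4.0, 4.5, 5.0, 9.0, 14.0]
q = 0.9

mean_forecast = sum(speeds) / len(speeds)
q90_forecast = empirical_quantile(speeds, q)

# Score two constant forecasts of the 90% quantile: the mean and the quantile.
loss_mean = pinball_loss(speeds, [mean_forecast] * len(speeds), q)
loss_q90 = pinball_loss(speeds, [q90_forecast] * len(speeds), q)

print(f"mean forecast {mean_forecast:.2f} m/s, pinball loss {loss_mean:.3f}")
print(f"q90 forecast  {q90_forecast:.2f} m/s, pinball loss {loss_q90:.3f}")
```

On this skewed sample the mean sits well below the upper tail, so using it as a stand-in for the 90% quantile is penalised more heavily than the quantile estimate itself.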

Paper Structure

This paper contains 30 sections, 27 equations, 15 figures, 5 tables.

Figures (15)

  • Figure 1: (a) Geographical extent of the MeteoNet Southeast database, with the locations of the 278 ground stations ($\bullet$). (b) Location and names of the 21 target ground stations in Corsica.
  • Figure 2: Schematic representation of the neural network architecture used to process station, AROME, and ARPEGE data. The output shape of each layer is indicated below the corresponding block. "F.C." denotes a fully connected layer, "Flat." stands for a flattening operation, "Conc." indicates a concatenation of features, and $LSTM^*$ refers to an LSTM layer that returns the full output sequence rather than only the final hidden state.
  • Figure 3: Comparison of the performance of the ${\mathcal{M}}$ model (symbols ($\blacksquare$)) against various baseline approaches described in Secs. \ref{sec:model_arch} and \ref{sec:baselines}. The Root Mean Square Error (RMSE), expressed in $m\,s^{-1}$ and defined in Eq. \ref{eq:rmse}, is computed over the test dataset and reported for each forecast horizon ranging from 1 to 6 hours.
  • Figure 4: Illustration of the forecasting performance of the ${\mathcal{M}}$ model (symbols ($\bullet$)) and of the raw AROME forecasts ($\blacksquare$) in Ajaccio and Bonifacio, for the prediction at horizon $h = 6$ hours of the wind speed at valid time $t_h = t + h = 1100$ UTC each day. Measured wind speed values are indicated by the black solid lines. Panels (a) and (c) display daily forecasts over the full 2016–2018 period, while panels (b) and (d) provide a zoomed view of a representative 4-month sub-period (15 January 2016 to 15 May 2016) to enhance visual clarity. Although the overall performance of the two models appears comparable in (a) and (c), the raw AROME forecasts exhibit a slight bias toward lower wind speeds. One can also observe strong seasonal effects in Ajaccio, where breeze regimes are much more important than in Bonifacio. The zoomed views in (b) and (d) show that, on a day-to-day basis, the ${\mathcal{M}}$ model frequently outperforms the raw AROME forecasts, which is consistent with its statistical superiority.
  • Figure 5: RMSE (in $m\,s^{-1}$) associated with the ${\mathcal{M}}$ prediction as a function of the station mean wind speed $\overline{V}^s$ (estimated over 3 years) for all sites, for horizons $h = 1$ hour (symbols ($\bullet$)) and $h = 6$ hours (symbols ($\bigstar$)).
  • ...and 10 more figures
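
The Figure 3 caption describes RMSE reported per forecast horizon from 1 to 6 hours and compared against baselines. A minimal sketch of that kind of evaluation is given below, using a persistence baseline (the forecast for time t + h is the value observed at time t). It assumes the standard RMSE definition, since the paper's own equation is not reproduced in this excerpt; the function names and the hourly series are hypothetical and synthetic.

```python
import math


def rmse(pred, obs):
    """Root mean square error between two equal-length sequences."""
    if len(pred) != len(obs) or not obs:
        raise ValueError("pred and obs must be non-empty and of equal length")
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def persistence_pairs(series, h):
    """Pair each persistence forecast series[t] with the observation at t + h."""
    return series[:-h], series[h:]


def rmse_by_horizon(series, horizons):
    """RMSE of the persistence baseline for each horizon h (in time steps)."""
    scores = {}
    for h in horizons:
        pred, obs = persistence_pairs(series, h)
        scores[h] = rmse(pred, obs)
    return scores


def relative_reduction(rmse_model, rmse_reference):
    """Fractional RMSE reduction of a model with respect to a reference."""
    return 1.0 - rmse_model / rmse_reference


# Synthetic hourly wind speeds in m/s (made up for illustration only).
hourly = [3.0, 3.5, 4.0, 6.0, 7.5, 7.0, 5.0, 4.0, 3.5, 3.0, 2.5, 3.0]
scores = rmse_by_horizon(hourly, range(1, 7))

for h, value in scores.items():
    print(f"h = {h} h: persistence RMSE = {value:.2f} m/s")

# Arithmetic behind an "up to 30%" statement, with hypothetical RMSE values:
# a model at 1.4 m/s against a reference at 2.0 m/s is a 30% reduction.
print(f"relative reduction: {relative_reduction(1.4, 2.0):.0%}")
```

The same `rmse_by_horizon` loop could score any other forecast source by replacing `persistence_pairs` with a function that returns that source's forecasts aligned with the observations.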