Prediction of the SYM-H Index Using a Bayesian Deep Learning Method with Uncertainty Quantification

Yasser Abduallah, Khalid A. Alobaid, Jason T. L. Wang, Haimin Wang, Vania K. Jordanova, Vasyl Yurchyshyn, Huseyin Cavus, Ju Jing

TL;DR

This work tackles short-term forecasting of the SYM-H index from high-temporal-resolution solar wind and interplanetary magnetic field (IMF) data. It introduces SYMHnet, a Bayesian deep learning framework that combines a graph neural network, which models relationships among the input parameters, with a bidirectional LSTM, which captures temporal dynamics, and uses Monte Carlo dropout to quantify both aleatoric (data) and epistemic (model) uncertainty. On a dataset of 42 geomagnetic storms spanning 1998–2018, SYMHnet outperforms related methods such as a gradient boosting machine (GBM) in forecast skill: for example, on a large storm (SYM-H = -393 nT) it achieves FSS values of 0.343 (1 hour ahead, 5-minute data) and 0.553 (2 hours ahead), compared with 0.074 and 0.087 for GBM. The model works at both 1-minute and 5-minute resolution, and its uncertainty estimates suggest practical value for probabilistic space weather forecasting.
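
To make the two kinds of uncertainty concrete, here is a minimal sketch of how Monte Carlo sampling is commonly turned into aleatoric and epistemic estimates. It assumes (the summary does not say this) that each stochastic forward pass returns a predicted mean and a predicted variance per time point. The function name `decompose_uncertainty`, the array layout, and the toy numbers are hypothetical, and the paper's own formulation may differ.

```python
import numpy as np

def decompose_uncertainty(means, variances):
    """Split predictive uncertainty from K stochastic forward passes.

    means, variances: arrays of shape (K, T), one row per Monte Carlo
    dropout pass and one column per forecast time point (assumed layout).
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    prediction = means.mean(axis=0)      # averaged forecast
    epistemic = means.var(axis=0)        # spread of the means across passes (model)
    aleatoric = variances.mean(axis=0)   # average predicted noise variance (data)
    return prediction, epistemic, aleatoric

# Toy values only (not from the paper): 3 passes, 2 time points, SYM-H in nT.
means = [[-50.0, -80.0], [-54.0, -80.0], [-52.0, -80.0]]
variances = [[4.0, 9.0], [4.0, 9.0], [4.0, 9.0]]
prediction, epistemic, aleatoric = decompose_uncertainty(means, variances)
```

In this toy case the passes disagree only at the first time point, so the epistemic term is nonzero there and zero at the second, while the aleatoric term simply reflects the variances the passes report.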

Abstract

We propose a novel deep learning framework, named SYMHnet, which employs a graph neural network and a bidirectional long short-term memory network to cooperatively learn patterns from solar wind and interplanetary magnetic field parameters for short-term forecasts of the SYM-H index based on 1-minute and 5-minute resolution data. SYMHnet takes, as input, the time series of the parameters' values provided by NASA's Space Science Data Coordinated Archive and predicts, as output, the SYM-H index value at time point t + w hours for a given time point t where w is 1 or 2. By incorporating Bayesian inference into the learning framework, SYMHnet can quantify both aleatoric (data) uncertainty and epistemic (model) uncertainty when predicting future SYM-H indices. Experimental results show that SYMHnet works well at quiet time and storm time, for both 1-minute and 5-minute resolution data. The results also show that SYMHnet generally performs better than related machine learning methods. For example, SYMHnet achieves a forecast skill score (FSS) of 0.343 compared to the FSS of 0.074 of a recent gradient boosting machine (GBM) method when predicting SYM-H indices (1 hour in advance) in a large storm (SYM-H = -393 nT) using 5-minute resolution data. When predicting the SYM-H indices (2 hours in advance) in the large storm, SYMHnet achieves an FSS of 0.553 compared to the FSS of 0.087 of the GBM method. In addition, SYMHnet can provide results for both data and model uncertainty quantification, whereas the related methods cannot.
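
The forecasting target described above (the SYM-H value at t + w hours for an input at time t) amounts to shifting the label series by a fixed number of steps that depends on the data resolution. The sketch below illustrates that alignment only. It assumes one input record per time point for simplicity; the helper name `make_samples` and the toy arrays are hypothetical, and the paper's actual input windowing is not described in this summary.

```python
import numpy as np

def make_samples(features, symh, w_hours, resolution_min):
    """Pair the input record at time t with the SYM-H value at t + w hours."""
    features = np.asarray(features, dtype=float)
    symh = np.asarray(symh, dtype=float)
    # Number of records that span w hours at the given resolution.
    steps_ahead = (w_hours * 60) // resolution_min
    if steps_ahead >= len(symh):
        raise ValueError("series too short for this forecast horizon")
    inputs = features[:-steps_ahead]   # records that still have a future label
    labels = symh[steps_ahead:]        # SYM-H observed w hours later
    return inputs, labels, steps_ahead

# Toy data only: 30 records of 7 parameters at 5-minute resolution.
features = np.arange(30 * 7).reshape(30, 7)
symh = -np.arange(30.0)
inputs, labels, steps_ahead = make_samples(features, symh, w_hours=1, resolution_min=5)
```

With 5-minute data a 1-hour horizon corresponds to 12 steps, and with 1-minute data to 60 steps, so the same code covers both resolutions and both values of w.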

Paper Structure (19 sections, 3 equations, 8 figures, 12 tables)

Figures (8)

  • Figure 1: Illustration of the parameter graphs constructed at time points $t$, $t + 1$, and $t + 2$, respectively, at 1-minute resolution for predicting the SYM-H index 1 hour in advance. Each graph contains seven parameters: IMF magnitude ($B$), $B_{y}$ component, $B_{z}$ component, electric field (EF), proton density ($N_{p}$), flow pressure ($P_{dyn}$), and flow speed ($V$). The colored values in the graphs represent the parameters' values, which change over time, while the topologies of the graphs remain the same. The value in the SYM-H node in a graph is the label of the graph. The FCG symbol in a graph indicates that the graph is fully connected.
  • Figure 2: The SYMHnet framework: (a) the overall architecture of SYMHnet, (b) the architecture of its GNN component, and (c) the architecture of its BiLSTM component. The input parameter graph is for illustration; the actual graph in the implementation is a fully connected graph (FCG). $B$ = IMF magnitude, $B_{y}$ = $B_{y}$ component, $B_{z}$ = $B_{z}$ component, EF = electric field, $N_{p}$ = proton density, $P_{dyn}$ = flow pressure, and $V$ = flow speed.
  • Figure 3: Predictions for storm #36 (top) and storm #37 (bottom) made by the SYMHnet model based on 1-minute resolution data. The red line represents the observed SYM-H values, the yellow dashed line represents the model's predictions, and the blue line represents the prediction error. Both quiet time and storm time are shown in the figure.
  • Figure 4: Uncertainty quantification results produced by the SYMHnet model in storm #36 (top) and storm #37 (bottom) based on 1-minute resolution data. The red line represents the observed SYM-H values, the yellow dashed line represents the model's predictions, the light-blue region shows epistemic uncertainty (model uncertainty), and the light-gray region shows aleatoric uncertainty (data uncertainty). Both quiet time and storm time are shown in the figure.
  • Figure 5: Predictions for storm #36 (top) and storm #37 (bottom) made by the SYMHnet model based on 5-minute resolution data. The red line represents the observed SYM-H values, the yellow dashed line represents the model's predictions, and the blue line represents the prediction error. Only the peak storm time is shown in the figure.
  • ...and 3 more figures
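
The fully connected parameter graph described in the captions for Figures 1 and 2 can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the graph is stored as a plain dictionary, self-loops are omitted, the SYM-H label is kept separate from the seven parameter nodes, and the names `fully_connected_edges` and `build_graph` as well as the sample values are hypothetical.

```python
import itertools
import numpy as np

# The seven input parameters named in the figure captions.
PARAMETERS = ["B", "By", "Bz", "EF", "Np", "Pdyn", "V"]

def fully_connected_edges(num_nodes):
    """Directed edges in both directions for every node pair (no self-loops)."""
    return list(itertools.permutations(range(num_nodes), 2))

def build_graph(record, label):
    """Build one graph for one time point; the topology never changes."""
    node_values = np.array([record[name] for name in PARAMETERS], dtype=float)
    edges = fully_connected_edges(len(PARAMETERS))
    return {"nodes": node_values, "edges": edges, "label": float(label)}

# Made-up values for a single time point (not real measurements).
record = {"B": 5.1, "By": -1.2, "Bz": -3.4, "EF": 1.5,
          "Np": 6.0, "Pdyn": 2.2, "V": 440.0}
graph = build_graph(record, label=-30.0)
```

Only the node values and the label change from one time point to the next; the edge list is the same for every graph, which matches the caption's remark that the topologies remain the same.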