Post-processing Probabilistic Forecasts of the Solar Wind by Data Mining Similar Scenarios

Daniel E. da Silva, Yash Parlikar, Shaela I. Jones, Charles N. Arge

Abstract

The solar wind speed at Earth is one of the most important parameters regarding the effects of space weather on society. Thus far, most approaches for predicting the solar wind speed produce a single-value time series without uncertainty, or utilize ensemble methods which require custom calibration development. In this study, a method is developed that produces calibrated probabilistic forecasts of the solar wind speed using skew normal distributions and a novel extension of analog ensembles. In our extension, the single-value predictions from a baseline model of the next $Δt$ days are used along with $Δwindow$ hours of recent observations and single-value predictions to create a forecasting scenario vector that is compared against a historical database for outcomes. The baseline model used is the combined Air Force Data Assimilative Photospheric Flux Transport-Wang Sheeley Arge (ADAPT-WSA) model and the WSA point parcel simulation, but the method is directly applicable to other deterministic models including components such as Enlil or the Heliospheric Upwind Extrapolation with time dependence model (HUXt). The approach works notably well on the benchmark of whether observations fall within the $p^{th}$ percentile $p\%$ of the time (for $p$ between 0 and 100). Falling back on the mean or median of the predicted distribution as a non-probabilistic prediction yields a direct improvement in root-mean-square error (RMSE) over the original WSA point parcel simulation, and is shown to beat $\approx$ 1 solar rotation recurrence for 1-5 day ahead forecasts.
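
The scenario-matching step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The function names (`build_scenario`, `nearest_analogs`), the Euclidean distance metric, the toy numbers, and the choice of `k` are all assumptions made for the example. The only element taken from the paper is the unnormalized weight $1/d^2$ quoted in the Figure 3 caption.

```python
import numpy as np

def build_scenario(recent_obs, recent_pred, future_pred):
    """Concatenate recent observations, recent baseline predictions, and the
    baseline's future predictions into one scenario vector (hypothetical layout)."""
    return np.concatenate([recent_obs, recent_pred, future_pred])

def nearest_analogs(scenario, history, outcomes, k=3, eps=1e-9):
    """Return the outcomes of the k closest historical scenarios together with
    normalized 1/d^2 weights. Euclidean distance is an assumption here."""
    d = np.linalg.norm(history - scenario, axis=1)  # distance to every stored scenario
    idx = np.argsort(d)[:k]                         # indices of the k nearest neighbors
    z = 1.0 / (d[idx] ** 2 + eps)                   # unnormalized weight z = 1/d^2
    return outcomes[idx], z / z.sum()

# Toy historical database: each row is a past scenario vector (km/s),
# and outcomes holds the speed that was actually observed afterwards.
history = np.array([
    [400.0, 410.0, 405.0, 415.0, 420.0],
    [600.0, 590.0, 610.0, 600.0, 580.0],
    [405.0, 400.0, 410.0, 410.0, 425.0],
    [300.0, 310.0, 305.0, 300.0, 310.0],
])
outcomes = np.array([430.0, 560.0, 440.0, 320.0])

scenario = build_scenario(
    recent_obs=np.array([402.0, 408.0]),
    recent_pred=np.array([406.0, 412.0]),
    future_pred=np.array([421.0]),
)
analog_outcomes, weights = nearest_analogs(scenario, history, outcomes, k=2)
weighted_mean = float(np.sum(weights * analog_outcomes))
print(analog_outcomes, weights, weighted_mean)
```

The weighted outcomes would then feed whatever distribution fit the method uses. The weighted mean here only shows how closer analogs dominate the result.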

Paper Structure (8 sections, 8 equations, 9 figures, 3 tables, 1 algorithm)

Figures (9)

  • Figure 1: Illustration of probabilistic solar wind prediction and adaptive uncertainty using the method of this article, for predictions at Earth using a single ADAPT map (realization 0) and 3-day-ahead predictions. Within this time interval is a stream interaction region predicted by ADAPT-WSA and observed by ACE around 2010-05-20. The bottom two panels show the probabilistic predictions at Time A (2010-05-21, 12 UT) and Time B (2010-05-27, 12 UT).
  • Figure 2: Modeling outputs from the WSA Solar Wind Model, displaying the magnetic topology (top panel) and the photospheric solar wind sources along the subsatellite track, together with the speeds assigned within the derived coronal holes (bottom panel). The bottom panel shows large coronal holes at the poles, with some smaller coronal holes at lower latitudes. The black lines in the bottom panel connect the Earth subsatellite point with the coronal hole from which the parcel originated.
  • Figure 3: Illustration of a representative reference “scenario” vector (left panel #1) and of similar scenarios successfully found (left panels #2 and #3) in other periods of the historical dataset (2010-2020). The plot on the right shows the increase in distance to the neighbor (black line) and the drop-off in weight (red line) as the neighbor index increases. The first neighbors are closest in distance to the reference scenario and are weighted more heavily accordingly (unnormalized weight $z=1/d^2$).
  • Figure 4: A graphical representation of skew normal distributions for varying values of $\alpha$ with fixed location $\xi= 500~\mathrm{km/s}$ and scale $\omega=100~\mathrm{km/s}$. The first two parameters of the skew normal distribution, $\xi$ and $\omega$, are conceptually analogous to the normal distribution mean ($\mu$) and standard deviation ($\sigma$) in terms of controlling its shape. The third parameter, $\alpha$, is a new parameter which controls skew and asymmetry. We note that though $\xi$ and $\omega$ have effects on the shape of the distribution similar to those of $\mu$ and $\sigma$, $\xi$ is not precisely the mean, $\omega$ is not precisely the standard deviation, and $\alpha$ does not follow the formal definition of statistical skew.
  • Figure 5: Comparison of Percentile Efficiency (the percentage of observations that fall within the predicted percentile range) for the model and a simple baseline. The x-axis shows the target percentile for the error bars at each timestep. The y-axis shows the rate at which observations fell within those predicted error bars. For a perfectly calibrated model, the line would lie exactly on the diagonal. Here, the method of this manuscript (left panel) aligns much more closely with the perfect case than the simple baseline does (right panel).
  • ...and 4 more figures
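
The skew normal parameters in the Figure 4 caption and the calibration check in the Figure 5 caption can be illustrated with a short sketch. It uses the standard skew normal density $f(x)=\frac{2}{\omega}\,\phi(z)\,\Phi(\alpha z)$ with $z=(x-\xi)/\omega$, and the standard mean $\xi+\omega\delta\sqrt{2/\pi}$ with $\delta=\alpha/\sqrt{1+\alpha^2}$. The helper names, the value $\alpha=4$, the sample sizes, and the reading of "percentile range" as a central interval are assumptions for illustration. The paper evaluates time-varying forecasts, whereas this toy check uses one fixed distribution.

```python
import math
import numpy as np

def skewnorm_pdf(x, xi, omega, alpha):
    """Skew normal density: (2/omega) * phi(z) * Phi(alpha*z), z = (x - xi)/omega."""
    z = (x - xi) / omega
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    Phi = 0.5 * (1.0 + math.erf(alpha * z / math.sqrt(2.0)))
    return 2.0 / omega * phi * Phi

def skewnorm_mean_std(xi, omega, alpha):
    """Closed-form mean and standard deviation of the skew normal."""
    delta = alpha / math.sqrt(1.0 + alpha * alpha)
    mean = xi + omega * delta * math.sqrt(2.0 / math.pi)
    std = omega * math.sqrt(1.0 - 2.0 * delta * delta / math.pi)
    return mean, std

def skewnorm_sample(rng, xi, omega, alpha, size):
    """Draw samples via the representation delta*|U0| + sqrt(1 - delta^2)*U1."""
    delta = alpha / math.sqrt(1.0 + alpha * alpha)
    u0 = np.abs(rng.standard_normal(size))
    u1 = rng.standard_normal(size)
    return xi + omega * (delta * u0 + math.sqrt(1.0 - delta * delta) * u1)

def coverage(observations, forecast_samples, p):
    """Fraction of observations inside the central p% interval of the forecast
    (the central-interval reading is an assumption for this sketch)."""
    tail = (100.0 - p) / 2.0
    lo, hi = np.percentile(forecast_samples, [tail, 100.0 - tail])
    return float(np.mean((observations >= lo) & (observations <= hi)))

rng = np.random.default_rng(0)
xi, omega, alpha = 500.0, 100.0, 4.0  # location and scale from the Figure 4 caption; alpha is illustrative
mean, std = skewnorm_mean_std(xi, omega, alpha)

# Toy calibration check: observations drawn from the same distribution as the
# forecast should land inside the central p% interval about p% of the time.
forecast = skewnorm_sample(rng, xi, omega, alpha, 20000)
observed = skewnorm_sample(rng, xi, omega, alpha, 20000)
rates = {p: coverage(observed, forecast, p) for p in (25, 50, 75, 90)}
print(mean, std, rates)
```

With a positive $\alpha$ the mean sits above $\xi$ and the standard deviation falls below $\omega$, which is the distinction the Figure 4 caption draws. Plotting `rates` against the target percentiles gives a curve close to the diagonal that the Figure 5 caption describes for the perfect case.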