Investigating the Efficacy of Topologically Derived Time Series for Flare Forecasting. II. XGBoost Model
Thomas Williams, Christopher B. Prior, David MacTaggart, D. Shaun Bloomfield
TL;DR
This work examines solar flare forecasting using time-dependent topological magnetic parameters derived from the ARTop framework, focusing on current-carrying versus potential topology as predictive signals. An XGBoost classifier is trained on a rich feature set that includes delta-topology inputs, accumulated winding/helicity, velocity-weighted terms, lagged descriptors, kurtosis, and flare history, with a 24-hour forecast horizon. On a validation set, the model achieves a True Skill Statistic (TSS) of $0.804$ and high accuracy, while a fully independent holdout set yields a more modest TSS of $0.524$, highlighting challenges from limb projection effects and frequent C-class flares. SHAP analysis supports the physical interpretability of the model by identifying flare history and accumulated current-carrying winding/helicity as key predictors. The study also discusses practical steps to improve deployability, such as extrapolations for regions entering the disk and multi-model approaches.
Abstract
Solar flares are a primary driver of space weather, and forecasting their occurrence remains a significant challenge. This paper presents a novel flare prediction model based on topologically derived photospheric magnetic parameters. We employ the \texttt{ARTop} framework to compute the time-dependent input rates of magnetic winding and helicity across more than $10^5$ active region (AR) observations, decomposing them into current-carrying and potential components to reduce sensitivity to optical flow methods. An \texttt{XGBoost} machine learning model is trained on these topological time series, alongside engineered features including rolling statistics, kurtosis, and flare history, to predict the probability of $\geq$M1.0-class flares within the next 24 hours. The model demonstrates strong performance on a validation set, with a True Skill Statistic (TSS) of 0.804 for once-daily operational region forecasts. When applied to a fully independent holdout set, the operational forecast achieves a TSS of 0.524. A SHapley Additive exPlanations (SHAP) analysis supports the model's physical interpretability, identifying flare history and accumulated current-carrying winding and helicity as the most important features. The main challenges identified are false positives arising from ARs with frequent C-class flaring and systematic errors introduced by projection effects when ARs are near the limb. Excluding limb-affected data yields no improvement in the holdout set TSS (\TSSalert\ versus 0.524), owing to the reduced overall number of flares. However, our per-region analysis indicates that mitigating these projection effects is crucial for future operational deployment. This work establishes magnetic topology, particularly its current-carrying components, as a highly effective and physically meaningful set of predictors for solar flare forecasting.
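
For readers unfamiliar with the metric quoted above, the TSS is conventionally defined as the hit rate minus the false alarm rate, $\mathrm{TSS} = \mathrm{TP}/(\mathrm{TP}+\mathrm{FN}) - \mathrm{FP}/(\mathrm{FP}+\mathrm{TN})$, and ranges from $-1$ to $1$. The following is a minimal sketch of that standard definition, not the evaluation code used in this work. The function name `true_skill_statistic` and the toy labels are hypothetical and chosen only for illustration.

```python
def true_skill_statistic(y_true, y_pred):
    """Return TSS = TP/(TP+FN) - FP/(FP+TN) for binary labels (1 = flare)."""
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1  # flare forecast, flare observed
        elif truth and not pred:
            fn += 1  # flare missed
        elif not truth and pred:
            fp += 1  # false alarm
        else:
            tn += 1  # correct null forecast
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("TSS is undefined without both flaring and non-flaring samples")
    return tp / (tp + fn) - fp / (fp + tn)


# Toy example (illustrative values only, not data from this study):
# 4 flaring and 6 non-flaring samples, with one miss and one false alarm.
toy_truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_forecast = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
toy_tss = true_skill_statistic(toy_truth, toy_forecast)
print(round(toy_tss, 3))  # 3/4 - 1/6 -> 0.583
```

Because the two terms are normalised separately by the number of flaring and non-flaring samples, the score does not depend on the class imbalance, which is the usual reason it is preferred over accuracy for rare-event forecasts such as $\geq$M1.0-class flares.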
