Discovering Governing Equations of Geomagnetic Storm Dynamics with Symbolic Regression

Stefano Markidis, Jonah Ekelund, Luca Pennati, Andong Hu, Ivy Peng

TL;DR

The paper tackles forecasting geomagnetic storm evolution by deriving data-driven, interpretable $dDst/dt$ equations from OMNIweb solar wind data using symbolic regression with PySR. It presents a hierarchy of closed-form models, incorporating key drivers such as the convective electric field $E_y$ and dynamic pressure $P_{dyn}$, and benchmarks them against traditional empirical models (BMR and OBM). Across 2,000 initial conditions and multiple storm events, the symbolic-regression models generally outperform classical models in moderate to intense storms, with the best results capturing nonlinearities and threshold effects; extreme events require the most complex expressions. The approach delivers physically interpretable expressions, enabling insight into magnetospheric dynamics and offering a practical tool for space weather prediction with potential for real-time applications.

Abstract

Geomagnetic storms are large-scale disturbances of the Earth's magnetosphere driven by solar wind interactions, posing significant risks to space-based and ground-based infrastructure. The Disturbance Storm Time (Dst) index quantifies geomagnetic storm intensity by measuring global magnetic field variations. This study applies symbolic regression to derive data-driven equations describing the temporal evolution of the Dst index. We use historical data from the NASA OMNIweb database, including solar wind density, bulk velocity, convective electric field, dynamic pressure, and magnetic pressure. The PySR framework, an evolutionary algorithm-based symbolic regression library, is used to identify mathematical expressions linking dDst/dt to key solar wind parameters. The resulting models include a hierarchy of complexity levels and enable a comparison with well-established empirical models such as the Burton-McPherron-Russell and O'Brien-McPherron models. The best-performing symbolic regression models demonstrate superior accuracy in most cases, particularly during moderate geomagnetic storms, while maintaining physical interpretability. Performance evaluation on historical storm events includes the 2003 Halloween Storm, the 2015 St. Patrick's Day Storm, and a 2017 moderate storm. The results provide interpretable, closed-form expressions that capture nonlinear dependencies and thresholding effects in Dst evolution.
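
To make the abstract's idea of a closed-form dDst/dt model concrete, the sketch below shows one way such an expression could be stepped forward in time from an initial Dst value and scored with RMSE and MAE. It is a minimal illustration under stated assumptions, not the paper's code: the function names (`placeholder_rhs`, `integrate_dst`), the forward-Euler scheme, the hourly time step, the coefficients, and the synthetic drivers are all hypothetical. The right-hand side only mimics the general injection-minus-decay shape with a threshold on $E_y$; it is not the BMR model, the OBM model, or any expression reported in the paper.

```python
import math
from typing import Callable, List, Sequence

# A right-hand side maps (Dst [nT], Ey [mV/m], Pdyn [nPa]) to dDst/dt [nT/h].
RHS = Callable[[float, float, float], float]


def placeholder_rhs(dst: float, ey: float, pdyn: float) -> float:
    """Hypothetical injection-minus-decay form with made-up coefficients.

    pdyn is accepted so every candidate model shares one signature,
    but this placeholder does not use it.
    """
    injection = -5.0 * max(ey - 0.5, 0.0)  # threshold: no injection for weak Ey
    decay = -dst / 8.0                     # relaxation toward zero, 8 h time scale
    return injection + decay


def integrate_dst(rhs: RHS, dst0: float, ey: Sequence[float],
                  pdyn: Sequence[float], dt_hours: float = 1.0) -> List[float]:
    """Forward-Euler integration of dDst/dt starting from dst0."""
    traj = [dst0]
    for e, p in zip(ey, pdyn):
        traj.append(traj[-1] + dt_hours * rhs(traj[-1], e, p))
    return traj


def rmse(pred: Sequence[float], obs: Sequence[float]) -> float:
    """Root-mean-square error between two equally long series."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def mae(pred: Sequence[float], obs: Sequence[float]) -> float:
    """Mean absolute error between two equally long series."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)


# Synthetic hourly drivers (not OMNIweb data): quiet, 6 h of strong Ey, quiet.
ey = [0.0] * 3 + [5.0] * 6 + [0.0] * 12
pdyn = [2.0] * len(ey)

predicted = integrate_dst(placeholder_rhs, 0.0, ey, pdyn)

# A second made-up model with a slower decay, used only to exercise the metrics.
slower = integrate_dst(lambda d, e, p: -5.0 * max(e - 0.5, 0.0) - d / 12.0,
                       0.0, ey, pdyn)

print("min Dst (placeholder):", min(predicted))
print("RMSE between the two toy models:", rmse(predicted, slower))
print("MAE between the two toy models:", mae(predicted, slower))
```

Keeping the right-hand side as a plain callable means any candidate expression, whether empirical or produced by symbolic regression, could be swapped in and compared on the same drivers and the same initial conditions.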

Paper Structure

This paper contains 10 sections, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Typical Dst evolution during a geomagnetic storm.
  • Figure 2: Pairplot of the primary variables considered in this study: $d\text{Dst}/dt$, DST_prev, $P_{\rm dyn}$, $E_y$, and $P_B$. The diagonal subplots display histograms or density plots for each variable, while the off-diagonal subplots show pairwise relationships.
  • Figure 3: A diagram showing the methodology for discovering mathematical models that predict dDst/dt. The workflow includes data input, symbolic regression with varying hyperparameters, and final model selection based on evaluation metrics.
  • Figure 4: RMSE and MAE comparison with standard deviation for data-driven dDst/dt models of different complexities. The error bars indicate the standard deviation across the test dataset. The red bars show the five best-performing models (lowest RMSE and MAE).
  • Figure 5: Comparison of Dst predictions for the extreme Halloween Storm (October 29–30, 2003). The top panel shows the predicted vs. actual Dst, while the bottom panel presents RMSE and MAE errors.
  • ...and 2 more figures