Elucidating the Grey Atmosphere: SHAP Value Analysis of a Random Forest Atmospheric Neutral Density Model
C. Bard, K. Murphy, A. Halford
TL;DR
This work addresses the interpretability gap in ML-based thermospheric density forecasting by applying TreeSHAP to the RANDM random forest model. The analysis demonstrates that solar irradiance, particularly the 43 nm FISM2 band, largely drives density changes, while geomagnetic activity (SYM-H) increasingly influences predictions during storms, with a practical threshold of SYM-H $< -60$ nT defining storm-time. The study also reveals day-side and dusk density enhancements, a dawn-dusk asymmetry, and informative local/global interaction patterns that connect model behavior to known space weather physics. Overall, the approach provides interpretable insights, highlights feature redundancies, and supports targeted model refinements for improved thermospheric density forecasts.
Abstract
We apply SHAP (SHapley Additive exPlanations) analysis using the TreeSHAP algorithm to a Random Forest model (RANDM) designed to predict thermospheric neutral density based on solar-terrestrial data. The analysis shows that RANDM identifies solar irradiance as a significant predictor of thermospheric density. Additionally, the model differentiates between magnetic local times, finding that dusk sectors have higher densities than dawn sectors, in line with prior research. When comparing storm and quiet-time conditions, we find these trends persist regardless of geomagnetic activity levels. The analysis further demonstrates that larger geomagnetic disturbances during storms, as parameterized by the SYM-H index, are associated with higher neutral densities. Notably, SYM-H begins to make the largest overall contribution to the density prediction among model inputs at a threshold of -60 nT. This suggests a quantitative definition in which "storm-time" begins at SYM-H $< -60$ nT. Overall, using TreeSHAP enhances our understanding of the factors influencing thermospheric density and demonstrates the value of explainable machine learning techniques in space weather research, enabling more interpretable models.
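To make the idea of a per-feature "contribution" concrete, the following is a minimal sketch of exact Shapley value attribution on a toy model. It is not TreeSHAP and not RANDM: it enumerates all feature coalitions by brute force, which is only feasible for a handful of features, whereas TreeSHAP obtains such attributions efficiently for tree ensembles. The function `toy_density`, its coefficients, the three feature names, and the `BACKGROUND` rows are all hypothetical and chosen purely for illustration, so the point at which SYM-H dominates in this toy says nothing about the -60 nT threshold reported above.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature order; these are illustrative stand-ins, not RANDM inputs.
FEATURES = ["irradiance", "sym_h", "mlt"]

def toy_density(features):
    """Made-up stand-in for a density model (arbitrary units and coefficients)."""
    irradiance, sym_h, mlt = features
    storm_term = 0.05 * max(0.0, -sym_h - 20.0)   # grows as SYM-H becomes more negative
    dusk_term = 0.3 if 15.0 <= mlt < 21.0 else 0.0  # simple dusk-sector enhancement
    return 2.0 * irradiance + storm_term + dusk_term

# Synthetic background rows used to "switch off" features outside a coalition.
BACKGROUND = [
    [1.0, -10.0, 6.0],
    [1.1, -30.0, 12.0],
    [0.9, -5.0, 18.0],
    [1.0, -15.0, 0.0],
]

def shapley_values(predict, x, background):
    """Exact Shapley values for one sample x by enumerating every coalition.

    Features not in a coalition take their values from each background row,
    and the prediction is averaged over those rows. Returns (phi, base_value),
    where base_value is the mean prediction over the background.
    """
    n = len(x)

    def value(coalition):
        total = 0.0
        for row in background:
            mixed = [x[i] if i in coalition else row[i] for i in range(n)]
            total += predict(mixed)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi, value(frozenset())

def dominant_feature(x):
    """Name of the feature with the largest absolute Shapley value for sample x."""
    phi, _ = shapley_values(toy_density, x, BACKGROUND)
    best = max(range(len(phi)), key=lambda i: abs(phi[i]))
    return FEATURES[best]
```

The attributions are additive: the Shapley values plus the base value sum to the model's prediction for that sample, which is what allows contributions from different inputs to be compared on the same scale. Calling `dominant_feature` on samples with progressively more negative `sym_h` shows, for this toy only, how one could look for the disturbance level at which a geomagnetic input overtakes the others.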
