An Explainable Deep-learning Model of Proton Auroras on Mars

Dattaraj B. Dhuri; Dimitra Atri; Ahmed AlHantoobi

TL;DR

The paper presents a data-driven artificial neural network to reproduce Mars proton aurora Ly-$\alpha$ emission profiles using MAVEN/IUVS limb scans and in-situ MAVEN/SWIA and MAG data across SW, MS, and TH regions. By combining multiple input groups and employing a loss function that includes MSE, SSIM, and EM-specific terms, the model accurately reproduces Ly-$\alpha$ intensities and peak shapes (Pearson $r$ around 0.94 for intensities) and identifies the key drivers of proton aurora enhancements via SHAP analysis. SHAP results confirm known dependencies on solar longitude $L_s$ and solar zenith angle, reveal the influential role of CO$_2$ atmosphere proxies and the penetrating proton flux near $\sim 1$ keV, and highlight data biases that limit extreme-event generalization. The approach demonstrates the value of interpretable ML in planetary space physics, offering a computationally efficient tool to simulate Mars–solar wind interactions and guiding future data collection and physics-informed modeling efforts.
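To make the SHAP attribution idea concrete, here is a minimal sketch of exact Shapley values computed by brute-force subset enumeration. It is not the SHAP estimator used in the paper, which is not specified on this page. The toy model, its three inputs (loosely named after solar zenith angle, solar longitude and solar wind speed) and the all-zero baseline are hypothetical, chosen only so the attributions can be checked by hand.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    A feature that is 'absent' from a coalition takes its baseline value.
    Cost grows as 2**n, so this is only practical for a handful of features.
    """
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                coalition = set(s)
                phi[i] += weight * (value(coalition | {i}) - value(coalition))
    return phi


def toy_model(z):
    # Hypothetical stand-in for a trained network: two linear terms plus
    # one interaction term between the first and third inputs.
    sza, ls, speed = z
    return 2.0 * sza + 1.0 * ls + 0.5 * sza * speed


x = [1.0, 2.0, 4.0]          # hypothetical sample
baseline = [0.0, 0.0, 0.0]   # hypothetical reference input
phi = shapley_values(toy_model, x, baseline)
```

The attributions sum to the difference between the model output at the sample and at the baseline, and the interaction term is split equally between the two features involved. Practical SHAP estimators approximate this quantity because exact enumeration is infeasible for large input sets.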

Abstract

Proton auroras are widely observed on the dayside of Mars, identified as a significant intensity enhancement in the hydrogen Lyman alpha (121.6 nm) emission between 110–150 km altitude. Solar wind protons penetrating as energetic neutral atoms into the Mars thermosphere are thought to be primarily responsible for these auroras. Recent observations of spatially localized (patchy) proton auroras suggest a possible direct deposition of protons into the Mars atmosphere during unstable solar wind conditions. Improving our understanding of proton auroras is therefore important for characterizing the solar wind interaction with the Mars atmosphere. Here, we develop a first purely data-driven model of proton auroras using Mars Atmosphere and Volatile EvolutioN (MAVEN) in-situ observations and limb scans of Ly-alpha emissions between 2014 and 2022. We train an artificial neural network (ANN) that reproduces individual Lyman alpha intensities and relative Lyman alpha peak intensity enhancements with Pearson correlations of 0.94 and 0.60, respectively, for the test data, along with a faithful reconstruction of the shape of the observed Lyman alpha emission altitude profiles. By performing a SHapley Additive exPlanations (SHAP) analysis, we find that solar zenith angle, solar longitude, CO2 atmosphere variability, solar wind speed and temperature are the most important features for the modeled Lyman alpha peak intensity enhancements. Additionally, we find that the modeled peak intensity enhancements are high for early local time hours, particularly near polar latitudes, as well as for weaker induced magnetic fields. Through SHAP analysis, we also identify the influence of biases in the training data and interdependencies between the measurements used for the modeling; improving on these aspects can significantly improve the performance and applicability of the ANN model.
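The two quantities scored in the abstract can be sketched as follows. This is an illustrative calculation, not the authors' pipeline. It assumes the relative peak enhancement is the maximum intensity within 110–150 km divided by the mean intensity within 160–200 km (the normalization range quoted in the Figure 5 caption). The function names and the sample profile are hypothetical.

```python
from math import sqrt


def peak_enhancement(altitudes_km, intensities_kR,
                     peak_range=(110.0, 150.0), ref_range=(160.0, 200.0)):
    """Peak intensity in peak_range divided by mean intensity in ref_range.

    Assumed definition, for illustration only.
    """
    pairs = list(zip(altitudes_km, intensities_kR))
    peak = max(i for a, i in pairs if peak_range[0] <= a <= peak_range[1])
    ref = [i for a, i in pairs if ref_range[0] <= a <= ref_range[1]]
    return peak / (sum(ref) / len(ref))


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical limb-scan profile (altitude in km, intensity in kR)
altitudes = [100.0, 120.0, 140.0, 160.0, 180.0, 200.0]
intensities = [4.0, 6.0, 9.0, 5.0, 4.0, 3.0]
enhancement = peak_enhancement(altitudes, intensities)

# Hypothetical observed vs. modeled enhancements for a few scans
observed = [1.0, 2.0, 3.0]
modeled = [2.0, 4.0, 6.0]
r = pearson_r(observed, modeled)
```

Because the Pearson coefficient is insensitive to a linear rescaling, a high value indicates that the model tracks the trend in the observations, not that the absolute values agree.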

Paper Structure (16 sections, 5 equations, 17 figures, 5 tables)

Introduction
Data
IUVS Limb-Scan Observations
MAVEN in-situ measurements of protons and magnetic field
Methods
Artificial neural network architecture
Training
Results
Accuracy of the modeled proton auroras
SHAP Values: Identifying important features
In-situ Measurements
CO2UVD Altitude Profiles
Proton Energy Spectra
Discussion
Data Distribution in training, validation and test sets
...and 1 more section

Figures (17)

Figure 1: Examples of MAVEN/IUVS limb scans showing the thermospheric altitude profiles of CO2UVD emission (a) and Ly-${\rm \alpha}$ emissions (b). The Ly-${\rm \alpha}$ emissions show both a proton aurora case (red) and a non-proton-aurora case, i.e., the background dayglow emission (blue). The proton aurora profile shows the characteristic enhancement around 110 -- 150 km altitudes. Note that the CO2UVD emission profiles between altitudes 130 -- 190 km are used in this study. The legend shows the orbit number for each profile. The distribution of the Ly-${\rm \alpha}$ enhancements within the data considered is shown in (c) (after Hughes2019), with darker red also indicating increasing intensity of proton aurora enhancements. The dashed line marks an intensity enhancement threshold, used only as a reference, for defining the proton auroras as per Hughes2019.
Figure 2: Example of MAVEN/SWIA measurements for a sample orbit showing the variation in the proton energy spectra (a) and the MAVEN altitude (b). The bow-shock (black) and magnetic pile-up boundaries (red) marked by the dashed lines are from TROTIGNON2006357. The region within the dashed yellow line shows observations below 250 km identified as the thermosphere region in our analysis. (c) shows the energy spectra of protons within the thermosphere region as per Halekas2015.
Figure 3: The Artificial Neural Network (ANN) Architecture: The ANN takes in SWIA in-situ measurements of proton properties and magnetic fields from upstream solar wind (SW), magnetosheath (MS) and thermosphere (TH) regions, as well as remote sensing geometry measurements of each IUVS limb scan. These input features are summarised in Table \ref{tab:features}. Fully connected (FC) or 1D Convolutional Neural Network (CNN) sub-networks process the individual set of input features and yield an abstract representation. These representations are further processed by layers of fully connected neurons (hidden layers) to obtain the observed Ly-${\rm \alpha}$ altitude profile of each IUVS limb scan as the output. The details of the FC and 1D-CNN sub-networks, hidden layers and the output layer are given in the main text.
Figure 4: Performance of the ANN: (a) Mean absolute error in the predicted intensity as a function of true intensity, binned in 10 equal-sized bins. The error bars indicate the 1${\rm \sigma}$ standard error. (b), (c) and (d) Heatmaps showing the population of predicted intensity samples binned in 2D by true and predicted intensity for the training, validation, and test sets, respectively. The training data is used to obtain the model parameters that minimize the loss function (equation \ref{eq:totloss}). The validation data is used to ensure that the model performance can generalize to new data and to obtain hyperparameters for the training. The test data is totally unseen by the model. Note that the bins lying closer to the diagonal (dashed line) indicate accurate predictions. The corresponding Pearson correlation values (r) are also noted. The observed intensities of Ly-${\rm \alpha}$ emission below $\sim$ 9 kR are accurately reproduced by the ANN model.
Figure 5: Summary of reconstructed Ly-${\rm \alpha}$ intensity altitude profiles for the training, validation and test data. The training data is used to obtain the model parameters that minimize the loss function (equation \ref{eq:totloss}). The validation data is used to ensure that the model performance can generalize to new data and to obtain hyperparameters for the training. The test data is totally unseen by the model. Each profile is a mean profile obtained from a population of profiles binned in percentile bins of the peak intensity enhancements. Darker color indicates increasing enhancement value. Each profile is normalized by the mean intensity between altitudes 160 -- 200 km (after Hughes2019). The characteristic shape of the observed proton aurora Ly-${\rm \alpha}$ intensity profiles is reproduced reasonably well by the ANN model, except for the cases of extreme intensity enhancements.
...and 12 more figures
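The error summary described in the Figure 4(a) caption can be sketched as below: the mean absolute error per bin of true intensity, with the standard error of the mean as the error bar. This sketch assumes "equal-sized" means equal-width bins over the range of true intensities; the caption does not say whether bins are equal in width or in population. The function name and sample values are hypothetical.

```python
from math import sqrt


def binned_mae(true_vals, pred_vals, n_bins=10):
    """Mean absolute error and its standard error in equal-width bins.

    Returns a list of (bin_center, mae, sem) tuples. mae and sem are None
    for empty bins; sem is None when a bin holds a single sample.
    """
    lo, hi = min(true_vals), max(true_vals)
    if hi == lo:
        raise ValueError("true values must span a non-zero range")
    width = (hi - lo) / n_bins

    bins = [[] for _ in range(n_bins)]
    for t, p in zip(true_vals, pred_vals):
        # The largest true value falls in the last bin rather than past it
        idx = min(int((t - lo) / width), n_bins - 1)
        bins[idx].append(abs(p - t))

    results = []
    for k, errs in enumerate(bins):
        center = lo + (k + 0.5) * width
        if not errs:
            results.append((center, None, None))
            continue
        n = len(errs)
        mae = sum(errs) / n
        sem = None
        if n > 1:
            var = sum((e - mae) ** 2 for e in errs) / (n - 1)  # sample variance
            sem = sqrt(var / n)
        results.append((center, mae, sem))
    return results


# Hypothetical true and predicted intensities (kR)
true_kR = [0.0, 1.0, 4.0, 5.0, 9.0, 10.0]
pred_kR = [1.0, 4.0, 6.0, 5.5, 9.5, 10.5]
summary = binned_mae(true_kR, pred_kR, n_bins=2)
```

With equal-width bins, sparsely populated high-intensity bins yield large or undefined error bars, which is one way the data imbalance noted for extreme enhancements would show up in such a plot.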
