Transformers for Stratified Spectropolarimetric Inversion: Proof of Concept
Ryan James Campbell, Mihalis Mathioudakis, Carlos Quintero Noda
TL;DR
This proof-of-concept study demonstrates a transformer model that maps full-Stokes spectra across 15 spectral lines to atmospheric parameters as a function of optical depth $\log(\tau)$. The method uses learned spectral positional embeddings and cross-attention to capture non-local spectral dependencies; it is trained on synthetic data from 3D MHD MANCHA simulations and benchmarked against an MLP baseline. Results show higher correlations and more regularized stratifications for magnetic parameters, robustness to Gaussian noise at levels typical of solar observations, and faster inference that enables large-scale analysis. Cross-attention analyses link the model's focus to physically diagnostic spectral regions, suggesting interpretable behavior and potential utility as an initial guess for inversions. Limitations include synthetic-only training, limited chromospheric realism, and the absence of instrumental effects.
Abstract
Solar spectropolarimetric inversion -- inferring atmospheric conditions from the Stokes vector -- is a key diagnostic tool for understanding solar magnetism, but traditional inversion methods are computationally expensive and sensitive to local minima. Advances in artificial intelligence (AI) offer faster solutions, but are often restricted to shallow models or a few spectral lines. We present a proof-of-concept study using a transformer machine learning (ML) model for multi-line, full-Stokes inversion, to infer stratified parameters from synthetic spectra produced from 3D magnetohydrodynamic simulations. We synthesise a large set of Stokes vectors using forward modelling across 15 spectral lines spanning the deep photosphere towards the chromosphere. The model maps full-Stokes input to temperature, magnetic field strength, inclination, azimuth (encoded as $\sin 2\phi$, $\cos 2\phi$), and line-of-sight velocity as a function of optical depth. The transformer incorporates an attention mechanism that allows the model to focus on the most informative regions of the spectrum for each inferred parameter, and uses positional embedding to encode wavelength and depth order. We benchmark it against a multilayer perceptron (MLP), test robustness to noise, and assess generalisation. The transformer outperforms the MLP, especially in the higher layers and for magnetic parameters, yielding higher correlations and more regularised stratifications. The model retains strong performance across a range of noise levels typical for real observations, with magnetic parameter inference degrading predictably while temperature and velocity remain stable. We explore attention maps, linking the transformer's learned behaviour to line-formation physics.
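The azimuth encoding mentioned in the abstract can be illustrated with a minimal sketch. The abstract states only that azimuth is encoded as $\sin 2\phi$, $\cos 2\phi$; the rationale given in the comments (linear polarisation depends on $2\phi$, so $\phi$ and $\phi + 180^\circ$ are indistinguishable) is standard polarimetry rather than something the abstract spells out. The function names `encode_azimuth` and `decode_azimuth` are hypothetical and are not taken from the authors' code.

```python
import math


def encode_azimuth(phi):
    """Map an azimuth phi (radians) to the pair (sin 2phi, cos 2phi)."""
    return math.sin(2.0 * phi), math.cos(2.0 * phi)


def decode_azimuth(sin2phi, cos2phi):
    """Recover phi in [0, pi) from an encoded (possibly unnormalised) pair."""
    return (0.5 * math.atan2(sin2phi, cos2phi)) % math.pi


# Assumption for illustration: an azimuth of 170 degrees.
phi = math.radians(170.0)

# phi and phi + pi give the same linear-polarisation signal,
# and the 2phi encoding maps them to the same pair of targets.
a = encode_azimuth(phi)
b = encode_azimuth(phi + math.pi)

# Decoding returns the azimuth folded into [0, 180) degrees.
recovered_deg = math.degrees(decode_azimuth(*a))
```

A regression target of this form is continuous across the 0°/180° wrap, so a prediction near 179° and a true value near 1° are treated as close rather than far apart.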
