SPT: Spectral Transformer for Red Giant Stars Age and Mass Estimation
Mengmeng Zhang, Fan Wu, Yude Bu, Shanshan Li, Zhenping Yi, Meng Liu, Xiaoming Kong
TL;DR
We address the problem of estimating red giant ages and masses from spectroscopic data without relying on isochrone fitting, which suffers from degeneracy, or on long-term asteroseismic observations. Our approach, the Spectral Transformer (SPT), uses a Multi-head Hadamard Self-Attention backbone with a Mahalanobis distance loss and Monte Carlo dropout to predict age and mass directly from spectra; trained and tested on 3,880 LAMOST DR9 red giant spectra with asteroseismic labels, it achieves average percentage errors of $\Delta_P=17.64\%$ for age and $\Delta_P=6.61\%$ for mass, outperforming traditional machine learning baselines. The model provides an uncertainty for each prediction, and its results are consistent with asteroseismology and isochrone benchmarks, including open clusters; this enables more robust Galactic archaeology studies. Future work will leverage CSST and LSST datasets to further improve accuracy and applicability.
Abstract
The ages and masses of red giants are essential for understanding the structure and evolution of the Milky Way. Traditional isochrone methods for estimating them are inherently limited because isochrones overlap in the Hertzsprung-Russell diagram, while asteroseismology, though more precise, requires high-precision, long-term observations. In response to these challenges, we developed a novel framework, the Spectral Transformer (SPT), to predict from their spectra red giant ages and masses that are consistent with asteroseismic estimates. A key component of SPT, the Multi-head Hadamard Self-Attention mechanism, is designed specifically for spectra and can capture complex relationships across different wavelengths. Further, we introduced a Mahalanobis distance-based loss function to address scale imbalance and interaction mode loss, and incorporated Monte Carlo dropout to quantify prediction uncertainty. Trained and tested on 3,880 red giant spectra from LAMOST, the SPT achieved average percentage errors of 17.64% for age and 6.61% for mass, and provided an uncertainty for each prediction. The results significantly outperform those of traditional machine learning algorithms and show a high level of consistency with asteroseismology and isochrone fitting. In the future, our work will leverage datasets from the Chinese Space Station Telescope and the Large Synoptic Survey Telescope to improve the precision of the model and broaden its applicability in astronomy and astrophysics.
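As a rough illustration of three ideas named in the abstract, the Python sketch below shows a Mahalanobis-distance loss over (age, mass) pairs, an average percentage error, and a Monte Carlo dropout summary. It is not the authors' implementation. The squared form of the distance, the covariance estimated from the training labels, the reading of average percentage error as mean absolute relative error, and the toy model built by `make_toy_forward` are all assumptions made for illustration.

```python
import random
from statistics import mean, stdev


def covariance_2d(labels):
    """Sample covariance of (age, mass) pairs as (var_age, cov, var_mass)."""
    n = len(labels)
    mean_age = mean(a for a, _ in labels)
    mean_mass = mean(m for _, m in labels)
    var_age = sum((a - mean_age) ** 2 for a, _ in labels) / (n - 1)
    var_mass = sum((m - mean_mass) ** 2 for _, m in labels) / (n - 1)
    cov = sum((a - mean_age) * (m - mean_mass) for a, m in labels) / (n - 1)
    return var_age, cov, var_mass


def mahalanobis_loss(preds, labels, cov):
    """Mean squared Mahalanobis distance between predictions and labels.

    Assumption: the residual (pred - label) is weighted by the inverse of a
    fixed 2x2 label covariance, so age and mass errors are put on a common
    scale and their correlation is taken into account.
    """
    var_age, c, var_mass = cov
    det = var_age * var_mass - c * c
    # Closed-form inverse of a symmetric 2x2 matrix.
    inv_aa, inv_am, inv_mm = var_mass / det, -c / det, var_age / det
    total = 0.0
    for (pred_age, pred_mass), (true_age, true_mass) in zip(preds, labels):
        d_age, d_mass = pred_age - true_age, pred_mass - true_mass
        total += (inv_aa * d_age * d_age
                  + 2.0 * inv_am * d_age * d_mass
                  + inv_mm * d_mass * d_mass)
    return total / len(preds)


def mean_percentage_error(preds, targets):
    """Assumed metric: mean absolute relative error, in percent."""
    return 100.0 * mean(abs(p - t) / abs(t) for p, t in zip(preds, targets))


def make_toy_forward(weights, p_drop, rng):
    """Hypothetical one-layer model with inverted dropout on its inputs."""
    def forward(x):
        out = 0.0
        for w, xi in zip(weights, x):
            if rng.random() >= p_drop:  # keep this input with prob. 1 - p_drop
                out += w * xi / (1.0 - p_drop)
        return out
    return forward


def mc_dropout_predict(forward, x, n_passes=100):
    """Repeat stochastic forward passes; return (mean, std) of the outputs."""
    samples = [forward(x) for _ in range(n_passes)]
    return mean(samples), stdev(samples)
```

One property worth noting: because the residual is weighted by the inverse covariance, rescaling one target (for example, expressing age in Myr instead of Gyr) leaves the loss unchanged, which is one way such a loss can counter scale imbalance between targets. In `mc_dropout_predict`, the spread of the repeated stochastic passes serves as the per-prediction uncertainty.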
