A Generalist Model Including Evolved Star Mass and Age

Mengmeng Zhang; Yude Bu; Siqi Wang; Shanshan Li; Jiangchuan Zhang; Jingzhen Sun; Yuhang Zhang; Ke Wang; Jian Liu; Hongliang Yan; Zhenping Yi; Meng Liu; Xiaoming Kong

Abstract

Determining precise ages and masses for evolved giants is crucial for Galactic archaeology but is hampered by spectral degeneracies. Gaia's low-resolution XP spectra offer an opportunity to infer these parameters on a massive scale using data-driven methods. We extend a transformer-based astronomical foundation model to evolved stars, establishing a unified framework that simultaneously predicts atmospheric parameters ($T_{\mathrm{eff}}$, $\log g$, $[\mathrm{M}/\mathrm{H}]$) and evolutionary labels (mass, age) in a physically consistent way. Treating spectra as token sequences, we integrate mass and age into the model's vocabulary. The model is trained on Gaia XP spectra cross-matched with the APOGEE DR17 DistMass catalog. Its generative design allows flexible input handling, including spectral inpainting and parameter-to-spectrum generation. On an independent test set, the model achieves a prediction scatter of $\sigma \approx 0.114\,M_{\odot}$ for mass and $\sigma \approx 1.334$ Gyr for age. Beyond numerical accuracy, it reproduces the giant branch's mass-luminosity relation and disentangles interstellar extinction from intrinsic temperature variations without explicit physical priors. It also recovers missing spectral data and provides reliable uncertainty estimates. These results indicate that foundation models can internalize stellar physics from data, and that this physically aware, probabilistic framework offers a powerful tool for unraveling Milky Way history using large-scale spectroscopic surveys.
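The abstract quotes a prediction scatter $\sigma$ for mass and age but does not say how it is estimated. As a minimal sketch, assuming a robust estimator (1.4826 times the median absolute deviation of the residuals, which matches the standard deviation for Gaussian residuals), the bias and scatter on a test set could be computed as follows. The function name `robust_scatter` and the choice of estimator are illustrative assumptions, not taken from the paper.

```python
from statistics import median


def robust_scatter(predicted, reference):
    """Return (bias, scatter) of the residuals predicted - reference.

    bias is the median residual. scatter is 1.4826 * MAD, an assumed
    estimator that is less sensitive to outliers than the standard deviation.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    residuals = [p - r for p, r in zip(predicted, reference)]
    bias = median(residuals)
    # Median absolute deviation of the residuals around their median.
    mad = median(abs(x - bias) for x in residuals)
    return bias, 1.4826 * mad


# Toy values only; these are not results from the paper.
bias, scatter = robust_scatter([1, 2, 3, 4, 100], [0, 0, 0, 0, 0])
```

In the toy call, the single outlier (100) does not inflate the scatter, which is the reason such an estimator is often preferred when comparing predictions with catalog labels.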

Paper Structure (32 sections, 5 equations, 19 figures)

Introduction
Data preparation
Base Observational Data
Stellar mass and age Labels
Data preprocessing and vector construction
Sample selection and statistics
Methods
Transformer-based encoder-decoder architecture
Embedding space and parameter integration
Stochastic subsampling and dynamic context
Training and optimization
Inference and evaluation metrics
Results
Spectra to stellar parameters
Atmospheric parameters and robustness tests
...and 17 more sections

Figures (19)

Figure 1: Properties of the training data set constructed for the foundation model. Top left: The Kiel diagram ($T_{\text{eff}}$ vs. $\log g$) for sources with valid APOGEE labels, color-coded by $\mathrm{[M/H]}$. The sample is dominated by giants. Top right: Sky distribution of the training sources in Galactic coordinates $(l, b)$, color-coded by the interstellar extinction $E(B-V)$ from the Combined19 map. The distribution traces the APOGEE survey footprint. Bottom left: The fraction of stars for which the $n$-th Gaia XP coefficient is marked as 'relevant' by the Gaia pipeline. The vertical dashed lines indicate the median number of relevant coefficients for BP (blue) and RP (orange). Bottom right: Distributions of Gaia parallax (left sub-panel) and $G$-band magnitude (right sub-panel) for the training sample.
Figure 2: The joint distribution of stellar mass, age, and $[\mathrm{M/H}]$ for the training sample after quality cuts. Main panel: Scatter plot of stellar age versus mass, color-coded by APOGEE $[\mathrm{M/H}]$. The correlation between mass, age, and $[\mathrm{M/H}]$ reflects the expected Galactic chemical evolution trends. Top panel: The marginal kernel density estimate (KDE) of stellar mass. The distribution is strongly peaked around $\sim 1.1\,M_{\odot}$, corresponding to the dominant population of red clump (RC) stars. Right panel: The marginal KDE of stellar age, showing broad coverage of the Galactic star formation history from young ($<1\,\mathrm{Gyr}$) to old ($>10\,\mathrm{Gyr}$) populations.
Figure 3: Schematic architecture of the Transformer-based foundation model tailored for evolved giants. Left: Embedding Module. The model ingests a heterogeneous list of inputs, comprising: (1) Gaia XP spectroscopic coefficients; (2) stellar colors (e.g., $G_{BP}-G_{RP}$, $J-H$); (3) atmospheric parameters ($T_{\text{eff}}$, $\log g$, $[\mathrm{M/H}]$); and (4) evolutionary parameters (explicitly including mass and age). Values and their corresponding token names are mapped into a latent space via a non-linear embedding layer. Top Right: Encoder. It processes the sequence of embedded observations using multi-head self-attention to capture the dependencies between spectra, atmospheric parameters, and evolutionary labels, generating a context vector representing the star. Bottom Right: Decoder. It receives a specific "information request" token (e.g., querying "mass") and queries the encoder's context via cross-attention. The output yields a probabilistic prediction, consisting of a scalar mean value and its associated uncertainty.
Figure 4: Results for predicting stellar atmospheric parameters ($T_{\text{eff}}$, $\log g$, and $\text{[M/H]}$) from different combinations of XP spectra and photometry. (a) An expertly trained XGBoost baseline model using standard inputs. (b) Our Transformer-based model with the same inputs, showing improved precision. (c) Performance using only the first 5 BP/RP coefficients. (d) Performance using a random subset of 64 XP coefficients and colors. (e) A stress test using uninformative high-order coefficients, where the model correctly predicts large uncertainties. (f) An identity recovery test where ground truth labels are mixed into the inputs.
Figure 5: Results for predicting evolutionary parameters (mass and age). (a) The XGBoost baseline model performance. (b) Our Transformer-based model demonstrates superior performance, achieving $\sigma_{\text{mass}} = 0.114\,M_{\odot}$ and $\sigma_{\text{age}} = 1.334\,\mathrm{Gyr}$. Panels (c) through (f) show the model's robustness in mass and age inference under varying input conditions.
...and 14 more figures
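
The input handling described in the Figure 3 caption (a heterogeneous list of named values fed to the encoder, plus a request token for the decoder) can be illustrated with a small data-preparation sketch. It only builds the token sequence; it does not implement the network. The vocabulary names, `build_context`, and `make_request` are hypothetical and chosen for illustration, and the random subsampling is an assumption about how a variable-size context might be drawn.

```python
import math
import random

# Hypothetical vocabulary; the paper's actual token list is not given here.
VOCAB = ["bp_0", "bp_1", "rp_0", "rp_1", "bp_rp", "j_h",
         "teff", "logg", "m_h", "mass", "age"]
TOKEN_ID = {name: i for i, name in enumerate(VOCAB)}


def build_context(observations, max_tokens, rng):
    """Turn a {name: value} dict into a list of (token_id, value) pairs.

    Missing entries (None or NaN) are dropped, then at most max_tokens of the
    remaining entries are drawn at random without replacement.
    """
    valid = [
        (TOKEN_ID[name], float(value))
        for name, value in observations.items()
        if value is not None and not math.isnan(float(value))
    ]
    k = min(max_tokens, len(valid))
    return rng.sample(valid, k)


def make_request(name):
    """Return the token id of the quantity the decoder is asked to predict."""
    return TOKEN_ID[name]


# Toy star with one missing coefficient and no mass label.
star = {"bp_0": 0.8, "bp_1": float("nan"), "rp_0": 1.1,
        "bp_rp": 1.3, "teff": 4800.0, "mass": None}
context = build_context(star, max_tokens=3, rng=random.Random(0))
request = make_request("mass")
```

Because missing values are simply left out of the sequence, the same routine covers full spectra, partial spectra, and mixed spectrum-plus-label inputs, which is the kind of flexibility the captions for Figures 4 and 5 test.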
