Towards Galactic Archaeology with Inferred Ages of Giant Stars From Gaia Spectra

Aisha S. Almannaei, Daisuke Kawata, Ioana Ciuca, Connor Fallows, Jason L. Sanders, George Seabroke, Andrea Miglio

TL;DR

This work establishes the feasibility of inferring stellar ages for giant stars directly from Gaia spectroscopy. It presents two SIDRA implementations: SIDRA-RVS, using Gaia RVS fluxes, and SIDRA-XP, using XP-derived stellar parameters, both trained on APOGEE/BINGO ages. SIDRA-XP achieves higher precision (~0.064 dex residuals at 10 Gyr) than SIDRA-RVS (~0.12 dex), enabling the authors to map the Galactic disc's chronology and chemistry across 2.2 million giants, recovering known structures such as Gaia-Sausage-Enceladus and hints of a gas-rich interaction around Sagittarius' first infall. The results demonstrate that Gaia spectra, when coupled with machine-learning inference trained on robust age calibrators, can yield valuable individual ages for giants and substantially enhance our understanding of the Milky Way's formation and evolution.

Abstract

In the era of Gaia, the accurate determination of stellar ages is transforming Galactic archaeology. We demonstrate the feasibility of inferring stellar ages from Gaia's RVS spectra and the BP/RP (XP) spectrophotometric data, specifically for red giant branch and high-mass red clump stars. We successfully train two machine learning models, dubbed SIDRA (Stellar age Inference Derived from Gaia spectRA), to predict stellar age. The SIDRA-RVS model uses the RVS spectra, and SIDRA-XP uses the stellar parameters obtained from the XP spectra. Both models use BINGO, an APOGEE-derived stellar age, as the training data. For stars around $\tau_\mathrm{BINGO}=10$ Gyr, SIDRA-RVS estimates ages with a standard deviation of residuals of $\sim 0.12$ dex in the unseen test dataset, while SIDRA-XP achieves higher precision, with residuals of $\sim 0.064$ dex. Since SIDRA-XP outperforms SIDRA-RVS, we apply SIDRA-XP to analyse the ages of 2,218,154 stars. This allows us to map the chronological and chemical properties of Galactic disc stars, reproducing known distinct features such as the Gaia-Sausage-Enceladus merger and a potential gas-rich interaction event linked to the first infall of the Sagittarius dwarf galaxy. This study demonstrates that machine learning techniques applied to Gaia's spectra can provide valuable individual age information, particularly for giant stars, thereby enhancing our understanding of the Milky Way's formation and evolution.
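
As a reading aid (not taken from the paper), the quoted scatter in dex can be converted into a multiplicative age uncertainty. This assumes the residual is defined as $\log_{10}(\tau_\mathrm{SIDRA}/\tau_\mathrm{BINGO})$, as in the figure captions, and treats the standard deviation $\sigma$ as a symmetric $1\sigma$ interval in log space:

$$
\frac{\tau_\mathrm{SIDRA}}{\tau_\mathrm{BINGO}} \in \left[10^{-\sigma},\, 10^{+\sigma}\right], \qquad
10^{0.064} \approx 1.16, \qquad 10^{0.12} \approx 1.32 .
$$

Under that assumption, for a star with $\tau_\mathrm{BINGO}=10$ Gyr the $1\sigma$ range is roughly $8.6$ to $11.6$ Gyr for SIDRA-XP and roughly $7.6$ to $13.2$ Gyr for SIDRA-RVS.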

Paper Structure

This paper contains 13 sections, 2 equations, 14 figures.

Figures (14)

  • Figure 1: The Kiel diagram of our cross-matched dataset between the Ciuca2024 APOGEE data and Gaia DR3 RVS spectra data (left panel) and the FallowsSanders2024 XP stellar parameter data (right panel). The stellar parameters for the RVS data are derived from the APOGEE dataset, while those for the XP data come from FallowsSanders2024.
  • Figure 2: SIDRA-RVS age predictions, $\log_{10}(\tau_\mathrm{SIDRA}\,[\mathrm{Gyr}])$, versus the target BINGO age estimation in $\log_{10}(\tau_\mathrm{BINGO}\,[\mathrm{Gyr}])$ for the training set colour-coded by [M/H] (left panel) and $[\alpha/\mathrm{M}]$ (right panel) obtained from APOGEE. The upper panels show the predictions versus the target and the black line indicates the identity line. The lower panels represent the residuals between the SIDRA-RVS's $\log_{10}(\tau_\mathrm{SIDRA}\,[\mathrm{Gyr}])$ prediction and BINGO's true $\log_{10}(\tau_\mathrm{BINGO}\,[\mathrm{Gyr}])$, denoted as $\log_{10}(\tau_\mathrm{SIDRA}/\tau_\mathrm{BINGO})$. The black filled circles and vertical error bars indicate the mean and the standard deviation of the residuals at the different $\log_{10}(\tau_\mathrm{BINGO}\,[\mathrm{Gyr}])$ bins, respectively.
  • Figure 3: SIDRA-RVS age predictions, $\log_{10}(\tau_\mathrm{SIDRA}\,[\mathrm{Gyr}])$, versus the target BINGO age estimation in $\log_{10}(\tau_\mathrm{BINGO}\,[\mathrm{Gyr}])$ for the testing set colour-coded by [M/H] (left panel) and $[\alpha/\mathrm{M}]$ (right panel) obtained from APOGEE. The upper panels show the predictions versus the target and the black line indicates the identity line. The lower panels represent the residuals between the SIDRA-RVS's $\log_{10}(\tau_\mathrm{SIDRA}\,[\mathrm{Gyr}])$ prediction and BINGO's true $\log_{10}(\tau_\mathrm{BINGO}\,[\mathrm{Gyr}])$, denoted as $\log_{10}(\tau_\mathrm{SIDRA}/\tau_\mathrm{BINGO})$. The black filled circles and vertical error bars indicate the mean and the standard deviation of the residuals at the different $\log_{10}(\tau_\mathrm{BINGO}\,[\mathrm{Gyr}])$ bins, respectively.
  • Figure 4: SHAP bee-swarm plot. Each row represents input features at the indicated wavelengths, arranged by significance from top to bottom. Within each row, every point represents a star in the testing dataset, colour-coded by its normalised feature value. The placement of each point illustrates the extent and direction of each feature's influence on its output label, the stellar age, $\log_{10}(\tau_\mathrm{SIDRA}\,[\mathrm{Gyr}])$.
  • Figure 5: Example of a high signal-to-noise ratio (SNR) spectrum of an observed star from Gaia DR3 (DR3 Star ID-1222988540219279360) shown in two wavelength regions: 846-858 nm (top panel) and 858-870 nm (bottom panel). The red dots on the spectrum mark the 10 features with the highest SHAP values from our SIDRA-RVS model results, overlaid with atomic lines from RecioBlanco2023. The width of each atomic line corresponds to its uncertainty. The vertical blue lines at 865.09 nm in the bottom panel indicate the Si I line identified by Contursi2021. The stellar parameters $\log_{10}(T_\mathrm{eff})$, $\log g$, and $[\alpha/\mathrm{M}]$ are obtained from APOGEE.
  • ...and 9 more figures
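
The binned residual statistics described in the captions of Figures 2 and 3 (the mean and standard deviation of $\log_{10}(\tau_\mathrm{SIDRA}/\tau_\mathrm{BINGO})$ in bins of $\log_{10}(\tau_\mathrm{BINGO}\,[\mathrm{Gyr}])$) could be computed along the following lines. This is a minimal sketch, not the authors' code: the function name `binned_residual_stats`, the bin edges, and the synthetic data are all illustrative assumptions.

```python
import numpy as np


def binned_residual_stats(log_tau_true, log_tau_pred, bin_edges):
    """Mean and standard deviation of log-age residuals per target-age bin.

    Hypothetical helper: residual = log10(tau_pred) - log10(tau_true),
    i.e. log10(tau_pred / tau_true), grouped by the true (target) log age.
    """
    log_tau_true = np.asarray(log_tau_true, dtype=float)
    log_tau_pred = np.asarray(log_tau_pred, dtype=float)
    residuals = log_tau_pred - log_tau_true

    # np.digitize returns 1..len(bin_edges)-1 for values inside the edges,
    # so subtracting 1 gives a zero-based bin index.
    idx = np.digitize(log_tau_true, bin_edges) - 1
    n_bins = len(bin_edges) - 1

    means = np.full(n_bins, np.nan)
    stds = np.full(n_bins, np.nan)
    for b in range(n_bins):
        in_bin = residuals[idx == b]
        if in_bin.size > 0:  # leave NaN for empty bins
            means[b] = in_bin.mean()
            stds[b] = in_bin.std()
    return means, stds


# Synthetic stand-in data (assumption): unbiased predictions with
# 0.1 dex Gaussian scatter around the target log age.
rng = np.random.default_rng(0)
log_true = rng.uniform(0.0, 1.1, size=5000)
log_pred = log_true + rng.normal(0.0, 0.1, size=5000)
edges = np.linspace(0.0, 1.1, 6)

means, stds = binned_residual_stats(log_true, log_pred, edges)
for lo, hi, m, s in zip(edges[:-1], edges[1:], means, stds):
    print(f"log10(tau) in [{lo:.2f}, {hi:.2f}): mean={m:+.3f} dex, std={s:.3f} dex")
```

Binning by the target age rather than the predicted age matches the captions, which report the residual statistics at different $\log_{10}(\tau_\mathrm{BINGO}\,[\mathrm{Gyr}])$ bins.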