A method to derive self-consistent NLTE astrophysical parameters for 4 million high-resolution 4MOST stellar spectra in half a day with invertible neural networks

Victor F. Ksoll; Nicholas Storm; Maria Bergemann; Katherine Lee; Ralf S. Klessen; R. Albarracín; Guillaume Guiglion; Gražina Tautvaišienė

TL;DR

The paper introduces a simulation-based NLTE inference framework using a conditional invertible neural network (cINN) trained on a large grid of NLTE Turbospectrum spectra to derive self-consistent stellar parameters and chemical abundances from high-resolution 4MOST-like spectra. The cINN yields full posterior distributions, enabling intrinsic uncertainty quantification and revealing degeneracies, while delivering orders-of-magnitude faster inference than traditional LTE approaches. On synthetic NLTE data at S/N ≈ 250 Å⁻¹, the method achieves low biases and competitive scatter for Teff, log g, [Fe/H], and key abundance ratios, with uncertainties increasing at lower S/N and in metal-poor regimes. Validation against Gaia-ESO/PLATO/4MOST benchmark stars shows good agreement with independent TSFitPy results after bias calibration, supporting scalable NLTE analysis of millions of spectra. The approach promises practical impact for upcoming large surveys by enabling rapid, self-consistent, NLTE-informed stellar characterization and will be extended to more elements and dynamic S/N regimes, with plans for public release.

Abstract

Modern spectroscopic surveys obtain spectra for millions of stars. However, classical spectroscopic methods are often computationally expensive, which makes them impractical for the analysis of large datasets. We introduce a novel simulation-based deep-learning approach for the efficient analysis of high-resolution stellar spectra to be obtained with the upcoming high-resolution 4MOST spectrograph. We used a suite of synthetic non-local thermodynamic equilibrium (NLTE) spectra generated with Turbospectrum to mimic 4MOST observations and trained a conditional invertible neural network (cINN) to predict stellar surface parameters and chemical abundances self-consistently. The cINN is a neural network architecture that estimates full posterior distributions for the target stellar properties, providing an intrinsic uncertainty estimate. We evaluated the predictive performance of the trained cINN model on both synthetic data and observed spectra of stars. We found that our new cINN trained on NLTE synthetic spectra is capable of recovering stellar parameters with average errors ($\sigma$) of $33$ K for $T_\mathrm{eff}$, $0.16$ dex for $\log(g)$, $0.12$ dex for [Fe/H], $0.1$ dex for [Ca/Fe], $0.11$ dex for [Mg/Fe], and $0.51$ dex for [Li/Fe] at a signal-to-noise ratio of 250 per Angstrom. From the analysis of the observed spectra of Gaia-ESO / 4MOST / PLATO benchmark stars, we verified that our NLTE estimates for stellar parameters and abundances are consistent with results obtained with the independent code TSFitPy. We conclude that the NLTE cINN is robust and can, theoretically, evaluate 4 million high-resolution 4MOST spectra in less than a day, using GPU acceleration.
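
To put the headline throughput in perspective, the following is a minimal back-of-the-envelope sketch of the per-spectrum time budget implied by processing 4 million spectra in half a day. It assumes "half a day" means exactly 12 hours of wall-clock time; the function name `required_throughput` is hypothetical and the numbers are plain arithmetic, not measured runtimes from the paper.

```python
# Back-of-the-envelope throughput implied by the headline claim.
# Assumption: "half a day" is taken as exactly 12 hours of wall-clock time.

N_SPECTRA = 4_000_000
BUDGET_HOURS = 12.0


def required_throughput(n_spectra: int, budget_hours: float) -> tuple[float, float]:
    """Return (spectra per second, milliseconds per spectrum) needed to meet the budget."""
    budget_seconds = budget_hours * 3600.0
    per_second = n_spectra / budget_seconds
    ms_per_spectrum = 1000.0 * budget_seconds / n_spectra
    return per_second, ms_per_spectrum


rate, ms_each = required_throughput(N_SPECTRA, BUDGET_HOURS)
print(f"{rate:.1f} spectra/s, {ms_each:.1f} ms per spectrum")
# prints "92.6 spectra/s, 10.8 ms per spectrum"
```

Under this assumption, the full pipeline (posterior sampling plus point estimation) would need to average roughly 11 ms per spectrum.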

Paper Structure (20 sections, 12 equations, 14 figures, 7 tables)

Introduction
Data
Training data
Benchmark data
Method
The conditional invertible neural network
Architecture and implementation details
Training setup
Hyperparameter search
Point estimates and sample rejection
INNs vs. other generative approaches
Results
Network efficiency
Tests on synthetic spectra
Test on benchmark spectra
...and 5 more sections

Figures (14)

Figure 1: Prior distributions of the target parameters in the training data. The panels in the top two rows show 2D histograms of the correlation between selected stellar properties. The panels in the bottom three rows show 1D histograms of all target parameters.
Figure 2: Schematic overview of the cINN architecture. We note that the bottom zoom-in only highlights the forward pass of a GLOW coupling layer, following Eq. \ref{eq:coupling_forward}, but not the backward pass (Eq. \ref{eq:coupling_backward}). Each of the 16 GLOW coupling blocks in our architecture employs two subnetworks, following the layout shown in the bottom right zoom-in.
Figure 3: Runtime comparison between the proposed cINN approach and the previously established methods TSFitPy and The Payne. The performance of the cINN was measured for the case of generating 4096 posterior samples per spectrum, using one core of an AMD EPYC 9554P (3.7 GHz) CPU and an NVIDIA RTX 6000 Ada GPU, respectively. The total cINN runtime is a combination of posterior generation and subsequent point estimation, but the latter becomes negligible quickly. We note that the runtimes in the right panel are extrapolated from the 1000 spectra (cINN) and 1 spectrum (TSFitPy and Payne) results, respectively. The Payne shown here operates on a more limited input parameter space, denoted as HR10, which covers the wavelength range from $534$ to $562$ nm and is based on the setup of the GIRAFFE spectrograph.
Figure 4: Summary of cINN performance on synthetic spectra as a function of S/N. In each panel, the mean residual $\overline{\Delta}_\mathrm{cINN-GT}$ is plotted in black on the left y-axis scale, while the standard deviation $\sigma_\mathrm{cINN-GT}$ is plotted in grey on the right y-axis scale. The black dotted line marks $\overline{\Delta}_\mathrm{cINN-GT} = 0$ for reference. For more details see Table \ref{tab:me_summary} and Fig. \ref{fig:Synth_ME_SIGMA_vs_SNR_DETAILED} in the Appendix.
Figure 5: 2D histograms that compare the cINN MAP estimates to the corresponding ground truth for 5910 synthetic spectra with $\mathrm{S/N} = 250$ held out in the test set. The dotted black guide line indicates a perfect one-to-one match.
...and 9 more figures
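
The Figure 2 caption refers to the forward and backward passes of a GLOW coupling block built from two subnetworks. As a rough illustration of why such a block is exactly invertible, here is a minimal sketch of a generic conditional affine coupling layer in pure Python. It is not the authors' implementation and does not reproduce the paper's equations: the class name `ConditionalAffineCoupling`, the random-weight stand-in subnetworks, the `tanh` soft clamp on the log-scale, and all dimensions are assumptions chosen for illustration.

```python
import math
import random

random.seed(0)


def make_subnet(in_dim, out_dim, hidden=8):
    """Random two-layer MLP standing in for a trained subnetwork (illustrative only)."""
    w1 = [[random.gauss(0, 0.3) for _ in range(in_dim)] for _ in range(hidden)]
    w2 = [[random.gauss(0, 0.3) for _ in range(hidden)] for _ in range(2 * out_dim)]

    def subnet(h):
        hid = [math.tanh(sum(w * x for w, x in zip(row, h))) for row in w1]
        out = [sum(w * x for w, x in zip(row, hid)) for row in w2]
        log_s = [math.tanh(o) for o in out[:out_dim]]  # assumed soft clamp on log-scale
        return log_s, out[out_dim:]                    # (log-scale, shift)

    return subnet


class ConditionalAffineCoupling:
    """Generic affine coupling block with two subnetworks and a condition vector c."""

    def __init__(self, dim, cond_dim):
        self.d1 = dim // 2
        d2 = dim - self.d1
        self.net_a = make_subnet(d2 + cond_dim, self.d1)  # transforms the first half
        self.net_b = make_subnet(self.d1 + cond_dim, d2)  # transforms the second half

    def forward(self, u, c):
        u1, u2 = u[:self.d1], u[self.d1:]
        ls_a, t_a = self.net_a(u2 + c)  # list concatenation: condition on u2 and c
        v1 = [x * math.exp(s) + t for x, s, t in zip(u1, ls_a, t_a)]
        ls_b, t_b = self.net_b(v1 + c)
        v2 = [x * math.exp(s) + t for x, s, t in zip(u2, ls_b, t_b)]
        log_det = sum(ls_a) + sum(ls_b)  # log |det Jacobian| of the block
        return v1 + v2, log_det

    def inverse(self, v, c):
        v1, v2 = v[:self.d1], v[self.d1:]
        ls_b, t_b = self.net_b(v1 + c)  # same subnetwork, evaluated on known v1
        u2 = [(y - t) * math.exp(-s) for y, s, t in zip(v2, ls_b, t_b)]
        ls_a, t_a = self.net_a(u2 + c)
        u1 = [(y - t) * math.exp(-s) for y, s, t in zip(v1, ls_a, t_a)]
        return u1 + u2


layer = ConditionalAffineCoupling(dim=6, cond_dim=4)
x = [random.gauss(0, 1) for _ in range(6)]  # stand-in for target parameters
c = [random.gauss(0, 1) for _ in range(4)]  # stand-in for features of a spectrum
z, log_det = layer.forward(x, c)
x_back = layer.inverse(z, c)
max_err = max(abs(a - b) for a, b in zip(x, x_back))
print(f"round-trip error: {max_err:.2e}")
```

The subnetworks themselves are never inverted; the backward pass only re-evaluates them on quantities that are already known, which is why they can be arbitrary functions. In a conditional setup of this kind, drawing many latent vectors and running the inverse pass for a fixed condition is what produces posterior samples for one spectrum.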
