SM-Net: Learning a Continuous Spectral Manifold from Multiple Stellar Libraries

Omar Anwar, Aaron S. G. Robotham, Luca Cortese, Kevin Vinsen

Abstract

We present SM-Net, a machine-learning model that learns a continuous spectral manifold from multiple high-resolution stellar libraries. SM-Net generates stellar spectra directly from the fundamental stellar parameters effective temperature (Teff), surface gravity (log g), and metallicity (log Z). It is trained on a combined grid derived from the PHOENIX-Husser, C3K-Conroy, OB-PoWR, and TMAP-Werner libraries. By combining their parameter spaces, we construct a composite dataset that spans a broader and more continuous region of stellar parameter space than any individual library. The unified grid covers Teff = 2,000-190,000 K, log g = -1 to 9, and log Z = -4 to 1, with spectra spanning 3,000-100,000 Angstrom. Within this domain, SM-Net provides smooth interpolation across heterogeneous library boundaries. Outside the sampled region, it can produce numerically smooth exploratory predictions, although these extrapolations are not directly validated against reference models. Zero or masked flux values are treated as unknowns rather than physical zeros, allowing the network to infer missing regions using correlations learned from neighbouring grid points. Across 3,538 training and 11,530 test spectra, SM-Net achieves mean squared errors of 1.47 x 10^-5 on the training set and 2.34 x 10^-5 on the test set in the transformed log1p-scaled flux representation. Inference throughput exceeds 14,000 spectra per second on a single GPU. We also release the model together with an interactive web dashboard for real-time spectral generation and visualisation. SM-Net provides a fast, robust, and flexible data-driven complement to traditional stellar population synthesis libraries.
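The abstract reports errors in a log1p-scaled flux representation and states that zero or masked flux values are treated as unknowns. A minimal sketch of how such a masked error could be computed is shown below. It assumes that a flux of exactly zero marks an unknown pixel and that the metric is a plain mean squared error over the remaining pixels; the function names `log1p_scale` and `masked_mse` are hypothetical and are not taken from the SM-Net code release.

```python
import numpy as np

def log1p_scale(flux):
    """Compress the dynamic range of non-negative flux values via log(1 + x)."""
    return np.log1p(np.asarray(flux, dtype=float))

def masked_mse(pred_scaled, target_flux):
    """Mean squared error in log1p space, ignoring pixels with unknown target flux.

    Assumption: a target flux of exactly 0 means "unknown", not a physical zero.
    """
    target_flux = np.asarray(target_flux, dtype=float)
    pred_scaled = np.asarray(pred_scaled, dtype=float)
    known = target_flux != 0.0
    if not known.any():
        return float("nan")  # nothing to compare against
    diff = pred_scaled[known] - log1p_scale(target_flux[known])
    return float(np.mean(diff ** 2))

# Toy spectrum: the first and last pixels are unknown (zero) in the target.
target = np.array([0.0, 1.0, 3.0, 0.0])

# The prediction disagrees only at the unknown pixels, so the masked error is zero.
pred = log1p_scale(np.array([5.0, 1.0, 3.0, 7.0]))
loss_ignoring_unknowns = masked_mse(pred, target)

# A uniform offset of 0.1 in scaled space gives an error of about 0.1**2.
loss_with_offset = masked_mse(log1p_scale(target) + 0.1, target)
```

Because the unknown pixels never enter the error, a model scored this way is not penalised for whatever it predicts there, which is consistent with the abstract's description of inferring missing regions from neighbouring grid points.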

Paper Structure (15 sections, 14 equations, 24 figures, 4 tables)

This paper contains 15 sections, 14 equations, 24 figures, 4 tables.

Figures (24)

  • Figure 1: Top-level schematic of the SM-Net model architecture. The four stellar spectral libraries PHOENIX-Husser, C3K-Conroy, OB-PoWR, and TMAP-Werner are first combined into a unified multi-library dataset spanning a broad region of stellar parameter space. SM-Net is then trained on this dataset to generate spectra in two stages: a global parameter-to-coarse-spectrum prediction, followed by a convolutional refinement to produce the final high-fidelity output. The resulting model is accessed through an interactive dashboard that enables real-time spectral generation and exploration.
  • Figure 2: 3D plot of the coverage of each of the four libraries PHOENIX-Husser, C3K, OB-PoWR, and TMAP in $(\log_{10} T_{\mathrm{eff}}, \log g, \log Z)$ space. The overlapping region between C3K and PHOENIX-Husser spans $3.36 \le \log T_{\mathrm{eff}} (K) \le 4.08$, $0 \le \log g \le 5.5$, and $-2.1 \le \log Z \le 0.5$. The overlapping region between C3K and TMAP covers $4.30 \le \log T_{\mathrm{eff}} (K) \le 4.70$, $5 \le \log g \le 5.5$, and $-2.1 \le \log Z \le 0.5$. The overlapping region between C3K and OB-PoWR covers $4.18 \le \log T_{\mathrm{eff}} (K) \le 4.70$, $2 \le \log g \le 4.5$, and $-2.1 \le \log Z \le 0.5$.
  • Figure 3: Library coverage and overlap on the $\log T_{\mathrm{eff}} (K)$-$\log g$, $\log T_{\mathrm{eff}} (K)$-$\log Z$, and $\log g$-$\log Z$ grids.
  • Figure 4: Distribution of zero/unknown flux values across the combined-grid dataset. Counts are aggregated over $\log Z$ values.
  • Figure 5: The replication counts for the entire dataset based on spectral dissimilarity, aggregated over $\log Z$.
  • ...and 19 more figures
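
The library overlap regions quoted in the Figure 2 caption are axis-aligned boxes in $(\log_{10} T_{\mathrm{eff}}, \log g, \log Z)$ space, so membership can be checked with simple range tests. The sketch below encodes only the three boxes given in that caption; the dictionary layout and the function name `overlapping_pairs` are hypothetical, chosen for illustration, and this is not part of the SM-Net release.

```python
# Overlap boxes copied from the Figure 2 caption: (min, max) per axis, inclusive.
OVERLAPS = {
    ("C3K", "PHOENIX-Husser"): {"log_teff": (3.36, 4.08), "log_g": (0.0, 5.5), "log_z": (-2.1, 0.5)},
    ("C3K", "TMAP"): {"log_teff": (4.30, 4.70), "log_g": (5.0, 5.5), "log_z": (-2.1, 0.5)},
    ("C3K", "OB-PoWR"): {"log_teff": (4.18, 4.70), "log_g": (2.0, 4.5), "log_z": (-2.1, 0.5)},
}

def overlapping_pairs(log_teff, log_g, log_z):
    """Return the library pairs whose quoted overlap box contains the given point."""
    point = {"log_teff": log_teff, "log_g": log_g, "log_z": log_z}
    return [
        pair
        for pair, box in OVERLAPS.items()
        if all(lo <= point[axis] <= hi for axis, (lo, hi) in box.items())
    ]

# A roughly solar-type point (log Teff ~ 3.76, log g = 4.5, log Z = 0).
solar_like = overlapping_pairs(3.76, 4.5, 0.0)

# A hot, high-gravity point and a hot, lower-gravity point.
hot_compact = overlapping_pairs(4.5, 5.2, 0.0)
hot_giant = overlapping_pairs(4.5, 3.0, 0.0)

# A point hotter than any quoted overlap box.
outside = overlapping_pairs(5.0, 7.0, 0.0)
```

A check like this only identifies where the quoted boxes coincide; it says nothing about how the individual libraries are sampled inside those boxes.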