deep-REMAP: Probabilistic Parameterization of Stellar Spectra Using Regularized Multi-Task Learning

Sankalp Gilda

TL;DR

deep-REMAP addresses scalable stellar parameterization for large spectroscopic surveys by combining transfer learning, multi-task learning, and probabilistic classification with embedding regularization. It trains on synthetic PHOENIX spectra and fine-tunes on a small subset of observed MARVELS FGK dwarf spectra to bridge the synthetic-observational gap, enabling parameter estimation for 732 FGK giant candidates. The regression-as-classification framework, augmented by an embedding loss, captures non-Gaussian uncertainties and yields an interpretable embedding space for nearest-neighbor retrieval; validation on 30 MARVELS calibration stars achieves a $T_{\rm{eff}}$ precision of about 75 K and small biases in $\log \rm{g}$ and [Fe/H]. The approach generalizes to other surveys and libraries, offering an automated, robust pathway for stellar characterization at scale.

Abstract

In the era of exploding survey volumes, traditional methods of spectroscopic analysis are being pushed to their limits. In response, we develop deep-REMAP, a novel deep learning framework that uses a regularized, multi-task approach to predict stellar atmospheric parameters from observed spectra. We train a deep convolutional neural network on the PHOENIX synthetic spectral library and use transfer learning to fine-tune the model on a small subset of observed FGK dwarf spectra from the MARVELS survey. We then apply the model to 732 uncharacterized FGK giant candidates from the same survey. When validated on 30 MARVELS calibration stars, deep-REMAP accurately recovers the effective temperature ($T_{\rm{eff}}$), surface gravity ($\log \rm{g}$), and metallicity ([Fe/H]), achieving, for instance, a precision of approximately 75 K in $T_{\rm{eff}}$. By combining an asymmetric loss function with an embedding loss, our regression-as-classification framework is interpretable, robust to parameter imbalances, and capable of capturing non-Gaussian uncertainties. While developed for MARVELS, the deep-REMAP framework is extensible to other surveys and synthetic libraries, demonstrating a powerful and automated pathway for stellar characterization.
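
The regression-as-classification idea and the pairing of a classification loss with an embedding loss can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the 100 K bin grid, the function names (`to_class_index`, `focal_loss`, `triplet_loss`), and the values of `gamma` and `margin` are assumptions chosen for illustration. The focal and triplet forms are the standard textbook versions, used here because the Figure 1 caption names those two losses.

```python
import numpy as np

def to_class_index(values, edges):
    """Map continuous parameter values to 0-based bin indices.

    `edges` is a hypothetical, monotonically increasing grid of bin edges;
    values outside the grid are clipped into the first or last bin.
    """
    idx = np.digitize(values, edges) - 1
    return np.clip(idx, 0, len(edges) - 2)

def focal_loss(probs, targets, gamma=2.0, eps=1e-12):
    """Standard focal loss on predicted class probabilities.

    probs:   (n_samples, n_classes) array whose rows sum to 1
    targets: (n_samples,) integer class indices
    gamma=0 reduces this to ordinary cross-entropy.
    """
    p_true = probs[np.arange(len(targets)), targets]
    return float(np.mean(-((1.0 - p_true) ** gamma) * np.log(p_true + eps)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss with squared Euclidean distances.

    Pulls same-class embeddings together and pushes different-class
    embeddings at least `margin` further away than the positive.
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=1)
    d_neg = np.sum((anchor - negative) ** 2, axis=1)
    return float(np.mean(np.maximum(d_pos - d_neg + margin, 0.0)))

# Hypothetical T_eff grid: 30 bins of 100 K between 4000 K and 7000 K.
teff_edges = np.arange(4000, 7001, 100)
teff_classes = to_class_index(np.array([4050.0, 5550.0, 6999.0]), teff_edges)
```

Casting each parameter as a set of classes is what lets the network output a full probability distribution per star instead of a single point estimate, which is how non-Gaussian uncertainties become representable.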

Paper Structure

This paper contains 22 sections, 6 equations, 4 figures, 1 table, 1 algorithm.

Figures (4)

  • Figure 1: Schematic of the deep-REMAP neural network architecture. Left: The structure of the two types of residual modules used, RN-Module V1 (identity) and V2 (projection for downsampling). Middle: The network backbone, consisting of a sequence of residual modules that progressively increase the feature depth (K) while reducing the spectral dimension (p). Right: The multi-task head structure. The shared feature backbone splits into three parallel heads, one for each stellar parameter ($T_{\rm{eff}}$, $\log \rm{g}$, [Fe/H]). Each head further splits into an embedding head (trained with triplet loss) to structure the latent space, and a classification head (trained with focal loss) to produce the final probabilistic prediction.
  • Figure 2: Visualization of the learned embedding space for the PHOENIX training set at Epoch 0, 25, and 50, projected using t-SNE. Columns from left to right correspond to classes for $T_{\rm{eff}}$, $\log \rm{g}$, and [Fe/H]. The clear formation of distinct clusters by Epoch 50 demonstrates that the network is successfully learning to structure the feature space.
  • Figure 3: Same as Figure 2, but for the unseen PHOENIX validation set. The emergence of well-separated clusters in this hold-out sample confirms that the model is not overfitting and has learned generalizable features, validating the effectiveness of our training methodology.
  • Figure 4: Comparison of our deep-REMAP predicted parameters versus the known literature values for the 30 MARVELS calibration stars (Ghezzi et al. 2014). From left to right: effective temperature ($T_{\rm{eff}}$), surface gravity ($\log \rm{g}$), and metallicity ([Fe/H]). The blue points show the median of our predicted probability distribution, with error bars representing the 16th and 84th percentiles. The dashed red line indicates the one-to-one correspondence. The tight clustering around this line demonstrates the high accuracy and precision of our model.
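
The point estimates and error bars described for Figure 4 (the median with the 16th and 84th percentiles of a predicted probability distribution) can be read off a discrete class distribution as sketched below. This is an illustrative assumption about the procedure, not the paper's code: the bin centers, the example probabilities, and the function name `summarize_distribution` are hypothetical.

```python
import numpy as np

def summarize_distribution(probs, centers, quantiles=(0.16, 0.50, 0.84)):
    """Return the bin centers at the requested quantiles of a discrete distribution.

    probs:   per-class probabilities for one star (need not be normalized)
    centers: parameter value at the center of each class bin (hypothetical grid)
    For each quantile q, picks the first bin whose cumulative probability reaches q.
    """
    probs = np.asarray(probs, dtype=float)
    centers = np.asarray(centers, dtype=float)
    cdf = np.cumsum(probs / probs.sum())
    idx = np.searchsorted(cdf, quantiles)
    idx = np.clip(idx, 0, len(centers) - 1)  # guard against round-off at the top end
    return centers[idx]

# Hypothetical T_eff bin centers (K) and two example predicted distributions.
centers = [5000.0, 5100.0, 5200.0, 5300.0, 5400.0]
symmetric = summarize_distribution([0.05, 0.20, 0.50, 0.20, 0.05], centers)
skewed = summarize_distribution([0.70, 0.20, 0.10, 0.00, 0.00], centers)
```

For the skewed example the lower and upper percentiles sit at different distances from the median, so the error bars are asymmetric. A single Gaussian standard deviation cannot express that.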