Machine-learning inference of stellar properties using integrated photometric and spectroscopic data

Ilay Kamai, Alex M. Bronstein, Hagai B. Perets

TL;DR

DESA introduces a multimodal foundation model that unifies photometric light curves and spectra into a physically informative stellar latent space by first training modality-specific encoders with a hybrid SSL/supervised objective, then aligning them via the DualFormer module. The DualFormer combines self- and cross-attention and employs a dual, projection-based alignment with a covariance regularizer, yielding an eigenspace that captures shared structure across modalities. Empirically, DESA achieves state-of-the-art performance on binary detection ($AUC = 0.99$, $AP = 1.00$) and stellar age prediction ($RMSE = 0.94$ Gyr), while zero-/few-shot evaluations recover CMD and HR diagrams with $R^2 = 0.92$ and enable meaningful population discovery (e.g., separating synchronized binaries from young stars in latent space). The work demonstrates that integrating heterogeneous surveys through a carefully designed multimodal architecture enables both improved predictive accuracy and new astrophysical insights, paving the way for population-level analyses and discovery in large stellar surveys.

Abstract

Stellar astrophysics relies on diverse observational modalities, primarily photometric light curves and spectroscopic data, from which fundamental stellar properties are inferred. While machine learning (ML) has advanced analysis within individual modalities, the complementary information encoded across modalities remains largely underexploited. We present DESA (Dual Embedding model for Stellar Astrophysics), a novel multimodal foundation model that integrates light curves and spectra to learn a unified, physically meaningful latent space for stars. DESA first trains separate modality-specific encoders using a hybrid supervised/self-supervised scheme, and then aligns them through DualFormer, a Transformer-based cross-modal integration module tailored for astrophysical data. DualFormer combines cross- and self-attention, a novel dual-projection alignment loss, and a projection-space eigendecomposition that yields physically structured embeddings. We demonstrate that DESA significantly outperforms leading unimodal and self-supervised baselines across a range of tasks. In zero- and few-shot settings, DESA's learned representations recover stellar color-magnitude and Hertzsprung-Russell diagrams with high fidelity ($R^2 = 0.92$ for photometric regressions). With full fine-tuning, DESA achieves state-of-the-art accuracy for binary star detection (AUC = $0.99$, AP = $1.00$) and stellar age prediction (RMSE = $0.94$ Gyr). As a compelling case, DESA naturally separates synchronized binaries from young stars, two populations with nearly identical light curves, purely from their embedded positions in UMAP space, without requiring external kinematic or luminosity information. DESA thus offers a powerful new framework for multimodal, data-driven stellar population analysis, enabling both accurate prediction and novel discovery.
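For readers less familiar with the metrics quoted above, the following is a minimal sketch of how RMSE, $R^2$, ROC AUC, and average precision (AP) are conventionally computed. It is not the authors' evaluation code: the function names and the toy arrays are illustrative assumptions, and the numbers are not taken from the paper.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error, in the units of the target (e.g. Gyr for ages)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    labels, scores = np.asarray(labels, bool), np.asarray(scores, float)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (pos.size * neg.size))

def average_precision(labels, scores):
    """Mean of precision@k over the ranks k at which a positive is retrieved."""
    labels, scores = np.asarray(labels, bool), np.asarray(scores, float)
    order = np.argsort(-scores, kind="stable")  # highest score first
    hits = labels[order]
    precision_at_k = np.cumsum(hits) / np.arange(1, hits.size + 1)
    return float(precision_at_k[hits].mean())

# Toy data (hypothetical, for illustration only).
ages_true = [1.0, 2.0, 3.0, 4.0]   # Gyr
ages_pred = [1.0, 2.0, 3.0, 6.0]   # Gyr
is_binary = [0, 0, 1, 1]           # 1 = binary star
binary_score = [0.1, 0.4, 0.35, 0.8]

print("RMSE [Gyr]:", rmse(ages_true, ages_pred))
print("R^2:", r2(ages_true, ages_pred))
print("AUC:", roc_auc(is_binary, binary_score))
print("AP:", average_precision(is_binary, binary_score))
```

The AP here uses the simple rank-based definition; library implementations may treat tied scores differently, so values can differ slightly when ties are present.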

Paper Structure

This paper contains 20 sections, 8 equations, 20 figures, 3 tables.

Figures (20)

  • Figure 1: Upper panel - High-level diagram of the entire model. Lower panels - Detailed diagrams of the DualFormer module, the spectra encoder, and the light curve encoder.
  • Figure 2: Ablation study results. Upper panel: different attention mechanisms. Lower panel: different uses of $A$, with and without $A^T$ for one of the projections (refer to Section \ref{subsec:dualformer} for details).
  • Figure 3: Example of the pre-processing steps for LAMOST spectra. The left column is the blue range, and the right column is the red range.
  • Figure 4: Example of the pre-processing steps for a Kepler light curve. The upper row shows the raw light curve normalized by absolute magnitude (left) and by mean and standard deviation (right). The lower row shows the ACF (left) and FFT (right).
  • Figure 5: Upper panel - results of the spectra encoder on LASP labels. Lower panel - results on the labels from APOGEE. The purple and gray lines represent prediction intervals of $80\%$ and $50\%$.
  • ...and 15 more figures