Self-supervised Synthetic Pretraining for Inference of Stellar Mass Embedded in Dense Gas

Keiya Hirashima, Shingo Nozaki, Naoto Harada

TL;DR

This work tackles the difficulty of estimating stellar masses in deeply embedded star-forming regions by leveraging self-supervised learning. A Vision Transformer is pretrained on 1M synthetic fractal images with DINOv2, then kept frozen and evaluated on a limited set of high-resolution 3D MHD simulations. This enables zero-shot mass inference via $k$-NN regression and reveals semantically meaningful structures through PCA-based visualization. The results show that synthetic pretraining performs competitively with, or better than, a fully supervised baseline in data-scarce regimes and enables unsupervised segmentation of star-forming regions without labels or fine-tuning. The approach promises data-efficient pathways to connect environmental gas properties to stellar masses and, potentially, to the formation of the initial mass function (IMF), though applying it to observations will require handling noise and ensuring applicability beyond simulated labels.

Abstract

Stellar mass is a fundamental quantity that determines the properties and evolution of stars. However, estimating stellar masses in star-forming regions is challenging because young stars are obscured by dense gas and the regions are highly inhomogeneous, making spherical dynamical estimates unreliable. Supervised machine learning could link such complex structures to stellar mass, but it requires large, high-quality labeled datasets from high-resolution magneto-hydrodynamical (MHD) simulations, which are computationally expensive. We address this by pretraining a vision transformer on one million synthetic fractal images using the self-supervised framework DINOv2, and then applying the frozen model to limited high-resolution MHD simulations. Our results demonstrate that synthetic pretraining improves stellar mass predictions from frozen-feature regression, with the pretrained model performing slightly better than a supervised model trained on the same limited simulations. Principal component analysis of the extracted features further reveals semantically meaningful structures, suggesting that the model enables unsupervised segmentation of star-forming regions without the need for labeled data or fine-tuning.
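
The frozen-feature regression described above can be illustrated with a minimal sketch of $k$-NN regression over embedding vectors. This is not the authors' implementation: the random arrays stand in for features from the frozen encoder, and the cosine similarity, the unweighted mean, the value of `k`, and the log-mass targets are all assumptions made for illustration.

```python
import numpy as np

def knn_regress(train_feats, train_targets, query_feats, k=5):
    """Predict each query target as the mean target of its k most similar
    training features (cosine similarity; an assumed choice of metric)."""
    # Normalize rows to unit length so the dot product is cosine similarity.
    train = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    query = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    sims = query @ train.T                      # shape: (n_query, n_train)
    # Indices of the k most similar training samples for each query.
    nn_idx = np.argsort(-sims, axis=1)[:, :k]
    # Unweighted average of the neighbors' targets.
    return train_targets[nn_idx].mean(axis=1)

# Hypothetical stand-ins: random vectors in place of frozen-encoder features,
# random values in place of (log) stellar masses from simulations.
rng = np.random.default_rng(0)
train_feats = rng.normal(size=(200, 32))
train_log_mass = rng.uniform(-1.0, 1.0, size=200)
query_feats = rng.normal(size=(10, 32))

pred_log_mass = knn_regress(train_feats, train_log_mass, query_feats, k=5)
print(pred_log_mass.shape)  # (10,)
```

Because the encoder stays frozen, the only fitted quantity in this kind of setup is the stored set of training features and labels, which is one reason the approach suits data-scarce regimes.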

Paper Structure

This paper contains 16 sections, 1 equation, 3 figures, 1 table.

Figures (3)

  • Figure 1: Overview of our model. Left: self-supervised pretraining with synthetic fractal images using DINOv2 to extract feature vectors. Right: zero-shot evaluation on simulation with the frozen encoder, applied to stellar mass prediction ($k$-NN) and semantic segmentation (PCA-based colors).
  • Figure 2: Frozen-feature regression of stellar masses. (a) PCA projection of feature vectors from DINOv2 colored by stellar mass. (b) True versus predicted stellar masses using DINOv2 representations with $k$-NN regression. (c) True versus predicted stellar masses from a supervised ResNet-18 baseline.
  • Figure 3: Snapshots from MHD simulations with visualizations of PCA components of feature vectors. Each panel shows four maps: column density $N_\mathrm{HI}$, mean line-of-sight velocity $v_\mathrm{los}$, its velocity dispersion $\sigma_v$, and a color map of the first three PCA components from image patches.
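
The PCA-based color maps in Figure 3 can be approximated by the following sketch, which projects per-patch feature vectors onto their first three principal components and rescales each component to a color channel. The patch grid size, the feature dimension, the min-max scaling, and the random input are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def pca_rgb(patch_feats, grid_hw):
    """Map per-patch features of shape (h*w, dim) to an (h, w, 3) color image
    using the first three principal components."""
    h, w = grid_hw
    # Center the features so the SVD yields principal directions.
    centered = patch_feats - patch_feats.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    proj = centered @ vt[:3].T                  # shape: (h*w, 3)
    # Min-max scale each component into [0, 1] to use it as a color channel.
    lo = proj.min(axis=0)
    hi = proj.max(axis=0)
    rgb = (proj - lo) / (hi - lo + 1e-12)
    return rgb.reshape(h, w, 3)

# Hypothetical input: a 16x16 grid of patches with 32-dimensional features.
rng = np.random.default_rng(0)
patch_feats = rng.normal(size=(16 * 16, 32))
rgb_map = pca_rgb(patch_feats, (16, 16))
print(rgb_map.shape)  # (16, 16, 3)
```

Patches with similar features receive similar colors, so contiguous regions of one color can be read as an unsupervised segmentation. The sign of each principal component is arbitrary, so the specific colors carry no meaning on their own.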