Refined classification of YSOs and AGB stars by IR magnitudes, colors, and time-domain analysis with machine learning

Hyunwook Jheonn, Jeong-Eun Lee, Jinho Lee, Seonjae Lee, Hyeyoon Lee, ShinGeon Kim, Carlos Contreras Peña, Mi-Ryang Kim

TL;DR

This work tackles the persistent confusion between YSOs and AGB stars in IR photometry by introducing a two-stage Double Filter model that combines static photometric features with NEOWISE time-series data. Filter 1 uses an ensemble of SVM, RF, and MLP classifiers on 28 magnitudes and colors, while Filter 2 applies an MLP to 20-epoch light-curve features including $W1$, $W2$, $W1-W2$, $\Delta W1/\sigma$, $\Delta W2/\sigma$, period, and $fAMP$, with data augmentation in the overlapped region. The model achieves a test accuracy of about $99.3\%$, and independent validation on Taurus YSOs and spectroscopically confirmed AGB stars shows substantial improvement over photometry-only approaches. Applied to the SPICY catalog, the method identifies 27,235 confirmed YSOs and 258 plausible AGB contaminants, many with periodic light curves, which demonstrates the practical utility of time-domain information for large IR surveys. Overall, the study provides a scalable, data-driven path to cleaner YSO catalogs and a better understanding of AGB contamination in Galactic midplane surveys, aided by interpretability insights from Grad-CAM analyses of feature importance.
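The soft voting used to combine the Filter 1 classifiers can be illustrated with a minimal sketch. It assumes unweighted averaging of per-class probabilities; the paper's actual weights, class ordering, and probabilities are not given here, so the names `CLASSES`, `soft_vote`, `predict`, and the example numbers are all hypothetical.

```python
from typing import Dict, List, Sequence

# Assumed class ordering for illustration only.
CLASSES = ("YSO", "AGB")


def soft_vote(prob_by_model: Dict[str, Sequence[float]]) -> List[float]:
    """Average the per-class probabilities over all models (unweighted soft voting)."""
    if not prob_by_model:
        raise ValueError("at least one model is required")
    n_models = len(prob_by_model)
    mean = [0.0] * len(CLASSES)
    for name, probs in prob_by_model.items():
        if len(probs) != len(CLASSES):
            raise ValueError(f"{name}: expected {len(CLASSES)} probabilities")
        for k, p in enumerate(probs):
            mean[k] += p / n_models
    return mean


def predict(prob_by_model: Dict[str, Sequence[float]]) -> str:
    """Return the class with the highest averaged probability."""
    mean = soft_vote(prob_by_model)
    best = max(range(len(mean)), key=mean.__getitem__)
    return CLASSES[best]


# Made-up probabilities (P(YSO), P(AGB)) for a single source.
example = {
    "svm": [0.45, 0.55],
    "rf": [0.48, 0.52],
    "mlp": [0.95, 0.05],
}
label = predict(example)
```

In this made-up case two of the three models lean slightly towards AGB, so a majority (hard) vote would return AGB, whereas soft voting returns YSO because the MLP is far more confident. That sensitivity to classifier confidence is the usual reason for choosing soft over hard voting.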

Abstract

We introduce a binary classification model, the Double Filter Model, utilizing various machine learning and deep learning methods to classify Young Stellar Objects (YSOs) and Asymptotic Giant Branch (AGB) stars. Since YSOs and AGB stars share similar infrared (IR) photometric characteristics due to comparable temperatures and the presence of circumstellar dust, distinguishing them is challenging and often leads to misclassification. While machine learning and deep learning techniques have helped reduce YSO-AGB misclassifications, achieving a reliable separation remains challenging. Given that YSOs and AGB stars exhibit distinct light curves resulting from different variability mechanisms, our Double Filter Model leverages light curve data to enhance classification accuracy. This approach uncovered YSOs and AGB stars that were misclassified in IR photometry and was validated against Taurus YSOs and spectroscopically confirmed AGB stars. We applied the model to the Spitzer/IRAC Candidate YSO Catalog for the Inner Galactic Midplane (SPICY) catalog for catalog refinement and identified potential AGB star contaminants.

Paper Structure

This paper contains 22 sections, 5 equations, 11 figures, 11 tables.

Figures (11)

  • Figure 1: A visualized structure of the "Double Filter classifier" model. Blue boxes represent the data or catalogs used for training and application, while gray boxes denote the ML methods. The voting ensemble in Filter 1 combines classifications into a single final result using a soft voting process. Data Augmentation in Filter 2 is an approach to increase the amount of data to mitigate overfitting caused by a lack of training data.
  • Figure 2: The distribution of P21 YSOs (left) and S21 AGB stars (right) on the W1-W2 vs. W1 CMD. The black dashed lines in both panels represent the division criterion for YSOs and AGB stars, as defined by 2014KL. The gray-shaded regions are the overlapped regions, bounded by two lines (green and blue) drawn empirically from the P21 and S21 catalogs on either side of the black dashed line.
  • Figure 3: The confusion matrices of the test results for the three methods in Filter 1 (SVM, RF, and the photometry-trained MLP) and for the NEOWISE-trained MLP, which is the base component of Filter 2.
  • Figure 4: The confusion matrices of the Filter 1 and Filter 2 test results. Test data in Filter 2 are the misclassified YSOs (36) and AGBs (19) from Filter 1.
  • Figure 5: Left: The W1-W2 vs. W1 CMD of 115 Taurus YSOs. Red stars are YSOs confirmed by Filter 1, and orange stars are the eight YSOs reclassified by Filter 2. Right: The same CMD for 85 validation AGB stars. Blue triangles are the AGB stars confirmed by Filter 1, yellow triangles are AGB stars reclassified by Filter 2, and red stars are the remaining misclassified AGB stars. The dashed lines are those in Figure 2.
  • ...and 6 more figures