Supernova scores for active anomaly detection

Semenikhin T. A., Kornilov M. V., Pruzhinskaya M. V., Krushinsky V. V., Malanchev K. L., Dodin A.

TL;DR

Applying the combined methodology led to the discovery of seven previously unreported SN candidates, one AGN candidate, one unusual Galactic variable star (SNAD283), and two host galaxies exhibiting multiple supernova events.

Abstract

Large time-domain sky surveys generate extensive multi-year catalogs of light curves in which scientifically valuable transients, such as supernovae (SNe), are vastly outnumbered by artifacts and routine stellar variability. While supervised machine learning models can efficiently filter known classes, they struggle with extreme class imbalance and may overlook rare or novel events. Conversely, unsupervised anomaly detection provides broad discovery potential but lacks targeted sensitivity. We present a hybrid strategy that integrates a supervised SN probability score (SN-score) into the PineForest active anomaly detection framework to enhance the SN discovery rate in the 23rd data release of the Zwicky Transient Facility (ZTF). We train a binary classifier using light-curve features of spectroscopically confirmed SNe from the ZTF Bright Transient Survey, achieving a ROC-AUC of approximately 0.98. Incorporating the SN-score as an additional feature, together with a small set of labeled priors, significantly accelerates the discovery of SN-like transients across ten extragalactic ZTF fields. This method increases discovery efficiency without compromising the ability to identify diverse astrophysical anomalies. Applying the combined methodology led to the discovery of seven previously unreported SN candidates, one AGN candidate, one unusual Galactic variable star (SNAD283), and two host galaxies exhibiting multiple supernova events. These results demonstrate the method's value for scalable, expert-guided transient searches in current and future surveys, including the Vera C. Rubin Observatory Legacy Survey of Space and Time.
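
The hybrid idea described in the abstract (a supervised score appended as one more feature for an anomaly detector) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the data are synthetic, the classifier is a scikit-learn `RandomForestClassifier` (the abstract does not name the model), and scikit-learn's `IsolationForest` stands in for PineForest. The sketch also omits the active part of the framework, that is, the expert feedback loop and the labeled priors.

```python
import numpy as np
from sklearn.ensemble import IsolationForest, RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-ins for light-curve features (assumption: 5 features per object).
n_labeled, n_survey, n_features = 400, 2000, 5

# Labeled training set: class 1 plays the role of confirmed SNe.
X_labeled = rng.normal(size=(n_labeled, n_features))
y_labeled = (rng.random(n_labeled) < 0.3).astype(int)
X_labeled[y_labeled == 1] += 1.5  # shift the "SN" class so it is separable

# Unlabeled survey objects with a rare SN-like population (about 1 percent).
X_survey = rng.normal(size=(n_survey, n_features))
is_sn = rng.random(n_survey) < 0.01
X_survey[is_sn] += 1.5

# Step 1: supervised binary classifier gives a per-object SN probability.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_labeled, y_labeled)
sn_score = clf.predict_proba(X_survey)[:, 1]

# Step 2: append the SN-score as an additional feature column.
X_aug = np.column_stack([X_survey, sn_score])

# Step 3: run an unsupervised anomaly detector on the augmented features.
# IsolationForest is a stand-in here, not the PineForest algorithm itself.
detector = IsolationForest(n_estimators=200, random_state=0).fit(X_aug)
anomaly_score = -detector.score_samples(X_aug)  # higher means more anomalous

# Candidates would be shown to an expert in this order.
ranking = np.argsort(anomaly_score)[::-1]
```

In an active setting, the expert's verdict on each inspected candidate would be fed back to the detector to reweight or rebuild its trees before the next candidate is chosen; that loop is what the sketch leaves out.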

Paper Structure (22 sections, 9 figures, 4 tables)

This paper contains 22 sections, 9 figures, 4 tables.

Figures (9)

  • Figure 1: Sky maps illustrating the selection of ZTF fields used in this study. Top: Distribution of object counts from the main dataset across ZTF fields (numbers within rectangles indicate the corresponding Field ID); blue rectangles mark the ten selected fields. Bottom: The same ten ZTF fields shown in relation to the sky coverage of other big surveys.
  • Figure 2: Mean ROC curve across stratified k-fold validation folds, with the shaded band indicating $\pm \sigma$ variability.
  • Figure 3: ZTF DR23 multicolor light curves of new supernova candidates with SNCOSMO best-model fits and fitted model parameters.
  • Figure 4: The spectrum of SNAD283 obtained with the 2.5-meter telescope of the Caucasus Mountain Observatory. Blue markers indicate the positions of HeI double-peaked emission lines.
  • Figure 5: ZTF DR23 multicolor light curve of SNAD283 generated with the SNAD ZTF Viewer (2023PASP..135b4503M).
  • ...and 4 more figures
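
The kind of curve described in the Figure 2 caption (a mean ROC curve over stratified k-fold splits with a $\pm \sigma$ band) can be computed roughly as follows. This is a sketch under stated assumptions rather than the authors' pipeline: the dataset is synthetic (`make_classification`), the model is a random forest, and the choices of five folds and a 101-point false-positive-rate grid are arbitrary.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import StratifiedKFold

# Synthetic imbalanced binary problem (assumption: about 10 percent positives).
X, y = make_classification(
    n_samples=1000, n_features=10, weights=[0.9, 0.1], random_state=0
)

# Common false-positive-rate grid so per-fold curves can be averaged.
fpr_grid = np.linspace(0.0, 1.0, 101)
tprs, aucs = [], []

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in cv.split(X, y):
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X[train_idx], y[train_idx])
    scores = clf.predict_proba(X[test_idx])[:, 1]

    fpr, tpr, _ = roc_curve(y[test_idx], scores)
    tpr_interp = np.interp(fpr_grid, fpr, tpr)  # resample onto the grid
    tpr_interp[0] = 0.0                         # pin the curve at the origin
    tprs.append(tpr_interp)
    aucs.append(auc(fpr, tpr))

tprs = np.array(tprs)
mean_tpr = tprs.mean(axis=0)
std_tpr = tprs.std(axis=0)

# Shaded band: mean plus or minus one standard deviation, clipped to [0, 1].
lower = np.clip(mean_tpr - std_tpr, 0.0, 1.0)
upper = np.clip(mean_tpr + std_tpr, 0.0, 1.0)
mean_auc = float(np.mean(aucs))
```

Plotting `mean_tpr` against `fpr_grid` and filling between `lower` and `upper` gives the mean curve with its variability band.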