Compact and Physically Interpretable Feature Models for Photometric Type Ia Supernova Classification

Anurag Garg

Abstract

Photometric classification of Type Ia supernovae is essential for modern time-domain surveys, where spectroscopic confirmation is not always feasible for the full transient sample. In this work, we investigate a compact and physically interpretable feature representation derived from multi-band light curves and evaluate its performance using gradient-boosted decision trees on the Supernova Photometric Classification Challenge (SPCC) dataset. Starting from a reduced 16-feature model, we perform a systematic feature ablation study to determine which physical descriptors contribute most strongly to classification performance. The final compact model achieves an F1-score of approximately 0.844 and a precision-recall area under the curve (PR-AUC) of approximately 0.928. The ablation results show that temporal evolution provides the dominant classification signal, while brightness, color, and variability features supply complementary information. A reduced core of approximately ten physically meaningful features retains nearly the full performance of the compact model, indicating that reliable classification does not require large high-dimensional feature spaces. These results show that interpretable feature-based models can capture the essential astrophysical information needed for Type Ia photometric classification, with direct implications for survey cadence, filter coverage, and the design of transparent machine learning pipelines for future time-domain surveys.

Paper Structure (26 sections, 1 equation, 5 figures, 1 table)

Figures (5)

  • Figure 1: SHAP summary plot for the compact feature model. Temporal features and red-band flux statistics provide the largest contribution to the classification decision, indicating that the multi-band time evolution of the light curve is the dominant source of discriminating information.
  • Figure 2: Precision-recall curve for the compact feature model. The model maintains high precision over a wide range of recall, indicating reliable identification of Type Ia supernovae.
  • Figure 3: Change in F1-score when groups of physically related features are removed. Temporal features cause the largest performance drop, followed by brightness, color, and variability. This indicates that the time evolution of the light curve is the dominant physical signature used by the classifier.
  • Figure 4: Classification performance as progressively larger subsets of physically motivated features are included. Brightness alone provides a first-order separation between classes, while the largest improvement occurs when temporal descriptors are added, indicating that multi-band time evolution provides the dominant discriminating information.
  • Figure 5: Trade-off between the number of features and classification performance. A reduced feature set containing only the most important variables retains nearly the full performance of the compact model, indicating that photometric classification depends on a small number of physically meaningful descriptors.
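
The group-ablation procedure summarized in the abstract and in Figure 3 can be illustrated with a minimal sketch. This is not the paper's pipeline: it assumes scikit-learn's GradientBoostingClassifier as a stand-in for the gradient-boosted decision trees, uses synthetic data in place of the SPCC light-curve features, and assigns columns to the four feature groups purely for illustration. The helper names fit_and_score and group_ablation are hypothetical. PR-AUC is approximated here with average precision.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import average_precision_score, f1_score
from sklearn.model_selection import train_test_split


def fit_and_score(X_train, X_test, y_train, y_test, seed=0):
    """Train a gradient-boosted classifier; return (F1, PR-AUC) on the test split."""
    model = GradientBoostingClassifier(n_estimators=50, random_state=seed)
    model.fit(X_train, y_train)
    f1 = f1_score(y_test, model.predict(X_test))
    # Average precision is a step-wise summary of the precision-recall curve.
    pr_auc = average_precision_score(y_test, model.predict_proba(X_test)[:, 1])
    return f1, pr_auc


def group_ablation(X, y, groups, seed=0):
    """Return baseline (F1, PR-AUC) and the F1 drop when each feature group is removed."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    baseline = fit_and_score(X_tr, X_te, y_tr, y_te, seed)
    drops = {}
    for name, cols in groups.items():
        # Retrain on every column except those belonging to the removed group.
        keep = [i for i in range(X.shape[1]) if i not in cols]
        f1, _ = fit_and_score(X_tr[:, keep], X_te[:, keep], y_tr, y_te, seed)
        drops[name] = baseline[0] - f1  # positive value: removing the group hurts
    return baseline, drops


# Synthetic stand-in data; NOT the SPCC light-curve features.
X, y = make_classification(n_samples=600, n_features=8, n_informative=5, random_state=0)
# Hypothetical assignment of columns to the four feature groups named in the text.
groups = {"temporal": [0, 1], "brightness": [2, 3], "color": [4, 5], "variability": [6, 7]}

baseline, drops = group_ablation(X, y, groups)
print("baseline F1 = %.3f, PR-AUC = %.3f" % baseline)
for name, delta in sorted(drops.items(), key=lambda kv: -kv[1]):
    print(f"{name:12s} delta F1 = {delta:+.3f}")
```

On real features, the group with the largest positive F1 drop would be read as the dominant signal, which is how Figure 3 ranks temporal features first. On the synthetic data above the ranking carries no physical meaning.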