Classification of Transient Astronomical Object Light Curves Using LSTM Neural Networks
Guilherme Grancho D. Fernandes, Marco A. Barroca, Mateus dos Santos, Rafael S. Oliveira
TL;DR
The paper addresses automatic classification of transient astronomical light curves from the PLAsTiCC dataset using a bidirectional LSTM. After collapsing the 14 original classes into five generalized categories and applying padding, temporal rescaling, and per-object flux normalization, the model achieves strong ROC AUC for the S-Like (0.95) and Periodic (0.99) classes but weaker performance for Long (0.68) and Non-Periodic (0.40), with confusion between Periodic and Non-Periodic. Partial light-curve experiments show substantial degradation when the data are restricted to the first days after detection, highlighting the challenges posed by class imbalance and limited temporal information. The work points to potential remedies such as class balancing, detection-focused preprocessing, and attention-based or transformer architectures that may better capture long-range dependencies and separate periodic from non-periodic signals, with relevance for future time-series analyses in large astronomical surveys.
Abstract
This study presents a bidirectional Long Short-Term Memory (LSTM) neural network for classifying transient astronomical object light curves from the Photometric LSST Astronomical Time-series Classification Challenge (PLAsTiCC) dataset. The original fourteen object classes were reorganized into five generalized categories (S-Like, Fast, Long, Periodic, and Non-Periodic) to address class imbalance. After preprocessing with padding, temporal rescaling, and flux normalization, a bidirectional LSTM network with masking layers was trained and evaluated on a test set of 19,920 objects. The model achieved strong performance for the S-Like and Periodic classes, with ROC area under the curve (AUC) values of 0.95 and 0.99, and Precision-Recall AUC values of 0.98 and 0.89, respectively. However, performance was significantly lower for the Fast and Long classes (ROC AUC of 0.68 for the Long class), and the model had difficulty distinguishing between Periodic and Non-Periodic objects. Evaluation on partial light curve data (5, 10, and 20 days from detection) revealed substantial performance degradation, with increased misclassification toward the S-Like class. These findings indicate that class imbalance and limited temporal information are the primary limitations, suggesting that class balancing strategies and preprocessing techniques focused on the moment of detection could improve performance.
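The preprocessing steps named in the abstract (padding, temporal rescaling, per-object flux normalization) and the partial light-curve cut can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption: the function names, the sentinel PAD_VALUE, min-max rescaling of time to [0, 1], division of flux by the object's peak absolute flux, and the definition of the cut as "keep all observations up to detection_time + days" are hypothetical choices, not the authors' implementation.

```python
# Hypothetical sketch of the preprocessing described in the abstract.
# All names and formulas below are illustrative assumptions.

PAD_VALUE = -999.0  # assumed sentinel that a masking layer could be told to skip


def truncate_after_detection(times, fluxes, detection_time, days):
    """Keep observations taken no later than `days` after the detection time."""
    cutoff = detection_time + days
    kept = [(t, f) for t, f in zip(times, fluxes) if t <= cutoff]
    return [t for t, _ in kept], [f for _, f in kept]


def preprocess_light_curve(times, fluxes, max_len, pad_value=PAD_VALUE):
    """Return max_len rows of [rescaled_time, normalized_flux] for one object."""
    if len(times) != len(fluxes):
        raise ValueError("times and fluxes must have the same length")

    # Truncate sequences longer than the fixed length expected by the network.
    times = list(times)[:max_len]
    fluxes = list(fluxes)[:max_len]
    if not times:
        return [[pad_value, pad_value] for _ in range(max_len)]

    # Temporal rescaling (assumed): map observation times onto [0, 1].
    t0 = times[0]
    span = (times[-1] - t0) or 1.0  # avoid dividing by zero for one observation
    scaled_times = [(t - t0) / span for t in times]

    # Per-object flux normalization (assumed): divide by the peak absolute flux.
    peak = max(abs(f) for f in fluxes) or 1.0
    scaled_fluxes = [f / peak for f in fluxes]

    rows = [[t, f] for t, f in zip(scaled_times, scaled_fluxes)]

    # Padding: append sentinel rows until every object has max_len timesteps.
    rows += [[pad_value, pad_value] for _ in range(max_len - len(rows))]
    return rows


if __name__ == "__main__":
    t, f = truncate_after_detection([0, 3, 7, 12], [1.0, 2.0, 3.0, 4.0],
                                    detection_time=3, days=5)
    for row in preprocess_light_curve(t, f, max_len=5):
        print(row)
```

In this sketch the padded rows carry a sentinel well outside the normalized range, so a masking layer configured with the same value could ignore them without discarding real observations at time 0 or flux 0.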
