Leveraging pre-trained vision Transformers for multi-band photometric light curve classification
Daniel Moreno-Cartagena, Pavlos Protopapas, Guillermo Cabrera-Vives, Martina Cádiz-Leyton, Ignacio Becker, Cristóbal Donoso-Oliva
TL;DR
This work investigates whether a pre-trained vision Transformer (SwinV2) can classify multi-band photometric light curves without explicit feature extraction. Light curves are converted into RGB images using Grid and Overlay schemes, and the model is then fine-tuned on these images for the MACHO and ELAsTiCC datasets. The SwinV2 model demonstrates competitive performance, often outperforming light-curve–specific models, with notable gains from multi-band inputs and strong results on large, six-band datasets. The study suggests a scalable, generalizable framework for time-domain astronomy that could be extended with metadata and multi-modal learning for future surveys like LSST.
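To make the light-curve-to-image step concrete, here is a minimal Python sketch of one possible reading of an overlay-style rendering, in which every band shares the same axes and is drawn in its own colour channel. The summary above does not specify how the Grid and Overlay schemes are actually implemented, so the function name `overlay_image`, the `size` parameter, the one-band-per-channel mapping, and the three-band limit are all assumptions made for illustration. In particular, this is not a description of how the six ELAsTiCC bands are handled.

```python
def overlay_image(bands, size=32):
    """Rasterize up to three bands onto one shared size x size RGB grid.

    Hypothetical helper, not the authors' implementation.
    bands: list of (times, mags) pairs; band i is drawn in colour channel i.
    Returns image[row][col] = [r, g, b] with values 0 or 255.
    """
    if not 1 <= len(bands) <= 3:
        raise ValueError("this sketch maps one band per RGB channel (1 to 3 bands)")

    # Shared axis limits across all bands, so the curves are directly comparable.
    all_t = [t for times, _ in bands for t in times]
    all_m = [m for _, mags in bands for m in mags]
    t_min, m_min = min(all_t), min(all_m)
    t_span = (max(all_t) - t_min) or 1.0  # avoid division by zero
    m_span = (max(all_m) - m_min) or 1.0

    image = [[[0, 0, 0] for _ in range(size)] for _ in range(size)]
    for channel, (times, mags) in enumerate(bands):
        for t, m in zip(times, mags):
            col = round((t - t_min) / t_span * (size - 1))
            # Row 0 is the top of the image, so the smallest magnitude
            # (the brightest point) ends up at the top.
            row = round((m - m_min) / m_span * (size - 1))
            image[row][col][channel] = 255
    return image


# Toy two-band light curve: (times, magnitudes) per band.
toy_bands = [([0, 1, 2], [10.0, 11.0, 12.0]), ([0, 2], [12.0, 10.0])]
toy_image = overlay_image(toy_bands, size=3)
```

A grid-style variant would presumably give each band its own panel instead of sharing axes. A real pipeline would also need to resize the image to the resolution the pre-trained model expects.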
Abstract
This study investigates the potential of a pre-trained vision Transformer (VT) model, specifically the Swin Transformer V2 (SwinV2), to classify photometric light curves without the need for feature extraction or multi-band preprocessing. The goal is to assess whether this image-based approach can accurately differentiate astronomical phenomena and serve as a viable option for working with multi-band photometric light curves. We transformed each multi-band light curve into an image. These images serve as input to the SwinV2 model, which is pre-trained on ImageNet-21K. The datasets employed include the public Catalog of Variable Stars from the Massive Compact Halo Object (MACHO) survey, using both one and two bands, and the first round of the recent Extended LSST Astronomical Time-Series Classification Challenge (ELAsTiCC), which includes six bands. The performance of the model was evaluated on six classes for the MACHO dataset and on 20 distinct classes of variable stars and transient events for the ELAsTiCC dataset. The fine-tuned SwinV2 achieved better performance than models specifically designed for light curves, such as Astromer and the Astronomical Transformer for Time Series and Tabular Data (ATAT). When trained on the full MACHO dataset with a single band, it attained a macro F1-score of 80.2, outperforming Astromer. Incorporating a second band further improved performance, increasing the macro F1-score to 84.1. On the ELAsTiCC dataset, SwinV2 achieved a macro F1-score of 65.5, surpassing ATAT by 1.3 points.
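Since every result above is reported as a macro F1-score, a short sketch of how that metric is computed may help in reading the numbers. Macro F1 is the unweighted mean of the per-class F1 scores, so rare classes count as much as common ones. The function name `macro_f1` and the toy labels are illustrative assumptions. The function returns a fraction in [0, 1], whereas the scores quoted above appear to be on a 0 to 100 scale.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for c in labels:
        # F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall.
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy three-class example with hypothetical labels.
y_true = ["A", "A", "B", "B", "C", "C"]
y_pred = ["A", "B", "B", "B", "C", "A"]
score = macro_f1(y_true, y_pred)  # per-class F1: A = 0.5, B = 0.8, C = 2/3
```

Because each class contributes equally to the average, a 1.3-point difference on a 20-class problem can come from gains on a handful of minority classes rather than from a uniform improvement.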
