ATCAT: Astronomical Timeseries CAusal Transformer

Zora Tung

TL;DR

ATCAT introduces a lightweight, transformer-based time-series classifier tailored to LSST-like light curves, delivering state-of-the-art F1 on ELAsTiCC with both LC-only and LC+metadata inputs. It advances light-curve encoding, metadata integration, and local-attention transformers, while enabling unsupervised pretraining, early detection, and calibrated outputs. The method performs strongly even with limited labels and offers substantial throughput improvements suitable for large-scale surveys, with practical implications for follow-up prioritization and anomaly detection. The work also provides guidelines for data standardization and calibration, points to future generative opportunities, and lays groundwork for cross-survey applicability and scalable time-domain classification.

Abstract

The Legacy Survey of Space and Time (LSST) at the Vera C. Rubin Observatory will capture light curves (LCs) for 10 billion sources and produce millions of transient candidates per night, necessitating scalable, accurate, and efficient classification. To prepare the community for this scale of data, the Extended LSST Astronomical Time-Series Classification Challenge (ELAsTiCC) sought to simulate a diversity of LSST-like time-domain events. Using a small transformer-based model and refined light-curve encoding logic, we present new state-of-the-art classification performance on ELAsTiCC, with 71.8% F1 on LC-only classification and 89.8% F1 on LC+metadata classification. The previous state of the art was 65.5% F1 for LC-only; for LC+metadata, it was 84% F1 with a different setup and 83.5% F1 with a directly comparable setup. Our model outperforms previous state-of-the-art models for fine-grained early detection at all time cutoffs, which should help prioritize candidate transients for follow-up observations. We demonstrate label-efficient training by removing labels from 90% of the training data (chosen uniformly at random) and compensating with regularization, bootstrap ensembling, and unsupervised pretraining. Even with only 10% of the labeled data, we achieve 67.4% F1 on LC-only and 87.1% F1 on LC+metadata, validating an approach that should help mitigate synthetic and observational data drift and improve classification on tasks with less labeled data. Using reliability diagrams, we find that our base model is poorly calibrated; we correct this at minimal cost to overall performance, enabling selections by classification precision. Finally, our GPU-optimized implementation is 9x faster than other state-of-the-art ELAsTiCC models and can run inference at ~33000 LCs/s on a consumer-grade GPU, making it suitable for large-scale applications and less expensive to train.
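The calibration check mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the binning scheme or the correction method used, so the equal-width bins, the expected calibration error summary, and the function names `reliability_bins` and `expected_calibration_error` below are all assumptions made for illustration, not the paper's implementation.

```python
def reliability_bins(confidences, correct, n_bins=10):
    """Group predictions into equal-width confidence bins (an assumed scheme).

    confidences: predicted probability of the chosen class, each in [0, 1].
    correct: truthy where the chosen class matched the label.
    Returns (mean_confidence, accuracy, count) for every non-empty bin.
    """
    stats = [[0.0, 0, 0] for _ in range(n_bins)]  # [confidence sum, hits, count]
    for conf, hit in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        stats[idx][0] += conf
        stats[idx][1] += int(bool(hit))
        stats[idx][2] += 1
    return [(s / n, h / n, n) for s, h, n in stats if n > 0]


def expected_calibration_error(confidences, correct, n_bins=10):
    """Count-weighted mean gap between accuracy and confidence across bins."""
    bins = reliability_bins(confidences, correct, n_bins)
    total = sum(n for _, _, n in bins)
    return sum(n / total * abs(acc - conf) for conf, acc, n in bins)


if __name__ == "__main__":
    # Hypothetical toy predictions, not ELAsTiCC data.
    confs = [0.95, 0.95, 0.95, 0.95, 0.55, 0.55]
    hits = [1, 1, 0, 0, 1, 0]
    for mean_conf, acc, count in reliability_bins(confs, hits):
        print(f"confidence {mean_conf:.2f}  accuracy {acc:.2f}  n={count}")
    print("ECE:", expected_calibration_error(confs, hits))
```

Plotting each bin's accuracy against its mean confidence gives the reliability diagram; a well-calibrated model lies close to the diagonal, which is what makes it possible to select candidates at a target precision from the predicted probability.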
