MALLORN: Many Artificial LSST Lightcurves based on Observations of Real Nuclear transients
Dylan Magill, Matt Nicholl, Vysakh Anilkumar, Sjoert van Velzen, Xinyue Sheng, Thai Son Mai, Hung Viet Tran, Ngoc Phu Doan, Thomas Moore, Shubham Srivastav, David R. Young, Charlotte R. Angus, Joshua Weston
TL;DR
MALLORN is built with a data-driven pipeline that synthesizes LSST-like lightcurves from real ZTF nuclear transients, enabling photometric classification of rare events such as tidal disruption events (TDEs) in the era of LSST. The workflow combines Gaussian process interpolation, luminosity rescaling to LSST depth, SNCosmo-based color corrections across six bands, and embedding in the Rubin cadence to produce a large, labeled, six-band time-domain dataset suitable for classifier challenges. The authors release a Kaggle challenge with 10,178 lightcurves (ground-truth and test sets) to benchmark photometric TDE classifiers, and they use their simulated population to analyze how LSST cadence and band choices affect TDE detectability. This work provides a replicable framework for generating survey-specific transient simulations and offers practical insights for early classifier development ahead of LSST operations.
Abstract
The Vera C. Rubin Observatory's 10-year Legacy Survey of Space and Time (LSST) is expected to produce a hundredfold increase in the number of transients we observe. However, spectroscopic resources are insufficient to follow up all of the targets that LSST will provide. It is therefore necessary to prioritise objects for follow-up observations or inclusion in sample studies based purely on their LSST photometry. We are particularly keen to identify tidal disruption events (TDEs) with LSST, as TDEs are immensely useful for determining black hole parameters and probing our understanding of accretion physics. To assist in these efforts, we present the Many Artificial LSST Lightcurves based on the Observations of Real Nuclear transients (MALLORN) data set and the corresponding classifier challenge for identifying TDEs. MALLORN comprises 10178 simulated LSST light curves, constructed from real Zwicky Transient Facility (ZTF) observations of 64 TDEs, 727 nuclear supernovae and 1407 active galactic nuclei (AGN) with spectroscopic labels. The simulations combine Gaussian process fitting, empirically motivated spectral energy distributions from SNCosmo and the baseline from the Rubin Survey Simulator. Our novel approach can be easily adapted to simulate transients for any photometric survey using observations from another, requiring only the limiting magnitudes and an estimate of the cadence of observations. The MALLORN Astronomical Classification Challenge, launched on Kaggle on 15 October 2025, will allow competitors to test their photometric classifiers on simulated LSST data to find TDEs and improve their capabilities prior to the start of LSST.
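
The idea of transferring a light curve from one survey to another, given only a limiting magnitude and a cadence, can be illustrated with a minimal single-band sketch. This is not the MALLORN pipeline itself: the function names, the squared-exponential kernel, the fixed length scale, the zeropoint of 27.5, the `flux_scale` dimming factor and the sky-limited noise model are all assumptions made for illustration, and the SNCosmo-based colour corrections are omitted entirely.

```python
import numpy as np


def gp_predict(t_obs, y_obs, y_err, t_new, length_scale=20.0, amplitude=None):
    """Posterior mean of a Gaussian process with a squared-exponential kernel.

    The kernel and its hyperparameters are illustrative assumptions, not the
    choices made for MALLORN.
    """
    t_obs, y_obs, y_err, t_new = (
        np.asarray(a, dtype=float) for a in (t_obs, y_obs, y_err, t_new)
    )
    if amplitude is None:
        amplitude = np.std(y_obs)  # crude scale estimate from the data

    def kernel(a, b):
        d = a[:, None] - b[None, :]
        return amplitude**2 * np.exp(-0.5 * (d / length_scale) ** 2)

    # Use the inverse-variance weighted mean as a constant mean function.
    mean = np.average(y_obs, weights=1.0 / y_err**2)
    K = kernel(t_obs, t_obs) + np.diag(y_err**2)
    alpha = np.linalg.solve(K, y_obs - mean)
    return mean + kernel(t_new, t_obs) @ alpha


def simulate_on_new_survey(t_obs, flux_obs, flux_err, t_target, m_lim,
                           flux_scale=1.0, zeropoint=27.5, seed=0):
    """Resample a source-survey light curve at a target survey's epochs.

    m_lim is the target survey's 5-sigma limiting magnitude (scalar or one
    value per epoch). flux_scale is a hypothetical factor that dims the
    source, e.g. to mimic a more distant event. Noise is assumed to be
    Gaussian and set only by the limiting magnitude.
    """
    rng = np.random.default_rng(seed)
    t_target = np.asarray(t_target, dtype=float)
    model = flux_scale * gp_predict(t_obs, flux_obs, flux_err, t_target)

    # Flux corresponding to a 5-sigma detection, hence the 1-sigma noise.
    flux_lim = 10.0 ** (-0.4 * (np.asarray(m_lim, dtype=float) - zeropoint))
    sigma = np.broadcast_to(flux_lim / 5.0, t_target.shape)

    flux_sim = model + rng.normal(0.0, sigma)
    detected = flux_sim >= 5.0 * sigma
    return flux_sim, np.array(sigma), detected


# Toy usage: a Gaussian-shaped flare sampled every 3 days in the source
# survey, re-observed every 7 days by a target survey with m_lim = 24.
t_src = np.arange(0.0, 100.0, 3.0)
flux_src = 100.0 * np.exp(-0.5 * ((t_src - 50.0) / 15.0) ** 2)
err_src = np.full_like(t_src, 1.0)
t_tgt = np.arange(0.0, 100.0, 7.0)
flux_sim, flux_sim_err, detected = simulate_on_new_survey(
    t_src, flux_src, err_src, t_tgt, m_lim=24.0
)
```

Working in flux rather than magnitude keeps the noise model Gaussian and lets non-detections take values near zero or below. In a realistic setting the target epochs and per-epoch limiting magnitudes would come from a survey simulation rather than a regular grid.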
