SELDON: Supernova Explosions Learned by Deep ODE Networks

Jiezhong Wu; Jack O'Brien; Jennifer Li; M. S. Krafczyk; Ved G. Shah; Amanda R. Wasserman; Daniel W. Apley; Gautham Narayan; Noelle I. Samia

TL;DR

SELDON is a new continuous-time variational autoencoder for panels of sparse and irregularly time-sampled astrophysical light curves that are nonstationary, heteroscedastic, and inherently dependent, and offers a generic recipe for interpretable and continuous-time sequence modeling in any time domain.

Abstract

The discovery rate of optical transients will explode to 10 million public alerts per night once the Vera C. Rubin Observatory's Legacy Survey of Space and Time comes online, overwhelming the traditional physics-based inference pipelines. A continuous-time forecasting AI model is of interest because it can deliver millisecond-scale inference for thousands of objects per day, whereas legacy MCMC codes need hours per object. In this paper, we propose SELDON, a new continuous-time variational autoencoder for panels of sparse and irregularly time-sampled (gappy) astrophysical light curves that are nonstationary, heteroscedastic, and inherently dependent. SELDON combines a masked GRU-ODE encoder with a latent neural ODE propagator and an interpretable Gaussian-basis decoder. The encoder learns to summarize panels of imbalanced and correlated data even when only a handful of points are observed. The neural ODE then integrates this hidden state forward in continuous time, extrapolating to future unseen epochs. This extrapolated time series is further encoded by deep sets to a latent distribution that is decoded to a weighted sum of Gaussian basis functions, the parameters of which are physically meaningful. Such parameters (e.g., rise time, decay rate, peak flux) directly drive downstream prioritization of spectroscopic follow-up for astrophysical surveys. Beyond astronomy, the architecture of SELDON offers a generic recipe for interpretable and continuous-time sequence modeling in any time domain where data are multivariate, sparse, heteroscedastic, and irregularly spaced.
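As a minimal sketch of the decoding idea described above, a light curve in one band can be written as a weighted sum of Gaussian basis functions, $f(t) = \sum_k w_k \exp\!\left(-\tfrac{1}{2}\left((t-\mu_k)/\sigma_k\right)^2\right)$. The function name `gaussian_basis_flux`, the parameter names, and all numerical values below are illustrative assumptions; the abstract does not specify SELDON's exact parameterization, number of basis functions, or units.

```python
import math


def gaussian_basis_flux(t, weights, centers, widths):
    """Evaluate f(t) = sum_k w_k * exp(-0.5 * ((t - mu_k) / sigma_k) ** 2).

    Hypothetical helper: a generic Gaussian-basis expansion, not the
    paper's exact decoder.
    """
    return sum(
        w * math.exp(-0.5 * ((t - mu) / sigma) ** 2)
        for w, mu, sigma in zip(weights, centers, widths)
    )


# Assumed, illustrative parameters for a single filter band.
weights = [1.0, 0.4]    # amplitudes (arbitrary flux units)
centers = [0.0, 15.0]   # centers in days relative to an arbitrary epoch
widths = [5.0, 20.0]    # widths in days

# The expansion is a continuous function of time, so it can be queried
# on any grid, including epochs after the last observation.
grid = [float(day) for day in range(-20, 61)]
flux = [gaussian_basis_flux(t, weights, centers, widths) for t in grid]

# Summary quantities such as the peak can be read off the decoded curve.
peak_index = max(range(len(grid)), key=flux.__getitem__)
peak_time, peak_flux = grid[peak_index], flux[peak_index]
```

In a form like this, each basis function has an amplitude, a location, and a width, which is one way such parameters can relate to quantities like peak flux, rise time, and decay rate.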

Paper Structure (21 sections, 14 equations, 4 figures, 2 tables)

Introduction
Methods
Data
Preprocessing
Augmentation
Architecture
Embedding
Encoder
Gated Recurrent Unit
Deep Sets
GRU-ODE Encoder with Deep Sets
Latent Neural-ODE Solver
Decoder
Learnable Inverse Transform Mapping
Loss
...and 6 more sections

Figures (4)

Figure 1: Architecture of our proposed SELDON, a customized VAE with band-aware GRU-ODE encoder and interpretable Gaussian-basis decoder. A light curve described by a series of flux observations in various filter bands is encoded to an initial hidden state with the GRU-ODE. The hidden state is evolved with the neural ODE forward in time to form a trajectory on a regularly-sampled grid. This trajectory is then interpreted by a Deep Sets layer to an approximate posterior latent vector. The latent vector is then decoded into a series of basis function parameters representing the history and future evolution of the light curve at all times in all filter bands.
Figure 2: An illustration of a light curve observed over time across six bands indicated in distinct colors, where the total number of observations is in the $99^\text{th}$ percentile of all light curves. The error bars for each observation represent the observed flux errors.
Figure 3: Out-of-sample forecasting performance as a function of the fraction of the light curve that has been observed. Left: mean absolute $Z$-score ($\mathrm{mean}\,|Z|$). Center: worst-case absolute $Z$-score ($\mathrm{max}\,|Z|$, log scale). Right: normalized RMSE. Lower is better in all panels. SELDON (GRU-ODE, green) consistently produces the lowest tail and aggregate errors. A plain masked GRU (orange) has the best median at $10\%$ observed but is outperformed by SELDON at larger observed fractions. Deep Sets (blue) shows competitive medians but the heaviest tails.
Figure 4: Violin plots of the signed standardized residuals for out-of-sample forecasts per fraction observed, in each of the three models. These residuals are clipped between $\pm 5$ for visual clarity.
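
The residual summaries named in Figures 3 and 4 can be sketched as follows. This is a hedged illustration, assuming the standardized residual is $Z = (\hat{f} - f)/\sigma$ with $\hat{f}$ the forecast, $f$ the observed flux, and $\sigma$ the observed flux error; the sign convention, the function names, and the sample numbers are assumptions. The normalized RMSE is left out because its normalization is not stated in the captions.

```python
def standardized_residuals(predicted, observed, sigma):
    """Signed Z-scores, assuming Z = (predicted - observed) / sigma."""
    return [(p - o) / s for p, o, s in zip(predicted, observed, sigma)]


def summarize_residuals(z, clip=5.0):
    """Return mean |Z|, max |Z|, and residuals clipped to [-clip, clip]."""
    abs_z = [abs(value) for value in z]
    return {
        "mean_abs_z": sum(abs_z) / len(abs_z),
        "max_abs_z": max(abs_z),
        # Clipping mirrors the +/- 5 display range used for the violin plots.
        "clipped": [max(-clip, min(clip, value)) for value in z],
    }


# Made-up forecast, observation, and flux-error values for illustration.
predicted = [1.0, 2.0, 10.0]
observed = [1.5, 2.0, 3.0]
sigma = [0.5, 1.0, 1.0]

z_scores = standardized_residuals(predicted, observed, sigma)
summary = summarize_residuals(z_scores)
```

The mean of $|Z|$ reflects typical calibration, while the maximum is driven by a single worst point, which is why the two panels can rank models differently.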
