HOLISMOKES XVIII: Detecting strongly lensed SNe Ia from time series of multi-band LSST-like imaging data

Satadru Bag, Raoul Canameras, Sherry H. Suyu, Stefan Schuldt, Stefan Taubenberger, Irham Taufik Andika, Alejandra Melo

TL;DR

This work tackles the challenge of detecting strongly lensed SNe Ia (LSNe Ia) from time-series, multi-band imaging to enable timely follow-up observations. It introduces a ConvLSTM2D-based pipeline that processes 2D image cutouts across bands and epochs, with a dual-branch architecture that also incorporates temporal context via an LSTM on timestamps. The authors build a realistic training set from HSC data by injecting mock LSNe Ia into LRGs and augmenting negatives from HSC variable sources and simulated unlensed SNe Ia in galaxies, carefully matching cadence and depth to the observations. Results show rapid performance gains, achieving a TPR of over 60% at a false-positive rate of $0.01\%$ by the 7th multi-band observation and over 70% by the 9th, with ROC AUC near unity, and demonstrate clear advantages of multi-band over single-band inputs for early LSNe Ia identification in LSST-like surveys.
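As a rough illustration of the dual-branch idea described above, the sketch below wires a ConvLSTM2D branch (image cutouts) and an LSTM branch (timestamps) into one per-epoch sigmoid output trained with binary cross-entropy. It assumes TensorFlow/Keras, which provides a layer named `ConvLSTM2D`; whether the authors used this library is not stated here. The cutout size, number of epochs, filter counts, pooling step, and hidden-layer width are all placeholder assumptions, not values from the paper.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers


def build_dual_branch_model(n_obs=9, cutout_px=32, n_bands=4):
    """Toy dual-branch classifier; every layer size here is an illustrative assumption."""
    # Image branch: a sequence of multi-band cutouts, shape (epochs, H, W, bands).
    cutouts = keras.Input(shape=(n_obs, cutout_px, cutout_px, n_bands), name="cutouts")
    x = layers.ConvLSTM2D(filters=8, kernel_size=3, padding="same",
                          return_sequences=True)(cutouts)
    # Collapse the spatial axes at every epoch: one feature vector per epoch.
    x = layers.TimeDistributed(layers.GlobalAveragePooling2D())(x)

    # Time branch: one timestamp (days since first detection) per epoch.
    times = keras.Input(shape=(n_obs, 1), name="timestamps")
    t = layers.LSTM(8, return_sequences=True)(times)

    # Merge the two branches epoch by epoch, then score each epoch.
    merged = layers.Concatenate(axis=-1)([x, t])
    hidden = layers.Dense(16, activation="relu")(merged)
    prob = layers.Dense(1, activation="sigmoid", name="p_lensed")(hidden)

    model = keras.Model(inputs=[cutouts, times], outputs=prob)
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model


model = build_dual_branch_model()

# Random stand-in data: 2 samples, 9 epochs, 32x32 cutouts, 4 bands.
rng = np.random.default_rng(0)
fake_cutouts = rng.normal(size=(2, 9, 32, 32, 4)).astype("float32")
fake_times = np.tile(np.arange(9, dtype="float32")[None, :, None], (2, 1, 1))

probs = model.predict([fake_cutouts, fake_times], verbose=0)
print(probs.shape)  # one probability per sample and per epoch
```

Because both recurrent layers run forward in time only, the output at epoch $k$ depends only on observations up to epoch $k$, which is one way to obtain a prediction after each new observation.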

Abstract

Strong gravitationally lensed supernovae (LSNe), though rare, are exceptionally valuable probes for cosmology and astrophysics. Upcoming time-domain surveys like the Vera Rubin Observatory's Legacy Survey of Space and Time (LSST) offer a major opportunity to discover them in large numbers. Early identification is crucial for timely follow-up observations. We develop a deep learning pipeline to detect LSNe using multi-band, multi-epoch image cutouts. Our model is based on a 2D convolutional long short-term memory (ConvLSTM2D) architecture, designed to capture both spatial and temporal correlations in time-series imaging data. Predictions are made after each observation in the time series, with accuracy improving as more data arrive. We train the model on realistic simulations derived from Hyper Suprime-Cam (HSC) data, which closely match LSST in depth and filters. This work focuses exclusively on Type Ia supernovae (SNe Ia). LSNe Ia are injected onto HSC luminous red galaxies (LRGs) at various phases of evolution to create positive examples. Negative examples include variable sources from the HSC Transient Survey (including unclassified transients) and simulated unlensed SNe Ia in LRGs and spiral galaxies. Our multi-band model shows rapid classification improvements during the initial few observations and quickly reaches high detection efficiency: at a fixed false-positive rate (FPR) of $0.01\%$, the true-positive rate (TPR) reaches $\gtrsim 60\%$ by the 7th observation and $\gtrsim 70\%$ by the 9th. Among the negative examples, SNe in LRGs remain the primary source of false positives, as they can resemble their lensed counterparts under certain conditions. The model detects quads more effectively than doubles and performs better on systems with larger image separations. Although trained and tested on HSC-like data, our approach applies to any cadenced imaging survey, particularly LSST.
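The "TPR at a fixed FPR" metric quoted above can be made concrete with a small sketch. Here the threshold is placed on the classifier score so that at most the target fraction of negatives lies above it, and the TPR is the fraction of positives above that threshold. The function name and the synthetic beta-distributed scores are assumptions for illustration only; they are not the paper's data or code.

```python
import math
import random


def tpr_at_fixed_fpr(neg_scores, pos_scores, target_fpr):
    """Return (threshold, tpr, achieved_fpr) with achieved_fpr <= target_fpr.

    A sample is called positive when its score is strictly above the threshold.
    """
    neg_sorted = sorted(neg_scores, reverse=True)
    # Largest number of false positives allowed (small epsilon guards float rounding).
    max_fp = int(target_fpr * len(neg_sorted) + 1e-9)
    # Threshold sits at the (max_fp + 1)-th highest negative score.
    threshold = neg_sorted[max_fp] if max_fp < len(neg_sorted) else -math.inf
    fp = sum(score > threshold for score in neg_scores)
    tp = sum(score > threshold for score in pos_scores)
    return threshold, tp / len(pos_scores), fp / len(neg_scores)


# Synthetic scores only: negatives cluster near 0, positives near 1.
random.seed(0)
negatives = [random.betavariate(1, 20) for _ in range(100_000)]
positives = [random.betavariate(5, 1) for _ in range(2_000)]

threshold, tpr, fpr = tpr_at_fixed_fpr(negatives, positives, target_fpr=1e-4)
print(f"threshold={threshold:.3f}  TPR={tpr:.3f}  achieved FPR={fpr:.5f}")
```

With $N$ negative examples the smallest nonzero FPR that can be resolved is $1/N$, so an FPR of $0.01\%$ requires at least $10^4$ negatives in the test set for even one false positive to be allowed.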

Paper Structure

This paper contains 30 sections, 4 equations, 17 figures, 2 tables.

Figures (17)

  • Figure 1: Observation epochs of the HSC Transient Survey in the COSMOS field across different filters during MJD 57710–57875 (Nov. 2016–Apr. 2017), used to extract image time series of variable sources (HSC variables) with the highest available cadence. The numbers of observations in the $g$, $r$, $i$, and $z$ bands are 5, 9, 12, and 12, respectively. The observation details are taken from Table 1 of Yasuda et al. (2019).
  • Figure 2: Einstein radius ($\theta_\text{E}$) distributions. The blue histogram shows the $\sim$93,000 lens–source pairs with a flat $\theta_\text{E}$ distribution between $0.1\arcsec$ and $2.0\arcsec$. After removing artifacts, $\sim$80,000 clean systems (orange) remain, still following a flat $\theta_\text{E}$ distribution. The green histogram shows the final $\sim$50,000 mock LSNe Ia passing our brightness selection and restricted to $\theta_\text{E} > 0.5\arcsec$, which delimits the region of interest in this study. Of these, $\sim$48,000 are used in the training, validation, and test sets with a balanced number of doubles and quads.
  • Figure 3: Example $i$-band time series of the different components used for training: mock LSNe Ia (top row, forming the positive class), and variable sources and normal SNe Ia (middle and bottom rows, forming the negative class). Each row shows two representative samples. The top row includes a quadruply lensed SN Ia (left) and a doubly lensed one (right). The middle row shows two unrelated HSC variables, whose central objects vary over time. The bottom row shows an SN Ia in an LRG (left) and in a spiral galaxy (right). Note that only the time series of the HSC variables in the middle row include PSF variation over time. Timestamps in each frame indicate days since the first detection, which is always set to zero. The classification task is binary: distinguishing LSNe Ia from all other types of transients and bogus detections. For illustration, we show only single-band ($i$-band) time series here. The multi-band time series corresponding to the quadruply lensed SN Ia shown in the top-left panel is presented in Figure 5.
  • Figure 4: Architecture of our deep learning model. At each observation epoch, the image cutout is processed through a ConvLSTM2D channel, while the corresponding time value is fed into a standard LSTM channel. The outputs from both channels are then concatenated and passed through a dense layer. The final layer, consisting of a single node with sigmoid activation, outputs a value $\mathcal{P} \in (0,1)$, representing the probability that the sample is lensed at that particular observation epoch. We use the binary cross-entropy loss function for binary classification. Note that the number of bands is $N_{\rm B}=4$ for the multi-band analysis and $N_{\rm B}=1$ for the single-band analyses.
  • Figure 5: Example multi-band time series of a quadruply lensed SN Ia, the same system whose $i$-band time series is shown in the top-left panel of Figure 3. At each observation epoch ($N_\text{obs,m-b}$), data are available in only one band; missing-band frames are zero-padded to maintain a uniform input format. The timestamp corresponding to each $N_\text{obs,m-b}$ is shown above the top frames in parentheses (in days, relative to the first detection, which is always set to zero). The band-specific cumulative observation count ($N_{\text{obs},X}$, where $X \in \{g, r, i, z\}$) is also marked within each observed frame. In the multi-band analysis, the full time series is input as a 4-channel sequence to the multi-band model, while in the single-band case, the observation sequence in the respective band is provided to the corresponding single-band model without any padding. Note that we follow the HSC Transient Survey cadence here, and the multi-band time series for other classes (HSC variables, SNe in galaxies) have statistically similar time sampling.
  • ...and 12 more figures
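The zero-padded input format described in the Figure 5 caption can be sketched as follows, assuming NumPy. Each epoch carries data in a single band, so the other three channels of that frame stay zero. Timestamps are measured in days from the first observation, and a running per-band count is kept. The function names, the tuple layout of `observations`, the 8-pixel cutouts, and the MJD values are hypothetical choices made for this example.

```python
import numpy as np

BANDS = ("g", "r", "i", "z")


def build_multiband_sequence(observations, cutout_px):
    """Build a zero-padded 4-channel sequence.

    observations: list of (mjd, band, 2D cutout) tuples, one band per epoch.
    Returns (frames, days, counts), where counts[k] = (band, cumulative count in that band).
    """
    observations = sorted(observations, key=lambda obs: obs[0])
    n_obs = len(observations)
    frames = np.zeros((n_obs, cutout_px, cutout_px, len(BANDS)), dtype=np.float32)
    days = np.zeros(n_obs, dtype=np.float32)
    running = {band: 0 for band in BANDS}
    counts = []
    first_mjd = observations[0][0]
    for k, (mjd, band, cutout) in enumerate(observations):
        # Only the observed band is filled; the other channels remain zero.
        frames[k, :, :, BANDS.index(band)] = cutout
        days[k] = mjd - first_mjd
        running[band] += 1
        counts.append((band, running[band]))
    return frames, days, counts


def single_band_sequence(frames, counts, band):
    """Select the epochs observed in one band; no padding is needed."""
    keep = [k for k, (b, _) in enumerate(counts) if b == band]
    return frames[keep][..., BANDS.index(band)]


# Made-up epochs and constant-valued cutouts, purely for illustration.
obs = [
    (57715.0, "i", np.full((8, 8), 1.0)),
    (57718.0, "z", np.full((8, 8), 2.0)),
    (57720.0, "i", np.full((8, 8), 3.0)),
    (57724.0, "g", np.full((8, 8), 4.0)),
]
frames, days, counts = build_multiband_sequence(obs, cutout_px=8)
i_only = single_band_sequence(frames, counts, "i")
print(frames.shape, days.tolist(), counts, i_only.shape)
```

Zero-padding keeps every epoch at the same tensor shape, so the multi-band model can consume the sequence in observation order regardless of which filter was used on a given night.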