Multiband neural network classification of ZTF light curves as LSST proxies

Tamás Szklenár, Attila Bódi, Róbert Szabó

TL;DR

This work develops a multiband neural network to classify periodic variable stars using phase-folded light curves from ZTF DR17 as a proxy for LSST data, incorporating the stars' periods as numerical inputs. The model uses two parallel CNNs for the $g$ and $r$ bands, whose image-based outputs are fused with a period-derived pathway to perform joint classification, achieving up to about 99% accuracy on several classes in testing. Data from OGLE-IV, Gaia DR3, and ZTF are cross-matched and augmented to balance five main variable-star types, with period estimation conducted in both single and multiband modes; results show multiband inputs significantly improve performance, particularly for Cepheids and long-period variables. The study demonstrates the feasibility of LSST-scale classification pipelines before LSST data arrive and outlines extensions to incorporate more bands (e.g., $u$, $g$, $r$, $i$, $z$, $y$) and multiband period-search techniques, aided by Gaia-based cross-matches and DP1-era data. The approach yields high-precision, scalable classifications essential for time-domain astronomy in upcoming large surveys.

Abstract

In this project we use data obtained by the Zwicky Transient Facility (ZTF) to develop and test a neural-network-based, multiband algorithm for classifying periodic variable stars (i.e. pulsating variable stars and eclipsing binaries). The aim is to apply the algorithm to LSST data once they become available. Phase-folded light curve images and period information were used for five variable star types: Classical and Type II Cepheids, δ Scuti stars, eclipsing binaries, and RR Lyrae stars. The data are taken from the 17th data release of ZTF, from which we used two passbands, g and r. The periods were calculated from the raw data and used as an additional numerical input to the neural network. For training and testing we created a supervised machine learning method; the neural network contains Convolutional Neural Networks concatenated with Fully Connected Layers. During the training-validation process the training accuracy reached 99% and the validation accuracy peaked at 95.6%. In the test classification phase, three of the five variable star types were classified with around 99% accuracy; the other two also reached high accuracy, 89.6% and 93.6%.
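
The phase folding behind the light curve images mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `phase_fold`, the choice of the earliest observation as reference epoch, and the synthetic sinusoidal data are all assumptions made for illustration. The 5.09-day period is borrowed from the Cepheid in the Figure 1 caption purely as a plausible value.

```python
import math

def phase_fold(times, mags, period, t0=None):
    """Fold a light curve on `period`; return (phase, mag) pairs sorted by phase.

    Hypothetical helper for illustration. Phases fall in [0, 1).
    """
    if period <= 0:
        raise ValueError("period must be positive")
    if t0 is None:
        t0 = min(times)  # assumed reference epoch: earliest observation
    folded = []
    for t, m in zip(times, mags):
        phase = ((t - t0) / period) % 1.0
        if phase >= 1.0:  # guard against floating-point round-up to 1.0
            phase = 0.0
        folded.append((phase, m))
    folded.sort(key=lambda pair: pair[0])
    return folded

# Synthetic, sparsely sampled sinusoid (illustrative values, not survey data).
period = 5.09  # days
times = [0.0, 3.7, 11.2, 19.9, 42.5, 77.1, 130.4, 365.3]
mags = [15.0 + 0.3 * math.sin(2 * math.pi * t / period) for t in times]

folded = phase_fold(times, mags, period)
for phase, mag in folded:
    print(f"{phase:.3f}  {mag:.3f}")
```

Folding maps observations taken many cycles apart onto a single cycle, which is why sparse survey sampling can still trace the shape of a periodic light curve once the period is known. Plotting the folded points in each passband would give images of the kind the classifier takes as input.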

Paper Structure

This paper contains 21 sections, 8 figures, 4 tables.

Figures (8)

  • Figure 1: Visualization of the difference between sparse and dense sampling of photometric light curves. We chose a classical Cepheid for this comparison: OGLE-GD-CEP-0044, a relatively bright Cepheid with a period of 5.09 days. The upper graph shows OGLE-IV data for a 3-year-long observation window, which has $130$ data points. The middle graph shows observational data from the ZTF survey, also in a 3-year-long observation window, with $119$ data points. The third graph is a 21.77-day-long observation from the TESS mission, containing 14 898 measurements. The time span covered by the TESS observations is marked in the middle graph by two vertical dashed lines.
  • Figure 2: Example gallery of the phase-folded light curve images. The five columns represent the five main variable star types used in this research: Classical Cepheids, $\delta$ Scuti stars, eclipsing binaries, RR Lyrae stars, and Type II Cepheids. The top row shows $g$-band data and the bottom row shows $r$-band data for the example star of each type.
  • Figure 3: Architecture of our neural network, which classifies image-based, phase-folded light curves in multiple (in this case two) passbands with additional numerical input (in this case the periods) measured in each passband. From left to right: two light curves of the same variable star in two different passbands, $g$ and $r$, are the image inputs of the neural network. These images are processed by two identical Convolutional Neural Networks (CNNs), whose outputs are concatenated to make a classification based only on the image (light curve) information. The additional numerical inputs are the passband-dependent periods of the given variable star, which are processed in a much simpler Fully-connected Neural Network. The outputs of these two pathways are concatenated at the end to produce the final classification result.
  • Figure 4: History curves of the neural network show how the training and validation accuracy evolved during the training and validation phases.
  • Figure 5: History curves of the neural network show how the training and validation loss evolved during the training and validation phases.
  • ...and 3 more figures