Classifying white dwarfs from multi-object spectroscopy surveys with machine learning

James Munday; Pier-Emmanuel Tremblay; Ingrid Pelisoli; Thomas Killestein; Julia Martikainen; David Jones; Antoine Bédard; Paulina Sowicka

TL;DR

The paper develops a hybrid approach to automatically classify white dwarf spectral types by combining DESI DR1 spectra with Pan-STARRS photometry and training a neural network with multi-scale spectral features and a metal-pollution head. It demonstrates near-perfect accuracy for the main DA and DB classes, robust performance across other types, and the utility of UMAP for visualizing class structure and identifying outliers. The authors also exploit multi-epoch data to discover three new double-faced white dwarfs and show how ML and dimensionality-reduction tools can flag binary systems for cleaner population analyses. Collectively, the work showcases scalable techniques for batch white dwarf classification, outlier detection, and time-domain spectroscopy in current and future MOS surveys.

Abstract

With tens to hundreds of white dwarf spectra being taken each night by multi-object spectroscopic surveys, automated spectral classification is essential for efficient data processing. In this study, we design a neural network to classify the spectral type of white dwarfs using a combination of spectra from the Dark Energy Spectroscopic Instrument (DESI) Data Release 1 and Pan-STARRS photometry. The trained network has nearly 100% accuracy at identifying DA and DB white dwarf spectral types, and 85-95% accuracy for all other primary types, including metal pollution. Distinct spectral or photometric features map into separate structures when performing a Uniform Manifold Approximation and Projection (UMAP) dimensionality reduction. Investigating further with multi-epoch spectra, we performed a separate UMAP-based search for objects with strongly changing spectral signatures, discovering three new inhomogeneous surface composition ('double-faced') white dwarfs in the process. Finally, we show how machine learning has the potential to separate single white dwarfs from double white dwarf binary systems in a large dataset, which is ideal for isolating a single-star population. The results from all of these techniques make a compelling case for using machine learning to boost efficiency in analysing white dwarfs observed in multi-object spectroscopy surveys, at times replacing the need for human-driven spectral classification. Our techniques are thus powerful tools for batch population analyses, for finding outliers as a form of rare-subclass detection, and for conducting multi-epoch spectral analyses.
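The multi-epoch search mentioned in the abstract is described in the Figure 3 caption as flagging objects whose individual exposures fall in different labelled boxes of a UMAP plane. A minimal sketch of that box-crossing test follows, assuming each exposure has already been projected to a pair of UMAP coordinates. The box limits, object names, and the choice to ignore exposures that land outside every box are illustrative assumptions, not values or rules taken from the paper.

```python
def box_of(point, boxes):
    """Return the name of the first box containing `point`, or None."""
    x, y = point
    for name, (xmin, xmax, ymin, ymax) in boxes.items():
        if xmin <= x <= xmax and ymin <= y <= ymax:
            return name
    return None


def flag_changes(epochs, boxes):
    """Return {object_id: sorted box names} for objects whose exposures
    fall in at least two different boxes.

    Exposures outside every box are ignored (an assumption of this sketch).
    """
    flagged = {}
    for obj, points in epochs.items():
        labels = {box_of(p, boxes) for p in points}
        labels.discard(None)
        if len(labels) > 1:
            flagged[obj] = sorted(labels)
    return flagged


# Hypothetical box limits: (umap1_min, umap1_max, umap2_min, umap2_max).
boxes = {"DA": (0.0, 2.0, 0.0, 2.0), "DB": (5.0, 7.0, 0.0, 2.0)}

# Hypothetical per-exposure UMAP coordinates for three objects.
epochs = {
    "obj1": [(1.0, 1.0), (6.0, 1.0)],    # moves from the DA box to the DB box
    "obj2": [(1.0, 1.0), (1.5, 0.5)],    # stays in the DA box
    "obj3": [(1.0, 1.0), (10.0, 10.0)],  # second exposure is outside all boxes
}

flagged = flag_changes(epochs, boxes)
print(flagged)
```

Objects returned by `flag_changes` would be candidates for visual inspection only. A box crossing can also arise from noisy spectra, so it does not confirm a spectral type change by itself.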

Paper Structure (15 sections, 6 figures)

This paper contains 15 sections, 6 figures.

Introduction
Data handling
Spectra
Photometry
Reclassification of spectral type
Neural network
Single star classification
Model performance
Data Visualisation
Validating the usage of the neural network for rarer white dwarf spectral types
Applying the trained neural network to all white dwarfs in DESI DR1
Searching for spectral type changes
Flagging binarity with machine learning
Conclusions
Masked spectral regions in continuum normalisation

Figures (6)

Figure 1: A 5-fold confusion matrix, with results from each individual 80%--20% training--test data split combined into one confusion matrix. Non-empty cells state the number of systems above the fraction of objects that fall into the category for each true label. The final column shows the number of systems for which the predicted label does not surpass a 70% confidence level; these systems are removed from the performance statistics of the model. These results show the approximate performance we can expect of our final model on a new, unseen dataset. The final best-fit model is trained on 100% of the data, hence these numbers can be taken as a minimum classification accuracy.
Figure 3: A UMAP representation of normalised spectra from the DESI blue arm. UMAP 1 and UMAP 2 in this figure are entirely independent of the same labels in Fig. \ref{fig:UMAPsingleClassifier}, since the input dataset is distinct. Objects shown here are the 21 344 that passed all spectroscopic data cuts and were used to identify class structure. Box labels describe the category of objects that fall within. Using these boxes, the full set of approximately 49 682 unique DESI white dwarf candidates was projected onto this UMAP coordinate space and investigated for drastic changes, defined as two exposures falling in different boxes. The two boxes without labels are drawn to cover intermediate positions between the main boxes of interest, where an object could fall if it is transitioning between spectral types.
Figure 4: All normalised, single-epoch spectra of the new inhomogeneous surface composition ('double-faced') white dwarfs identified in Section \ref{subsec:resultsSpectralTypeChanges}. Top: WDJ022228.39+283007.72, middle: WDJ091748.20+001041.72, bottom: WDJ213146.85+025518.46. The reduced spectra are in black, with a smoothed flux plotted in red. The wavelengths of the hydrogen and helium I lines are marked with vertical lines just above the lower x-axes in blue and green, respectively. The mid-exposure modified BJD is written beneath each spectrum. Only blue-arm data were input to the UMAP analysis, while the full visible spectrum is shown here. Spectra are offset, each originally having a continuum normalisation equal to one.
Figure 5: Normalised spectra of the inhomogeneous surface composition white dwarf WDJ022228.39+283007.72 that were obtained on the NOT. The blue and green vertical lines at the bottom correspond to the wavelengths of Balmer and He I lines, respectively. The depth of He I 4471Å changes between exposures, while other He lines completely appear or disappear depending on phase. Photometry of the source, combined with the spectral changes as a function of phase, indicates the spin period to be 3.497 hr, 3.051 hr or 4.095 hr, listed in order of decreasing Lomb-Scargle periodogram power. The mid-exposure Barycentric Julian Date minus 2 460 000 is stated on the right of each spectrum.
Figure 6: Individual spectra of WDJ150218.87+023054.98, an object classed as a DAH white dwarf by our single star classifier, which shows varying Zeeman splitting as a function of rotational phase and strong variability in the H$\alpha$ emission line. The differences were drastic enough for the object to fall into different boxes for different exposures in Fig. \ref{fig:UMAPdoubleFaced}. The mid-exposure Modified Barycentric Julian Date is stated above each spectrum.
...and 1 more figure
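The bookkeeping described in the Figure 1 caption (pooling the test predictions from every cross-validation split, setting aside predictions that do not surpass a 70% confidence level, and reporting row-wise fractions per true label) can be sketched as follows. This is a simplified illustration, not the authors' code: the class list, probabilities, and function names are hypothetical, and taking the confidence to be the largest predicted class probability is an assumption.

```python
CONFIDENCE_THRESHOLD = 0.70  # confidence level quoted in the Figure 1 caption


def pooled_confusion(folds, classes, threshold=CONFIDENCE_THRESHOLD):
    """Pool test-split predictions into one confusion matrix.

    folds: iterable of (true_labels, probability_rows), one pair per split.
    Returns (counts, low_conf): counts[true][pred] for confident predictions,
    and low_conf[true] for predictions that do not surpass the threshold.
    """
    counts = {t: {p: 0 for p in classes} for t in classes}
    low_conf = {t: 0 for t in classes}
    for true_labels, prob_rows in folds:
        for true, probs in zip(true_labels, prob_rows):
            best = max(range(len(classes)), key=lambda i: probs[i])
            if probs[best] <= threshold:
                low_conf[true] += 1  # excluded from performance statistics
            else:
                counts[true][classes[best]] += 1
    return counts, low_conf


def row_fractions(counts):
    """Fraction of confident predictions per predicted class, per true label."""
    fractions = {}
    for true, row in counts.items():
        total = sum(row.values())
        fractions[true] = {p: (n / total if total else 0.0) for p, n in row.items()}
    return fractions


# Hypothetical two-class example with two test splits.
classes = ["DA", "DB"]
folds = [
    (["DA", "DA", "DB"], [[0.95, 0.05], [0.60, 0.40], [0.20, 0.80]]),
    (["DB", "DA"], [[0.90, 0.10], [0.99, 0.01]]),
]

counts, low_conf = pooled_confusion(folds, classes)
fractions = row_fractions(counts)
print(counts, low_conf, fractions)
```

In this sketch the low-confidence objects are counted separately and left out of the fractions, mirroring the caption's statement that they are removed from the performance statistics.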
