
Machine learning for exoplanet detection in high-contrast spectroscopy: Combining cross correlation maps and deep learning on medium-resolution integral-field spectra

Rakesh Nath-Ranga, Olivier Absil, Valentin Christiaens, Emily O. Garvin

TL;DR

The paper demonstrates that ML techniques have the potential to improve detection limits and reduce false positives for directly imaged planets in IFS datasets, once the spectral dimension has been transformed into a radial velocity dimension through a cross-correlation operation. It also finds that including the temporal dimension does not increase sensitivity.

Abstract

The advent of high-contrast imaging instruments combined with medium-resolution spectrographs allows spectral and temporal dimensions to be combined with spatial dimensions to detect and potentially characterize exoplanets with higher sensitivity. We develop a new method to effectively leverage the spectral and spatial dimensions in integral-field spectroscopy (IFS) datasets using a supervised deep-learning algorithm to improve the detection sensitivity to high-contrast exoplanets. We begin by applying a data transform whereby the IFS datasets are replaced by cross-correlation coefficient tensors obtained by cross-correlating our data with spectral templates of young gas giants. The transformed data are then used to train machine learning (ML) algorithms. We train a 2D CNN and a 3D LSTM with our data. We compare the ML models with a non-ML algorithm based on the STIM map of arXiv:1810.06895. We test our algorithms on simulated young gas giants inserted in a dataset that contains no known exoplanet, and explore the sensitivity of the algorithms to these exoplanets at contrasts ranging from 1e-3 to 1e-4 and at different radial separations. We quantify the sensitivity using modified receiver operating characteristic (mROC) curves. We find that the ML algorithms produce fewer false positives and have a higher true positive rate than the STIM-based algorithm, and that the true positive rate of the ML algorithms is less affected by changes in radial separation. We also find that the velocity dimension is an important differentiating factor. Through this paper, we demonstrate that ML techniques have the potential to improve the detection limits and reduce false positives for directly imaged planets in IFS datasets, after transforming the spectral dimension into a radial velocity dimension through a cross-correlation operation.
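The cross-correlation transform described in the abstract can be illustrated for a single spaxel. The following is a minimal sketch, not the authors' pipeline: it assumes the correlation coefficient is a Pearson coefficient between the observed spectrum and a Doppler-shifted template, takes positive velocities as redshifts, and uses a toy single-line template with made-up values. All function and variable names (`interp`, `pearson`, `cross_correlate`, `line_profile`) are hypothetical, and plain Python is used only to keep the example self-contained.

```python
import math

C_KMS = 299_792.458  # speed of light in km/s


def interp(x, xs, ys):
    """Linearly interpolate the samples (xs, ys) at x, clamping at the edges."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    lo, hi = 0, len(xs) - 1
    while hi - lo > 1:  # binary search for the bracketing interval
        mid = (lo + hi) // 2
        if xs[mid] <= x:
            lo = mid
        else:
            hi = mid
    t = (x - xs[lo]) / (xs[hi] - xs[lo])
    return ys[lo] + t * (ys[hi] - ys[lo])


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den if den > 0 else 0.0


def cross_correlate(wavelengths, spectrum, template, velocities):
    """Return one correlation coefficient per trial radial velocity (km/s)."""
    coefficients = []
    for v in velocities:
        # Light observed at wavelength w left the source at w / (1 + v/c).
        shifted = [interp(w / (1 + v / C_KMS), wavelengths, template)
                   for w in wavelengths]
        coefficients.append(pearson(spectrum, shifted))
    return coefficients


def line_profile(w, centre=2.0, sigma=5e-4, depth=0.5):
    """Toy template: one Gaussian absorption line (illustrative values only)."""
    return 1.0 - depth * math.exp(-0.5 * ((w - centre) / sigma) ** 2)


wavelengths = [1.9 + 1e-4 * i for i in range(2001)]  # micron
template = [line_profile(w) for w in wavelengths]
true_v = 100.0  # injected radial velocity in km/s
spectrum = [line_profile(w / (1 + true_v / C_KMS)) for w in wavelengths]
velocities = [-200.0 + 50.0 * i for i in range(9)]
cc = cross_correlate(wavelengths, spectrum, template, velocities)
best_v = velocities[cc.index(max(cc))]
```

Repeating this for every spaxel would replace the wavelength axis of the cube with a velocity axis, which is the kind of tensor the abstract describes feeding to the ML models. In this toy case the coefficient peaks at the injected velocity, so `best_v` recovers `true_v`.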

Paper Structure (23 sections, 2 equations, 7 figures)


Figures (7)

  • Figure 1: Detection maps obtained with different algorithms, where a young gas giant was inserted at a specific contrast at different separations from the frame centre. Column 1 corresponds to a classical ADI algorithm, where an intensity map is computed for each wavelength, and the S/N map is derived from the median of all the ADI-wavelength maps. Column 2 is obtained by computing an S/N map on the cross-correlation map obtained with $v=0$ km/s. Column 3 represents the STCM map, produced as described in the text. The rows represent maps produced when the same companion is inserted at different radial separations from the frame centre, respectively at $2.3$, $3.3$, and $4.3$ FWHM. The simulated young gas giant is inserted at the same contrast of $5\times10^{-4}$ in all maps.
  • Figure 2: Schematic of the C3PO (left) and C-LANDO (right) architectures, showing the different layers and sizes of the input, the dilation of this input at different layers, and finally the output format. The top grey block represents the input, with dimensions printed next to the arrow (e.g., for C3PO: $20$ velocity bins, $11\times11$ pixel image size as explained in Sect. \ref{sec: training ML}, and batch size given as input labelled with a question mark). Each block represents a layer and has three parts: the blue or black part represents the type of neuron, the white part is the number of neurons represented by a convolutional kernel size and number of bias units, and the red part represents the non-linearity (hyperbolic tangent). The kernel has four dimensions: the first three represent the input shape and the last represents the depth of the kernel. The kernel shape also represents the output dimensions of a layer and the input to its following layer. Between each layer we have pooling layers marked in green, a flatten layer in brown, and a dropout layer at the end. The output is a single neuron with a sigmoid activation (dense_1).
  • Figure 3: Evolution of spatial crops with respect to radial velocity. The rows indicate the different velocities and each square shows the spatial noise and signal diversity of the patch. This is just a single spatio-temporal sample (i.e. one pixel from a single temporal frame). The colour-coded pixel value corresponds to the cross-correlation signal from Eq. \ref{eq:CC equation}.
  • Figure 4: Illustration of the TP and FP counting process. Column 1 shows the detection map used to produce four different binary maps, thresholded at different intensity levels (columns 2-5), for the three algorithms (respectively STCM, C3PO, and C-LANDO from top to bottom). A fake companion was inserted at position $\Delta{\rm Dec}=0.0$ and $\Delta{\rm RA} = -0.2$. TPs and FPs are respectively shown in green and in red for the thresholds indicated in the titles of each image. Each coloured blob or point is counted as a single TP or FP, where a blob is a group of connected pixels at the same binary intensity level.
  • Figure 5: mROC curves produced by inserting a total of $40$ exoplanets for a set of contrasts ($6 \times 10^{-4}$, $4 \times 10^{-4}$ and $2 \times 10^{-4}$ from top to bottom) at annuli of different separations in the ranges $3-4$, $2-3$, and $1-2$ FWHM (left, middle, and right columns, respectively). The x-axis is the mean FP per map and the y-axis corresponds to the TPR computed using Eq. \ref{eq:TPR}. The numbers next to each data point correspond to the threshold that was applied to compute the number of TPs and FPs for each binary map (as described in Fig. \ref{fig:sample_detmaps}).
  • ...and 2 more figures
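
The counting procedure in the Figure 4 caption, and the way it feeds one point of an mROC curve as in Figure 5, can be sketched as follows. This is an illustration under stated assumptions rather than the paper's implementation: blobs are taken to be 8-connected, a blob counts as the TP if any of its pixels lies within a hypothetical `tolerance` of the injected position, and the TPR is assumed to be the fraction of maps in which the injected companion is recovered (the paper's own TPR equation is not reproduced in this summary). The names `label_blobs`, `count_tp_fp`, and `mroc_point` are invented for this example.

```python
from collections import deque


def label_blobs(binary):
    """Group 8-connected True pixels; return a list of sets of (row, col)."""
    rows, cols = len(binary), len(binary[0])
    seen = set()
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or (r, c) in seen:
                continue
            blob = set()
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:  # breadth-first flood fill from the seed pixel
                y, x = queue.popleft()
                blob.add((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
            blobs.append(blob)
    return blobs


def count_tp_fp(detection_map, threshold, planet_pos, tolerance=1):
    """Threshold a detection map and count (TP, FP) blobs for one injection."""
    binary = [[value >= threshold for value in row] for row in detection_map]
    py, px = planet_pos
    tp, fp = 0, 0
    for blob in label_blobs(binary):
        hit = any(abs(y - py) <= tolerance and abs(x - px) <= tolerance
                  for y, x in blob)
        if hit:
            tp = 1  # assumption: at most one TP per injected companion
        else:
            fp += 1
    return tp, fp


def mroc_point(maps, planet_positions, threshold):
    """Return (mean FP per map, TPR) at one threshold. TPR is an assumption."""
    counts = [count_tp_fp(m, threshold, p)
              for m, p in zip(maps, planet_positions)]
    mean_fp = sum(fp for _, fp in counts) / len(counts)
    tpr = sum(tp for tp, _ in counts) / len(counts)
    return mean_fp, tpr


# Toy 5x5 detection map with made-up values.
dmap = [
    [0.1, 0.2, 0.1, 0.0, 0.7],
    [0.0, 0.9, 0.8, 0.0, 0.0],
    [0.1, 0.8, 0.3, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.6, 0.0, 0.0, 0.0, 0.1],
]
tp, fp = count_tp_fp(dmap, threshold=0.5, planet_pos=(1, 1))
```

Sweeping `threshold` over a range of values and calling `mroc_point` at each one would trace a curve of the kind shown in Figure 5, with each point annotated by its threshold. For production use, a library routine such as connected-component labelling in SciPy would normally replace the hand-written flood fill.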