Semi-Supervised Learning for Lensed Quasar Detection

David Sweeney, Alberto Krone-Martins, Daniel Stern, Peter Tuthill, Richard Scalzo, George Djorgovski, Christine Ducourant, Ashish Mahabal, Ramachrisna Teixeira, Matthew Graham

Abstract

Lensed quasars are key to many areas of study in astronomy, offering a unique probe into the intermediate and far universe. However, finding lensed quasars has proved difficult despite significant efforts from large collaborations. These challenges have limited catalogues of confirmed lensed quasars to the hundreds, despite theoretical predictions that they should be many times more numerous. We train machine learning classifiers to discover lensed quasar candidates. By using semi-supervised learning techniques, we leverage the large number of potential candidates as unlabelled training data alongside the small number of known objects, greatly improving model performance. We present our two most successful models: (1) a variational autoencoder trained on millions of quasars to reduce the dimensionality of images for input to a dense neural network classifier that can make accurate predictions, and (2) a convolutional neural network trained on a mix of labelled and unlabelled data via virtual adversarial training. These models are both capable of producing high-quality candidates, as evidenced by our discovery of GRALJ140833.73+042229.98. The success of our classifier, which uses only multi-band images, is particularly exciting as it can be combined with existing classifiers, which use data other than images, to improve the classifications of both models and discover more lensed quasars.

Paper Structure

This paper contains 22 sections, 4 equations, 9 figures, 1 table.

Figures (9)

  • Figure 1: Six images of lensed quasars showing data from DESI. The top two images show the archetypal lensed quasar as it would be described to a student: two identically blue-white quasar images on either side of a red lensing galaxy, which is often extended. This typical difference in colouring between the quasar and the lensing galaxy is the reason multi-band images are so crucial: they allow lensed quasars to be differentiated from other astrophysical phenomena. The middle two images show examples of quadruply lensed quasars, with four blue-white quasar images arranged in a kite. Sometimes, as in the left image, the lensing galaxy is barely visible or not visible at all; at other times, as in the right image, the lensing galaxy obscures one of the quasar images. The bottom two images show more typical lensed quasars: two blue-white quasar images, often with the lensing galaxy too dim to be seen and commonly with one of the quasar images reddened, presumably by the lensing galaxy. Note that the images in this paper are composed of the $g$, $r$ and $i$ bands, visualised as the r, g, b channels of a standard image. Unless otherwise specified, the images are 16x16 arcsec of sky.
  • Figure 2: Six images of lensed quasars showing data from Pan-STARRS and DESI. The top two images show the same lensed quasar imaged by both the Pan-STARRS and DESI surveys. The middle two images show the kind of noise evident in Pan-STARRS and DESI images. The bottom left image shows a case where no objects are visible in a Pan-STARRS image. The bottom right image shows a typical example of a visual artefact that may appear in DESI images.
  • Figure 3: Diagram showing the architecture for the autoencoder-classifier model, where z denotes the latent space of the autoencoder. The autoencoder architecture and training are described in Sections \ref{sec:autoencoder} and \ref{sec:b-VAE}. The noise metric is described in Section \ref{sec:noise-metric}. The traditional classifiers and metadata are described in Section \ref{sec:traditional-classifiers}.
  • Figure 4: Diagram showing the architecture for the VAT model, as described in Section \ref{sec:VAT}.
  • Figure 5: Reconstructions of two lensed quasars (previously unseen by the model) generated by autoencoders with varying bottlenecks, including the best performing autoencoder (described at the end of Section \ref{sec:classification}). Note that reconstruction fidelity improves as the dimensionality of the latent space increases; however, this often does not translate into better classifier performance, as a higher-dimensional latent space can reduce it. These images represent 8x8 arcsec of sky.
  • ...and 4 more figures