Distinguishing a planetary transit from false positives: a Transformer-based classification for planetary transit signals
Helem Salinas, Karim Pichara, Rafael Brahm, Francisco Pérez-Galarce, Domingo Mery
TL;DR
This paper tackles the challenge of distinguishing true exoplanet transits from false positives in large TESS light-curve datasets. It introduces a Transformer-based classifier with three encoders for local and global flux views plus stellar/transit parameters, leveraging self-attention and attention maps for interpretability. The model achieves competitive performance relative to CNN-based methods and demonstrates that incorporating centroid information improves both accuracy and interpretability. The work highlights the practical potential of attention-based models for efficient, interpretable screening of exoplanet candidates in large-scale surveys, with future work aimed at handling longer light curves and end-to-end classification.
Abstract
Current space-based missions, such as the Transiting Exoplanet Survey Satellite (TESS), provide a large database of light curves that must be analysed efficiently and systematically. In recent years, deep learning (DL) methods, particularly convolutional neural networks (CNNs), have been used to classify transit signals of candidate exoplanets automatically. However, CNNs have some drawbacks; for example, they require many layers to capture dependencies in sequential data, such as light curves, making the network so large that it eventually becomes impractical. The self-attention mechanism is a DL technique that attempts to mimic selective attention, focusing on the relevant parts of an input while ignoring the rest. Models such as the Transformer architecture were recently proposed for sequential data with successful results. Based on these successful models, we present a new architecture for the automatic classification of transit signals. Our proposed architecture is designed to capture the most significant features of a transit signal and stellar parameters through the self-attention mechanism. In addition to model prediction, we take advantage of attention map inspection, obtaining a more interpretable DL approach. Thus, we can identify the relevance of each element in differentiating a transit signal from false positives, simplifying the manual examination of candidates. We show that our architecture achieves competitive results compared with the CNNs applied to recognizing exoplanetary transit signals in data from the TESS telescope. Based on these results, we demonstrate that applying this state-of-the-art DL model to light curves can be a powerful technique for transit signal detection while offering a level of interpretability.
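To make the self-attention mechanism and the idea of an inspectable attention map concrete, the following is a minimal sketch of scaled dot-product self-attention on a toy phase-folded light curve. It is not the architecture proposed in the paper: the function names, the toy flux values, and the two-component token embedding are illustrative assumptions, and the query, key, and value projections are taken to be the identity, whereas a real Transformer learns them and uses several attention heads.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x.

    Assumption: identity projections, chosen only to keep the sketch short.
    Returns the attended outputs and the attention map (one row per query).
    """
    d = len(x[0])
    scores = [
        [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        for q in x
    ]
    attn = [softmax(row) for row in scores]
    out = [
        [sum(w * v[j] for w, v in zip(row, x)) for j in range(d)]
        for row in attn
    ]
    return out, attn


# Hypothetical toy light curve: flat baseline with a transit-like dip.
flux = [1.0, 1.0, 0.99, 0.97, 0.99, 1.0, 1.0]

# Hypothetical embedding: [dip depth in percent, constant bias term].
tokens = [[(1.0 - f) * 100.0, 1.0] for f in flux]

out, attn = self_attention(tokens)

# Row 3 of the attention map shows where the deepest in-transit point "looks".
for j, w in enumerate(attn[3]):
    print(f"point {j}: weight {w:.3f}")
```

In this toy setting, the query at the deepest point of the dip places most of its weight on the in-transit points, while an out-of-transit query attends uniformly to every point. Reading rows of the attention map in this way is the kind of inspection the abstract refers to, although maps from a trained model reflect learned projections rather than raw flux values.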
