Earth-like planet predictor: A machine learning approach
Jeanne Davoult, Romain Eltschinger, Yann Alibert
TL;DR
This work tackles the exoplanet search bottleneck by predicting which stars are most likely to host an Earth-like planet (ELP) using a Random Forest classifier trained on thousands of synthetic planetary systems from the Bern model. By imposing a simple radial-velocity bias and comparing multiple feature strategies, the authors achieve high precision (up to $0.99$) on synthetic tests and identify $44$ real systems with architectures conducive to stable ELPs, as supported by Hill-stability checks. The approach demonstrates the viability of theory-guided machine learning for prioritising observations in future missions such as PLATO and LIFE, while acknowledging limitations of the Bern model and biases that warrant further refinement and cross-validation with alternative formation models.
Abstract
Searching for planets analogous to Earth in terms of mass and equilibrium temperature is currently the first step in the quest for habitable conditions outside our Solar System and, ultimately, the search for life in the universe. Future missions such as PLATO or LIFE will begin to detect and characterise these small, cold planets, dedicating significant observation time to them. The aim of this work is to predict which stars are most likely to host an Earth-like planet (ELP) in order to avoid blind searches, minimise detection times, and thus maximise the number of detections. Building on a previous study of correlations between the presence of an ELP and the properties of its system, we trained a Random Forest to recognise and classify systems as 'hosting an ELP' or 'not hosting an ELP'. The Random Forest was trained and tested on populations of synthetic planetary systems derived from the Bern model, and then applied to real observed systems. The tests conducted on the machine learning (ML) model yield precision scores of up to 0.99, indicating that 99% of the systems identified by the model as having ELPs possess at least one. Among the few real observed systems that have been tested, 44 have been selected as having a high probability of hosting an ELP, and a quick study of the stability of these systems confirms that they would remain stable if an Earth-like planet were present. The excellent results obtained from the tests conducted on the ML model demonstrate its ability to recognise the typical architectures of systems with or without ELPs within populations derived from the Bern model. If we assume that the Bern model adequately describes the architecture of real systems, then such a tool can prove indispensable in the search for Earth-like planets. A similar approach could be applied to other planetary system formation models to validate those predictions.
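To make the precision figure quoted above concrete, the following minimal sketch computes precision as $\mathrm{TP} / (\mathrm{TP} + \mathrm{FP})$ for a binary 'hosting an ELP' classification. The function name `precision` and the toy label lists are illustrative assumptions, not the authors' code or data.

```python
def precision(y_true, y_pred):
    """Fraction of predicted positives that are true positives: TP / (TP + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 0)
    if tp + fp == 0:
        raise ValueError("no positive predictions; precision is undefined")
    return tp / (tp + fp)


# Toy labels (hypothetical, not from the paper): 1 = hosts an ELP, 0 = does not.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 1, 1]

toy_precision = precision(y_true, y_pred)
print(f"precision = {toy_precision:.2f}")
```

Precision only measures how reliable the positive predictions are; it says nothing about ELP-hosting systems that the classifier fails to flag.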
