Advancing Machine Learning for Stellar Activity and Exoplanet Period Rotation
Fatemeh Fazel Hesar, Bernard Foing, Ana M. Heras, Mojtaba Raouf, Victoria Foing, Shima Javanmardi, Fons J. Verbeek
TL;DR
The paper addresses the challenge of accurately estimating stellar rotation periods from noisy Kepler light curves. It develops a pipeline that combines physics-based initial period estimates with a suite of machine learning models (Decision Tree, Random Forest, K-Nearest Neighbors, Gradient Boosting) and a Voting Ensemble, complemented by Gaussian Process (GP) baselines. The Best-Model Voting Ensemble achieves lower RMSE than most individual models, with Random Forest performing comparably, and often rivals or exceeds GP performance, highlighting the robustness of ensemble methods for disentangling stellar activity from transit signals. By delivering more reliable rotation-period measurements, the work supports exoplanet transit characterization and gyrochronology, with potential impact for future missions and large-scale time-series analyses in stellar astrophysics.
Abstract
This study applied machine learning models to estimate stellar rotation periods from corrected light curve data obtained by the NASA Kepler mission. Traditional methods often struggle to estimate rotation periods accurately because of noise and variability in the light curves. The workflow took initial period estimates from the Lomb-Scargle (LS) periodogram and Transit Least Squares techniques, then split the data into training, validation, and testing sets. We employed several machine learning algorithms, including Decision Tree, Random Forest, K-Nearest Neighbors, and Gradient Boosting, and also used a Voting Ensemble approach to improve prediction accuracy and robustness. The analysis included data from multiple Kepler IDs, providing detailed metrics on orbital periods and planet radii. Performance evaluation showed that the Voting Ensemble model yielded the most accurate results, with an RMSE approximately 50% lower than the Decision Tree model and 17% better than the K-Nearest Neighbors model. The Random Forest model performed comparably to the Voting Ensemble, whereas the Gradient Boosting model had a higher RMSE than the other approaches. The predicted rotation periods aligned closely with the photometric reference periods, suggesting that the machine learning models achieved high prediction accuracy. The results indicate that machine learning, particularly ensemble methods, can estimate stellar rotation periods accurately, with significant implications for the study of exoplanets and stellar astrophysics.
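The initial period estimate that feeds the workflow described in the abstract can be illustrated with a minimal sketch of the classical Lomb-Scargle periodogram. This is not the authors' pipeline: the light curve is synthetic, and the names `lomb_scargle_power`, `estimate_period`, and `TRUE_PERIOD`, as well as the sampling, noise level, and period grid, are assumptions chosen for illustration. In practice one would likely use a library implementation such as `astropy.timeseries.LombScargle` on real Kepler data.

```python
import math
import random


def lomb_scargle_power(t, y, periods):
    """Unnormalized classical Lomb-Scargle power at each trial period."""
    mean = sum(y) / len(y)
    yc = [v - mean for v in y]  # subtract the mean flux
    powers = []
    for p in periods:
        w = 2.0 * math.pi / p  # angular frequency of the trial period
        # The time offset tau makes the sine and cosine terms orthogonal
        # for unevenly sampled data.
        s2 = sum(math.sin(2.0 * w * ti) for ti in t)
        c2 = sum(math.cos(2.0 * w * ti) for ti in t)
        tau = math.atan2(s2, c2) / (2.0 * w)
        cs = [math.cos(w * (ti - tau)) for ti in t]
        sn = [math.sin(w * (ti - tau)) for ti in t]
        yc_c = sum(a * b for a, b in zip(yc, cs))
        yc_s = sum(a * b for a, b in zip(yc, sn))
        cc = sum(c * c for c in cs)
        ss = sum(s * s for s in sn)
        powers.append(0.5 * (yc_c ** 2 / cc + yc_s ** 2 / ss))
    return powers


def estimate_period(t, y, periods):
    """Return the trial period with the highest periodogram power."""
    powers = lomb_scargle_power(t, y, periods)
    best = max(range(len(periods)), key=powers.__getitem__)
    return periods[best]


# Synthetic, unevenly sampled light curve (assumed values, not Kepler data):
# a 1% sinusoidal spot modulation plus Gaussian noise over 90 days.
rng = random.Random(42)
TRUE_PERIOD = 12.5  # days, hypothetical rotation period
t = sorted(rng.uniform(0.0, 90.0) for _ in range(400))
y = [
    1.0 + 0.01 * math.sin(2.0 * math.pi * ti / TRUE_PERIOD) + rng.gauss(0.0, 0.002)
    for ti in t
]

# Trial periods from 1 to 30 days in steps of 0.05 days.
periods = [1.0 + 0.05 * i for i in range(581)]
initial_period = estimate_period(t, y, periods)
print(f"initial period estimate: {initial_period:.2f} d")
```

An estimate of this kind would serve only as a starting feature; the abstract's regression models and Voting Ensemble then refine it against reference periods.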
