Using machine learning method for variable star classification using the TESS Sectors 1-57 data
Li-Heng Wang, Kai Li, Xiang Gao, Ya-Ni Guo, Guo-You Sun
TL;DR
This work tackles large-scale automated classification of variable stars in TESS 2-minute cadence data (Sectors 1-57), using Gaia DR3 labels and an interpretable feature set derived from Fourier analysis and phase-folded light curves. A two-stage Random Forest pipeline first performs coarse classification into four main types: eclipsing binaries (EBs), pulsating stars, rotational variables (ROT), and non-variables. It then subclassifies within each category, relying on period determination with the Generalized Lomb-Scargle (GLS) periodogram and careful feature extraction. The approach achieves an out-of-bag (OOB) score of 0.9178 and produces catalogs for seven variable-star types (EA, EW, CEP, DSCT, RRab, RRc, ROT) with 14092 new discoveries, including 6245 new EBs; results are validated through visual inspection and cross-matching with Gaia and external catalogs. The interpretable methodology, applied at full dataset scale, shows practical potential for building comprehensive variable-star catalogs from space-based surveys; the authors acknowledge limitations from label quality and data heterogeneity that guide future refinements.
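The period search and phase folding mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes uniform weights, a synthetic sinusoidal light curve, and a hypothetical frequency grid, and the helper names `gls_power` and `phase_fold` are invented for this example. The power is computed as the fractional chi-square reduction of a sinusoid-plus-offset fit relative to a constant fit, which is the floating-mean form of the Lomb-Scargle periodogram.

```python
import numpy as np


def gls_power(t, y, freqs):
    """Floating-mean Lomb-Scargle power with uniform weights (illustrative).

    At each trial frequency, fit y ~ a*cos + b*sin + c by least squares and
    return 1 - chi2(f) / chi2_ref, where chi2_ref is the scatter about the mean.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    chi2_ref = np.sum((y - y.mean()) ** 2)
    power = np.empty(len(freqs))
    for i, f in enumerate(freqs):
        arg = 2.0 * np.pi * f * t
        design = np.column_stack([np.cos(arg), np.sin(arg), np.ones_like(t)])
        coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
        chi2 = np.sum((y - design @ coeffs) ** 2)
        power[i] = 1.0 - chi2 / chi2_ref
    return power


def phase_fold(t, period, t0=0.0):
    """Map observation times to phases in [0, 1) for a given period."""
    return ((np.asarray(t, dtype=float) - t0) / period) % 1.0


# Synthetic light curve (assumption): ~27 days of irregular sampling,
# a 0.35-day sinusoidal signal, and Gaussian noise.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 27.0, 600))
true_period = 0.35
flux = 1.0 + 0.05 * np.sin(2.0 * np.pi * t / true_period)
flux += rng.normal(0.0, 0.005, t.size)

# Hypothetical trial-frequency grid in cycles per day.
freqs = np.linspace(0.05, 10.0, 5000)
power = gls_power(t, flux, freqs)
best_period = 1.0 / freqs[np.argmax(power)]
phase = phase_fold(t, best_period)

print(f"recovered period: {best_period:.4f} d (true: {true_period} d)")
```

In practice a library periodogram with per-point uncertainties would likely replace the explicit loop; the loop is kept here only to make the fitted model visible. Features such as amplitude or Fourier coefficients would then be measured on the folded curve.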
Abstract
The Transiting Exoplanet Survey Satellite (TESS) is a wide-field all-sky survey mission designed to detect Earth-sized exoplanets. After more than four years of photometric surveys, data from Sectors 1-57 have been collected, including approximately 1,050,000 light curves with a 2-minute cadence. By cross-matching these data with Gaia's variable star catalogue, we obtained labeled datasets for further analysis. Using a random forest classifier, we classified the variable stars and designed a distinct classification process for each subclass. The classifier identified 6770 EA, 2971 EW, 980 CEP, 8347 DSCT, 457 RRab, 404 RRc, and 12348 ROT. Each variable star was then visually inspected to ensure the reliability and accuracy of the compiled catalog. After this inspection, we obtained 6046 EA, 3859 EW, 2058 CEP, 8434 DSCT, 482 RRab, 416 RRc, and 9694 ROT, and a total of 14092 new variable stars were discovered.
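As a rough illustration of the two-stage random forest scheme summarized above, the sketch below trains a coarse classifier on the four main types and then one fine classifier per main type that has more than one subclass. Everything specific here is an assumption made for the example: the toy three-dimensional features, the cluster centers, the label strings `EB`, `PULS`, and `NONVAR`, the hyperparameters, and the helper names. It uses scikit-learn's `RandomForestClassifier` with `oob_score=True` to obtain an out-of-bag estimate of accuracy.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Main type -> subclasses (label strings are placeholders for this sketch).
SUBCLASSES = {
    "EB": ["EA", "EW"],
    "PULS": ["CEP", "DSCT", "RRab", "RRc"],
    "ROT": ["ROT"],
    "NONVAR": ["NONVAR"],
}


def make_toy_data(n_per_class=150, seed=0):
    """Generate well-separated Gaussian clusters, one per subclass (synthetic)."""
    rng = np.random.default_rng(seed)
    features, main, sub = [], [], []
    k = 0
    for main_label, subs in SUBCLASSES.items():
        for sub_label in subs:
            center = np.array([3.0 * k, -2.0 * k, (k % 3) * 4.0])
            features.append(center + rng.normal(0.0, 0.5, size=(n_per_class, 3)))
            main += [main_label] * n_per_class
            sub += [sub_label] * n_per_class
            k += 1
    return np.vstack(features), np.array(main), np.array(sub)


def fit_two_stage(X, main, sub):
    """Stage 1: main types. Stage 2: one forest per multi-subclass main type."""
    coarse = RandomForestClassifier(
        n_estimators=200, oob_score=True, random_state=0
    ).fit(X, main)
    fine = {}
    for label, subs in SUBCLASSES.items():
        if len(subs) > 1:
            mask = main == label
            fine[label] = RandomForestClassifier(
                n_estimators=200, oob_score=True, random_state=0
            ).fit(X[mask], sub[mask])
    return coarse, fine


def predict_two_stage(coarse, fine, X):
    """Predict the main type, then refine within that type where applicable."""
    main_pred = coarse.predict(X)
    sub_pred = main_pred.astype(object)  # single-subclass types keep their label
    for label, model in fine.items():
        mask = main_pred == label
        if mask.any():
            sub_pred[mask] = model.predict(X[mask])
    return sub_pred


X, main, sub = make_toy_data()
coarse, fine = fit_two_stage(X, main, sub)

# Evaluate on an independent synthetic draw.
X_new, _, sub_new = make_toy_data(n_per_class=50, seed=1)
sub_pred = predict_two_stage(coarse, fine, X_new)
accuracy = float(np.mean(sub_pred == sub_new))

print(f"coarse OOB score: {coarse.oob_score_:.4f}")
print(f"subclass accuracy on new toy data: {accuracy:.4f}")
```

The toy clusters are far easier to separate than real light-curve features, so the scores printed here say nothing about performance on TESS data. The point is only the structure: errors in the coarse stage propagate to the fine stage, which is one reason a visual inspection step after classification is useful.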
