The NEMESIS Catalogue of Young Stellar Objects for the Orion Star Formation Complex. I. General description of data curation
J. Roquette, M. Audard, D. Hernandez, I. Gezer, G. Marton, C. Mas, M. Madarász, O. Dionatos
TL;DR
The paper presents the NEMESIS Catalogue of young stellar objects (YSOs) for the Orion Star Formation Complex and details a data-curation framework that combines a historical literature compilation with modern photometric and spectroscopic surveys to build a large, panchromatic YSO catalogue. The catalogue ingests data from over 200 publications, harmonizes spectral energy distributions (SEDs), and derives uniform infrared classifications for ~93% of sources, while systematically assessing multiplicity and contamination. The work delivers a 27,879-source database with rich observables (SEDs, infrared classes, lithium and gravity indicators, X-ray data, variability, and kinematics) and establishes methods to identify and quantify contaminants (giants, extragalactic sources, and main-sequence stars). It further demonstrates the catalogue's utility for validating and training youth-diagnosis methods and uses the behavior of the RUWE in YSOs as a case study for Gaia astrometric diagnostics. Collaborators are already using the dataset for morphology studies, deep-learning YSO identification, and SED fitting, and it provides a scalable resource for future star-formation studies in Orion and beyond.
Abstract
The past decade has seen a rise in the use of machine learning methods in the study of young stellar evolution. This trend has led to a growing need for a comprehensive database of young stellar objects (YSOs) that goes beyond survey-specific biases and that can be employed for training, validation, and refining the physical interpretation of machine learning outcomes. We reviewed the literature focused on the Orion Star Formation Complex (OSFC) to compile a thorough catalogue of previously identified YSO candidates in the region, including the curation of observables relevant to probing their youth. Starting from the NASA/ADS database, we assembled YSO candidates from more than 200 peer-reviewed publications. We collated data products relevant to the study of YSOs into a dedicated catalogue, which was complemented with data from large photometric and spectroscopic surveys and from the Strasbourg astronomical Data Center. We also added significant value to the catalogue by homogeneously deriving YSO infrared classification labels and through a comprehensive curation of labels concerning the sources' multiplicity. Finally, we used a panchromatic approach to derive the probabilities that the sources in our catalogue were contaminant extragalactic sources or giant stars. We present the NEMESIS catalogue of YSOs for the OSFC, which includes data collated for 27,879 sources covering the whole mass spectrum and the various stages of pre-main-sequence evolution, from protostars to diskless young stars. The catalogue includes a collection of panchromatic photometric data processed into spectral energy distributions; stellar parameters; infrared classes; equivalent widths of emission lines related to YSO accretion and star-disk interaction, and of absorption lines such as lithium and lines related to the sources' gravity; X-ray emission observables; photometric variability observables; and multiplicity labels.
