Scavenger hunt: Selection of obscured active galactic nuclei combining multiband optical variability and colors
Demetra De Cicco, Stefano Cavuoti, Maurizio Paolillo, Vincenzo Petrecca, Ylenia Maruccia, Paula Sánchez-Sáez
TL;DR
This work tackles the difficulty of identifying obscured AGN in optical time-domain data by combining variability-based features with static colors in a three-band (g, r, i) random forest (RF) framework applied to the VST-COSMOS dataset over a 3.3-year baseline. With a carefully selected feature set, including GP_DRW_tau, eta^e, and multiple colors, the three-band plus color selection recovers $58^{+9}_{-8}\%$ of all AGN and $69^{+10}_{-8}\%$ of known obscured AGN when a conservative threshold is used and confirmation is required in all three bands. Requiring confirmation in only two of the three bands raises these fractions to $69^{+10}_{-8}\%$ and $80^{+10}_{-9}\%$, respectively. The combined selection outperforms color-only or variability-only approaches. The study uses two labeled sets of AGN (spec-MIR and main LS) to show that a more diversified sample increases obscured-AGN recall despite lower overall classifier performance, highlighting a robust multiband strategy for LSST-era surveys. The results have practical implications for LSST's Deep Drilling Fields (DDFs), suggesting that a combined variability-color criterion can yield a more complete obscured-AGN census and guide follow-up campaigns, with future work extending to additional DDFs and spectroscopic validation.
Abstract
As wide-field optical surveys such as Vera Rubin Observatory's Legacy Survey of Space and Time (LSST) begin operations, time-domain astronomy is facing a data revolution, paving the road for new, expanded variability studies. This work leverages the complementary power of optical variability and color selection to identify active galactic nuclei (AGN), focusing on optimizing the identification of obscured AGN, typically more challenging to distinguish from inactive galaxies based on optical variability alone. The analysis is designed to provide valuable insights in the context of performance preview for the LSST, albeit using a scaled-down version of the LSST dataset. We present the first combined AGN selection based on g+r+i band light curves from the VST-COSMOS survey, spanning 3.3 yr. We identify AGN candidates independently in each band using a random forest (RF) classifier trained on features mainly related to optical variability, along with six optical/infrared colors and a morphology indicator. We subsequently merge the three band-specific samples in order to enhance selection purity and reliability. We then focus on defining a subset of features that significantly improve the identification of obscured AGN. The RF classifiers yield a consistent performance across the three bands, highlighting the critical role of contamination. Using the combined three-band plus color selection we successfully recover $58^{+9}_{-8}\%$ of all AGN and $69^{+10}_{-8}\%$ of the known obscured AGN that have been independently confirmed in all three bands. When requiring confirmation in two out of the three bands, these fractions increase to $69^{+10}_{-8}\%$ and $80^{+10}_{-9}\%$, respectively. We also demonstrate that, while combining variability features with colors is crucial to improve obscured AGN selection, relying solely on color features returns a markedly higher contamination rate.
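The merging step described above (independent candidate lists in g, r, and i, combined by requiring confirmation in all three bands or in two out of three) can be illustrated with a minimal Python sketch. The function names, source IDs, and toy candidate sets below are hypothetical and chosen only for illustration; they are not taken from the paper or its pipeline, and the sketch does not reproduce the paper's confidence intervals.

```python
from collections import Counter


def combine_band_selections(per_band, min_bands):
    """Return the source IDs selected as AGN candidates in at least `min_bands` bands.

    per_band: dict mapping a band name to an iterable of candidate source IDs
    (hypothetical data layout, assumed for this sketch).
    """
    counts = Counter()
    for ids in per_band.values():
        # set() ensures a source is counted at most once per band
        counts.update(set(ids))
    return {src for src, n in counts.items() if n >= min_bands}


def recovered_fraction(selected, known):
    """Fraction of the known (labeled) sources that appear in the selected sample."""
    known = set(known)
    if not known:
        raise ValueError("the set of known sources is empty")
    return len(set(selected) & known) / len(known)


# Toy example: made-up source IDs, not real VST-COSMOS data.
per_band = {
    "g": {1, 2, 3, 5},
    "r": {1, 2, 4, 5},
    "i": {1, 3, 4, 5, 6},
}
known_agn = {1, 2, 3, 4}

all_three = combine_band_selections(per_band, min_bands=3)
two_of_three = combine_band_selections(per_band, min_bands=2)

print(sorted(all_three), recovered_fraction(all_three, known_agn))
print(sorted(two_of_three), recovered_fraction(two_of_three, known_agn))
```

In this toy case, relaxing the requirement from three bands to two enlarges the merged sample and raises the recovered fraction, which mirrors the qualitative trade-off between purity and completeness reported in the abstract.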
