
Scavenger hunt: Selection of obscured active galactic nuclei combining multiband optical variability and colors

Demetra De Cicco, Stefano Cavuoti, Maurizio Paolillo, Vincenzo Petrecca, Ylenia Maruccia, Paula Sánchez-Sáez

TL;DR

This work tackles the difficulty of identifying obscured AGN in optical time-domain data by combining variability-based features with static colors in a three-band (g, r, i) random forest (RF) framework applied to the VST-COSMOS dataset over a 3.3-year baseline. With a carefully selected feature set, including GP_DRW_tau, eta^e, and multiple colors, the three-band plus color selection recovers $58^{+9}_{-8}\%$ of all AGN and $69^{+10}_{-8}\%$ of known obscured AGN when confirmation is required in all three bands; requiring confirmation in only two of the three bands raises these fractions to $69^{+10}_{-8}\%$ and $80^{+10}_{-9}\%$, respectively. Colors are crucial for recovering obscured AGN, but a color-only selection returns a markedly higher contamination rate. The study uses two labeled AGN sets (spec-MIR and main LS) to show that a more diversified sample increases obscured-AGN recall despite lower overall classifier performance. The results have practical implications for LSST's Deep Drilling Fields (DDFs), suggesting that a combined variability-color criterion can yield a more complete obscured-AGN census and guide follow-up campaigns, with future work extending to additional DDFs and spectroscopic validation.

Abstract

As wide-field optical surveys such as Vera Rubin Observatory's Legacy Survey of Space and Time (LSST) begin operations, time-domain astronomy is facing a data revolution, paving the road for new, expanded variability studies. This work leverages the complementary power of optical variability and color selection to identify active galactic nuclei (AGN), focusing on optimizing the identification of obscured AGN, typically more challenging to distinguish from inactive galaxies based on optical variability alone. The analysis is designed to provide valuable insights in the context of performance preview for the LSST, albeit using a scaled-down version of the LSST dataset. We present the first combined AGN selection based on g+r+i band light curves from the VST-COSMOS survey, spanning 3.3 yr. We identify AGN candidates independently in each band using a random forest (RF) classifier trained on features mainly related to optical variability, along with six optical/infrared colors and a morphology indicator. We subsequently merge the three band-specific samples in order to enhance selection purity and reliability. We then focus on defining a subset of features that significantly improve the identification of obscured AGN. The RF classifiers yield a consistent performance across the three bands, highlighting the critical role of contamination. Using the combined three-band plus color selection we successfully recover $58^{+9}_{-8}\%$ of all AGN and $69^{+10}_{-8}\%$ of the known obscured AGN that have been independently confirmed in all three bands. When requiring confirmation in two out of the three bands, these fractions increase to $69^{+10}_{-8}\%$ and $80^{+10}_{-9}\%$, respectively. We also demonstrate that, while combining variability features with colors is crucial to improve obscured AGN selection, relying solely on color features returns a markedly higher contamination rate.
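The band-merging step described in the abstract can be illustrated with a minimal sketch. It assumes each band-specific classifier has already produced a boolean selection per source; the function names `combine_bands` and `recall` and the toy arrays are hypothetical, not taken from the paper, and the sketch does not reproduce the quoted uncertainties.

```python
import numpy as np

def combine_bands(selected_g, selected_r, selected_i, min_bands=3):
    """Boolean mask of sources selected in at least `min_bands` of the three bands."""
    votes = np.stack([selected_g, selected_r, selected_i]).astype(int).sum(axis=0)
    return votes >= min_bands

def recall(selected, is_known_agn):
    """Fraction of known AGN recovered by a selection."""
    selected = np.asarray(selected, dtype=bool)
    is_known_agn = np.asarray(is_known_agn, dtype=bool)
    return (selected & is_known_agn).sum() / is_known_agn.sum()

# Toy, made-up per-band selections for five sources (illustration only).
g = np.array([True, True, False, True, False])
r = np.array([True, True, True, False, False])
i = np.array([True, False, True, False, False])
known = np.array([True, True, True, True, False])  # hypothetical labels

all_three = combine_bands(g, r, i, min_bands=3)     # strict: all three bands agree
two_of_three = combine_bands(g, r, i, min_bands=2)  # relaxed: any two bands agree

print(recall(all_three, known), recall(two_of_three, known))
```

Relaxing `min_bands` from 3 to 2 can only add sources to the selection, so recall cannot decrease; this mirrors the direction of the change reported in the abstract, while the effect on contamination has to be assessed separately.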

Paper Structure (11 sections, 3 figures, 7 tables)


Figures (3)

  • Figure 1: Mid-infrared color-color diagram for our sample of 96 obscured AGN. The solid and dashed lines define the regions where AGN are typically found according to Lacy et al. (2007) and Donley et al., respectively. Filled black dots indicate sources that were classified as AGN in all three bands with an 8/10 threshold, while empty circles indicate sources classified as AGN in one or two bands only (using the same 8/10 threshold in each band), according to the legend in the figure.
  • Figure 2: Color-color diagrams where the obscured AGN that were not identified in any of the three bands (black crosses) tend to occupy specific loci. In both panels, the gray triangles represent inactive galaxies from the LS, while black stars represent the stars in the LS. The large violet dots indicate the obscured AGN confirmed in all three bands by at least eight out of ten experiments, while the bright pink squares indicate the ones confirmed in two out of the three bands based on the same threshold. The small green dots stand for all the AGN in the LS that are not labeled as obscured.
  • Figure 3: Distribution of the values obtained from the $r$-band light curves for the Autocor_length feature. The AGN in the LS are split into obscured AGN confirmed in all three bands (deep magenta), in the $r$ band plus either $g$ or $i$ (light magenta), or in the $g$ and $i$ bands, but not in the $r$ (light pink). Dark green indicates all the other AGN, meaning all the AGN in the LS that are not labeled as obscured, while the non-AGN are shown in light green. Black represents the obscured AGN that each classifier consistently failed to identify (that is to say, they were classified as AGN in $<8/10$ experiments per band).
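The 8/10 threshold used in the captions above can be sketched as follows. This is an illustrative reading, assuming each band has ten classification experiments that each return a 0/1 label per source; the helper names and the vote counts are hypothetical and are not taken from the paper.

```python
import numpy as np

def confirmed_in_band(predictions, min_votes=8):
    """predictions: (n_experiments, n_sources) array of 0/1 labels.
    A source is confirmed if at least `min_votes` experiments label it AGN."""
    return np.asarray(predictions).sum(axis=0) >= min_votes

def count_confirming_bands(preds_by_band, min_votes=8):
    """Number of bands (0 to 3) in which each source passes the vote threshold."""
    masks = [confirmed_in_band(p, min_votes) for p in preds_by_band.values()]
    return np.sum(masks, axis=0)

def votes_to_predictions(vote_counts, n_experiments=10):
    """Build a toy 0/1 prediction matrix with the requested votes per source."""
    counts = np.asarray(vote_counts)
    return (np.arange(n_experiments)[:, None] < counts[None, :]).astype(int)

# Made-up vote counts for three sources in each band (illustration only).
preds = {
    "g": votes_to_predictions([10, 8, 3]),
    "r": votes_to_predictions([9, 7, 2]),
    "i": votes_to_predictions([8, 9, 0]),
}
n_bands = count_confirming_bands(preds)
print(n_bands)  # per-source number of confirming bands
```

In this toy case the first source is confirmed in all three bands, the second in two (it misses the threshold in r with 7/10), and the third in none, which corresponds to the filled, intermediate, and unidentified categories distinguished in the figures.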