EMU/GAMA: A statistical perspective on active galactic nuclei diagnostics

J. Prathap, A. M. Hopkins, R. Carvajal, M. Cowley, S. M. Croom, D. Farrah, I. Prandoni, S. S. Shabala, J. Th. van Loon, C. Pappalardo, K. A. Pimbblet, U. T. Ahmed, M. Bilicki, M. J. I. Brown, D. Leahy, A. Mailvaganam, J. R. Marvil, T. Mukherjee, S. F. Rahman, T. Vernstrom, J. Willingham, T. Zafar

Abstract

While it is well known that galaxies are composites of many emission processes, quantifying the various contributions remains challenging. In this work, we use unsupervised machine-learning clustering algorithms to evaluate the agreement between the clustering tools and astrophysical classifications, and hence quantify the fractional contributions of star formation processes and nuclear black hole activity to the total galaxy energy budget of radio sources. We perform clustering on the multiwavelength (optical, infrared (IR), and radio) active galactic nuclei (AGN) diagnostic spaces, using data from the G09 and G23 fields of the Galaxy and Mass Assembly (GAMA) survey, the Evolutionary Map of the Universe (EMU) survey, and the Wide-field Infrared Survey Explorer (WISE). We find that the statistical clustering recovers $\approx 90\%$ of the star-forming galaxies (SFGs) and $\approx 80\%$ of the AGN. We define a new IR-radio AGN diagnostic scheme that separates radio AGN from IR SFGs and IR AGN, corresponding to the KMeans cluster with approximately 90% reliability. We demonstrate the superior power of radio AGN selection in higher dimensions using a three-dimensional space composed of directly observable parameters ($\rm W_1-W_2$ colour, $\rm W_2$ magnitude, and the 1.4 GHz radio flux density). This novel three-dimensional diagnostic shows strong potential, yielding a radio AGN selection that is close to 90% reliable and 90% complete. We also publish a catalogue of radio sources in the EMU survey with associated probabilities of being active in the optical regime, through which we emphasise the philosophy of treating a galaxy as composed of various fractions rather than assigning a binary classification of SFG or AGN.
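The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it is a plain Lloyd's-algorithm k-means written with the Python standard library, run on synthetic points in a three-dimensional space loosely modelled on the one named above ($\rm W_1-W_2$ colour, $\rm W_2$ magnitude, log flux density). The blob centres, the scatter, the function name `kmeans`, and the farthest-point seeding are all assumptions made for illustration.

```python
import math
import random


def kmeans(points, k, n_iter=100):
    """Plain Lloyd's algorithm with deterministic farthest-point seeding."""
    centres = [points[0]]
    while len(centres) < k:
        # Next seed: the point farthest from its nearest existing centre.
        centres.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centres))
        )
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centre.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centres[j])) for p in points
        ]
        # Update step: each centre moves to the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                updated.append(centres[j])  # keep an empty cluster's centre
        if updated == centres:
            break  # assignments stopped changing
        centres = updated
    return labels, centres


# Invented blob centres in (colour, magnitude, log10 flux density);
# these are illustrative numbers, not values taken from the paper.
toy_centres = [(0.2, 14.0, -3.5), (1.0, 13.0, -3.3), (0.1, 14.5, -1.8)]
rng = random.Random(42)
points = [
    tuple(rng.gauss(c, 0.05) for c in centre)
    for centre in toy_centres
    for _ in range(50)
]

labels, centres = kmeans(points, k=3)
print(sorted(labels.count(j) for j in range(3)))
```

On real data the three axes span very different ranges, so features are commonly standardised before a distance-based method such as k-means is applied. Whether the paper does so is not stated in this excerpt.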

Paper Structure (32 sections, 2 equations, 9 figures, 4 tables)


Figures (9)

  • Figure 1: The redshift distributions of the radio sources in the G09 (solid line) and G23 (dashed line) fields. These radio sources, which belong to the optical-radio sample, are selected as described in the text (see § \ref{sec:final_sample}). The requirement of an H$\alpha$ line for BPT classification results in the sudden drop at $z\approx0.34$. The few higher-redshift G09 objects most likely result from the higher completeness and slightly fainter magnitude limit of that field.
  • Figure 2: Performance of various clustering tools in different optical diagnostic spaces. Rows correspond to the clustering tools and columns to the empirical diagnostic tools. Panels a-d: the clusters identified by KMeans, plotted in the BPT, MEx, blue, and CEx diagrams, respectively. Panels e-h: the clusters identified by GMM; panels i-l: the clusters identified by FCM; panels m-p: the clusters identified by BIRCH, each plotted in the same order of empirical diagnostics. The SFG and AGN regions are labelled in the second row. Composite galaxies occupy the region between the demarcation lines in the first three columns; the CEx diagram does not define a composite region. The purple and green clusters correspond to the star-forming species, following a metallicity sequence. The orange clusters appear to occupy the region identified as AGN in each of these diagnostic spaces. Each plot features two marginal histograms showing the normalised densities of the three clusters identified by the corresponding clustering tool.
  • Figure 3: Performance of various clustering algorithms in different IR and radio diagnostic spaces. Rows correspond to the clustering tools and columns to the empirical diagnostic tools. Panels a-d: the clusters identified by KMeans, plotted in the IR diagnostic spaces defined by 2018ApJS..234...23A, 2012MNRAS.426.3271M, 2012ApJ...754..120M, and the IR-radio diagnostic space defined by 2021ApJ...910...64K, respectively. Panels e-h: the clusters identified by GMM; panels i-l: the clusters identified by FCM; panels m-p: the clusters identified by BIRCH, each plotted in the same order of empirical IR-radio diagnostics. The clustering appears to work well only for KMeans (a-d), where we are able to compare the clusters with the empirical classifications. In panels a-d, the purple cluster represents IR SFGs, the orange cluster represents IR AGN, and the green cluster represents radio AGN (see text for details). Each plot features two marginal histograms showing the normalised densities of the three clusters identified by the corresponding clustering tool.
  • Figure 4: The distribution of KMeans clusters in various IR-radio spaces. Panel a shows the 2018ApJS..234...23A diagnostic space without the demarcation line. Panel b shows the $\rm W_1-W_2$ colour as a function of $S_{1.4\,{\rm GHz}}$; the dashed line represents the 2012ApJ...753...30S demarcation between IR SFGs and IR AGN, and the solid line at $\log_{10}(S_{1.4\,{\rm GHz}}/{\rm Jy})=-2.38$ separates the radio AGN from the IR sources (see text for details). Panel c shows the $\rm W_2$ magnitude as a function of $S_{1.4\,{\rm GHz}}$, where the solid line (Equation \ref{eq:line}) separates the radio AGN from the other sources. This plot differs from the 2021ApJ...910...64K diagnostic, which uses the $\rm W_3$ flux. The colour scheme follows Figure \ref{fig:ir_radio_clusters}, but here we explicitly define the purple cluster as IR SFGs, the orange cluster as IR AGN, and the green cluster as radio AGN, based on the characteristics evident from the discussion so far. The non-normalised densities of the clusters are shown as marginal distributions in the same colour scheme.
  • Figure 5: A three-dimensional IR-radio AGN diagnostic, combining the parameters shown in Figure \ref{fig:2d_ir_radio}. The colour scheme of the data points follows Figure \ref{fig:2d_ir_radio}: the purple cluster represents the IR SFGs, the orange cluster represents the IR AGN, and the green cluster represents the radio AGN. The non-normalised densities of these species are shown in the same colour scheme. The grey plane, defined by Equation \ref{eq:3d_radio_plane}, separates the radio AGN from the IR sources, with the resulting radio AGN selection being over 90% complete and over 90% reliable (see text for details). The crimson plane corresponds to the three-dimensional version of the 2012ApJ...753...30S criterion separating the IR SFGs and IR AGN.
  • ...and 4 more figures
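
The flux-density cut quoted in the Figure 4 caption, together with the completeness and reliability figures quoted for the radio AGN selection, can be illustrated with a short sketch. It rests on several assumptions. Radio AGN are taken to lie on the bright side of the $\log_{10}(S_{1.4\,{\rm GHz}}/{\rm Jy})=-2.38$ line, which the caption does not state explicitly. Completeness is taken as the fraction of reference radio AGN that are selected, and reliability as the fraction of selected sources that are reference radio AGN; these are the usual conventions, not definitions given in this excerpt. The flux densities and reference labels are invented, and the function names are hypothetical.

```python
import math

LOG_S_CUT = -2.38  # log10(S_1.4GHz / Jy), quoted in the Figure 4 caption


def is_radio_agn(s_jy, log_s_cut=LOG_S_CUT):
    """Assumption: radio AGN lie on the bright side of the flux-density cut."""
    if s_jy <= 0:
        raise ValueError("flux density must be positive")
    return math.log10(s_jy) > log_s_cut


def completeness_reliability(is_true, is_selected):
    """Return (completeness, reliability) for boolean sequences of equal length."""
    true_pos = sum(1 for t, s in zip(is_true, is_selected) if t and s)
    n_true = sum(1 for t in is_true if t)
    n_selected = sum(1 for s in is_selected if s)
    completeness = true_pos / n_true if n_true else float("nan")
    reliability = true_pos / n_selected if n_selected else float("nan")
    return completeness, reliability


# Invented 1.4 GHz flux densities (Jy) and reference labels, for illustration only.
flux_jy = [0.020, 0.010, 0.006, 0.005, 0.003, 0.002, 0.001, 0.0005]
reference = [True, True, True, False, True, False, False, False]

selected = [is_radio_agn(s) for s in flux_jy]
completeness, reliability = completeness_reliability(reference, selected)
print(f"completeness = {completeness:.2f}, reliability = {reliability:.2f}")
```

In this toy sample the cut selects the four brightest sources, three of which are reference radio AGN out of four in total, so both quantities come out at 0.75.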