Consensus based Algorithm for Nonparametric Detection of Star Clusters (CANDiSC)

C O Obasi, J G Fernandez Trincado, M Gomez, D Minniti, J Alonso Garcia, B P L Ferreira, E R Garro, B Dias, R K Saito, B Barbuy, M C Parisi, T Palma, B Tang, M Ortigoza Urdaneta, L D Baravalle, M V Alonso, F Mauro

TL;DR

CANDiSC addresses the challenge of detecting star clusters in the crowded, heavily extinguished inner Galaxy by combining three nonparametric density estimators (kernel density estimation, DBSCAN, and nearest-neighbour density estimation) within a consensus framework and applying it to the VVVX survey. The method processes 680 tiles across ≈1100 deg², using a color-magnitude filter and a tessellated spatial grid to identify overdensities, which are accepted as candidates only when at least two methods agree. Validation on real VVVX data and synthetic injections shows high astrometric accuracy, a false-positive rate below 5%, and recovery of known clusters alongside 40 likely new candidates. The approach is scalable to future surveys and can be enhanced with multi-band priors and Gaia data. Overall, CANDiSC provides a robust, automated, and adaptable tool for mining deep infrared photometric surveys to build a more complete census of Galactic star clusters, especially in regions of high extinction and crowding.

Abstract

Context: The VISTA Variables in the Via Lactea (VVV) and its extension (VVVX) are near-infrared surveys mapping the Galactic bulge and adjacent disk. These data have enabled the discovery of numerous star clusters obscured by high and spatially variable extinction. Most previous searches relied on visual inspection of individual tiles, which is inefficient and biased against faint or low-density systems.

Aims: We aim to develop an automated, homogeneous algorithm for systematic cluster detection across different surveys. Here, we apply our method to VVVX data covering low-latitude regions of the Galactic bulge and disk, affected by extinction and crowding.

Methods: We introduce the Consensus-based Algorithm for Nonparametric Detection of Star Clusters (CANDiSC), which integrates kernel density estimation, the Density-Based Spatial Clustering of Applications with Noise (DBSCAN), and nearest-neighbour density estimation within a consensus framework. A stellar overdensity is classified as a candidate if identified by at least two of these methods. We apply CANDiSC to 680 tiles in the VVVX PSF photometric catalogue, covering approximately 1100 square degrees.

Results: We detect 163 stellar overdensities, of which 118 are known clusters. Cross-matching with recent catalogues yields five additional matches, leaving 40 likely new candidates absent from existing compilations. The estimated false-positive rate is below 5 percent.

Conclusions: CANDiSC offers a robust and scalable approach for detecting stellar clusters in deep near-infrared surveys, successfully recovering known systems and revealing new candidates in the obscured and crowded regions of the Galactic plane.
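The two-of-three consensus rule stated in the abstract can be illustrated with a minimal sketch. It assumes that each of the three estimators has already produced a boolean flag per candidate region; the function name `consensus_mask`, the flag lists, and the per-region granularity are hypothetical and are not taken from the CANDiSC code itself.

```python
from typing import List, Sequence


def consensus_mask(flags_by_method: Sequence[Sequence[bool]],
                   min_votes: int = 2) -> List[bool]:
    """Return True for each region flagged by at least `min_votes` methods."""
    n_regions = len(flags_by_method[0])
    if any(len(flags) != n_regions for flags in flags_by_method):
        raise ValueError("every method must provide one flag per region")
    # Count how many methods flagged each region as an overdensity.
    votes = [sum(bool(flags[i]) for flags in flags_by_method)
             for i in range(n_regions)]
    return [v >= min_votes for v in votes]


# Hypothetical flags for five regions (illustrative values, not survey data).
kde_flags = [True, True, False, False, True]
dbscan_flags = [True, False, False, True, True]
knn_flags = [False, True, False, False, True]

candidates = consensus_mask([kde_flags, dbscan_flags, knn_flags])
```

In this toy input the fourth region is flagged by DBSCAN alone and is therefore rejected, which is the behaviour the consensus requirement is meant to enforce against single-method false positives.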

Paper Structure

This paper contains 23 sections, 5 equations, 18 figures, 5 tables.

Figures (18)

  • Figure 1: Survey area used in this study. The gray-shaded region indicates the 680 VVVX tiles included in the analysis, while the black-shaded region shows tiles that were excluded.
  • Figure 2: Color-magnitude and spatial distributions of stars in tiles b0411 (M54; upper panels) and e0621 (Pismis 2; bottom panels). Left panels: 2D histograms showing all sources in the $(J - K_s)$ vs. $K_s$ diagram. Middle panels: Sources passing the color-magnitude selection ($0.4 < J - K_s < 1.4$, $K_s < 17.5$) are overlaid in red, with the selection boundaries marked by dashed blue lines. Right panels: Spatial distributions in RA vs. Dec, where all stars are shown in gray and color-selected sources in red. Blue crosses mark the cluster centers, and dashed blue circles indicate a $0.3^\circ$ radius.
  • Figure 3: Flowchart of CANDiSC, the Consensus-based Algorithm for Nonparametric Detection of Star Clusters.
  • Figure 4: Density distribution maps for the VVVX tiles containing clusters used to validate the CANDiSC code. Left panel: Stellar density maps for the fields of M54 (b0411), NGC 6652 (b0436), NGC 6293 (b0490), and NGC 6325 (b0492). The upper and lower subpanels correspond to different tiles. Right panel: Same maps as in the left panel, now overplotted with the candidate cluster members identified by CANDiSC. The legend indicates the number of recovered members for each cluster.
  • Figure 5: Same as in Fig. 4, but for CWNU 4193 (e0618), Pismis 2 (e0621), Haffner 15 (e0613), and M 19 (e0503). The maps show the stellar density distributions in each VVVX tile, with the identified cluster members overplotted. The legend reports the number of recovered members for each cluster.
  • ...and 13 more figures
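The color-magnitude selection quoted in the Figure 2 caption ($0.4 < J - K_s < 1.4$, $K_s < 17.5$) can be written as a short filter. This is a sketch under stated assumptions: the function name `passes_cmd_cut` and the sample magnitudes are hypothetical, and only the numerical limits come from the caption.

```python
def passes_cmd_cut(j_mag: float, ks_mag: float,
                   color_min: float = 0.4, color_max: float = 1.4,
                   ks_limit: float = 17.5) -> bool:
    """Apply the cut 0.4 < J - Ks < 1.4 and Ks < 17.5 to one source."""
    color = j_mag - ks_mag
    return color_min < color < color_max and ks_mag < ks_limit


# Hypothetical (J, Ks) magnitudes, chosen only to exercise each boundary.
sources = [
    (15.0, 14.2),  # color 0.8, Ks 14.2: inside the selection box
    (16.0, 15.9),  # color about 0.1: too blue
    (18.9, 17.8),  # color about 1.1 but Ks 17.8: too faint
    (14.0, 12.0),  # color 2.0: too red
]

selected = [passes_cmd_cut(j, ks) for j, ks in sources]
```

Only the first source survives the cut; the other three each fail exactly one boundary of the selection box.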