Morphologies for DECaLS Galaxies through a combination of non-parametric indices and machine learning methods: A comprehensive catalog using the Galaxy Morphology Extractor (galmex) code

V. M. Sampaio, Y. Jaffé, C. Lima-Dias, S. Véliz Astudillo, M. Martínez-Marín, H. Méndez-Hernández, R. Herrera-Camus, A. Monachesi

TL;DR

This work presents a homogeneous catalog of non-parametric morphological indices for DECaLS galaxies with effective radii larger than 2 arcsec, and releases the first public catalog of CA[A_S]S+MEGG indices for DECaLS together with the galmex code.

Abstract

Galaxy morphology encodes key information about formation and evolution. Large imaging surveys require automated, reproducible methods beyond visual inspection. Non-parametric indices provide a useful framework, but their performance must be quantitatively assessed. We present a homogeneous catalog of non-parametric morphological indices for DECaLS galaxies with effective radii larger than 2 arcsec. Our goal is to evaluate the reliability of the indices in separating spirals and ellipticals, test their consistency with existing classification schemes, and establish their applicability to upcoming surveys focused on the southern hemisphere. We developed galmex, a modular Python package for preprocessing images and measuring a variety of non-parametric indices. Using bona fide spirals and ellipticals as control samples, we assessed the discriminatory power of each index and compared the indices with CNN-based T-Types and Galaxy Zoo DECaLS labels. We use the indices as input for a Light Gradient Boosted Machine (LightGBM) to obtain probabilistic classifications. Concentration is the most reliable parameter from the Concentration + Asymmetry + Smoothness (CAS) system, while the asymmetry-based indices (A and S) are limited to detecting disturbed morphologies. The MEGG indices (M20, Entropy, Gini, G2) provide stronger separation and trace a gradient with T-Type. Using a simple binary (0/1) label for ellipticals/spirals, classifiers trained on non-parametric indices achieve high accuracy and well-calibrated probabilities, with entropy, concentration, and Gini dominating the feature importance. We release the first public catalog of CA[A_S]S+MEGG indices for DECaLS, together with galmex. We combine the non-parametric indices with a machine learning framework to derive a probabilistic spiral/elliptical separation for galaxies below z~0.15.
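
As an illustration of what a non-parametric index measures, the sketch below computes two of the quantities named in the abstract using their commonly quoted literature definitions: the Gini coefficient of a set of pixel fluxes and the concentration $C = 5\log_{10}(R_{80}/R_{20})$. This is not the galmex API; the function names `gini` and `concentration` are hypothetical, and the exact definitions, segmentation, and preprocessing used by galmex may differ.

```python
import numpy as np


def gini(pixels):
    """Gini coefficient of pixel fluxes (common literature form).

    Hypothetical helper for illustration; not part of galmex.
    Returns 0 for perfectly uniform flux and 1 when a single pixel
    holds all the flux.
    """
    # Sort absolute pixel values in increasing order.
    x = np.sort(np.abs(np.asarray(pixels, dtype=float).ravel()))
    n = x.size
    if n < 2 or x.mean() == 0:
        return float("nan")
    i = np.arange(1, n + 1)
    return float(np.sum((2 * i - n - 1) * x) / (x.mean() * n * (n - 1)))


def concentration(r20, r80):
    """Concentration index C = 5 log10(R80 / R20).

    r20 and r80 are the radii enclosing 20% and 80% of the flux,
    in the same units (assumed to be measured beforehand).
    """
    return float(5.0 * np.log10(r80 / r20))
```

For example, `gini(np.ones(10))` is 0 (flux spread evenly) and `gini([0, 0, 0, 1])` is 1 (all flux in one pixel), which is the sense in which centrally concentrated ellipticals tend toward higher Gini values than extended disks.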

Paper Structure (28 sections, 7 equations, 18 figures, 3 tables)

This paper contains 28 sections, 7 equations, 18 figures, 3 tables.

Figures (18)

  • Figure 1: Detection completeness as a function of the SExtractor detection threshold $k$ (in units of the background rms) for simulated Sérsic galaxies at different ${\rm S/N}$. Solid lines show the median completeness across $1000$ realizations per ${\rm S/N}$; shaded bands indicate the $1\sigma$ scatter. The vertical dashed line marks our adopted threshold $k=1$, which maintains $\gtrsim70\%$ completeness for ${\rm S/N}\ge8$ while limiting spurious detections.
  • Figure 2: Panel (a): Detection completeness in the $\langle\mu_{\rm 2Re}\rangle$ vs. detection threshold plane. Panels (b), (c), and (d) show the average difference between the true and measured central position, eccentricity, and position angle, respectively, on the same grid as panel (a). Two lines are highlighted: the black dashed line shows the detection threshold adopted in our pipeline, and the red dashed line shows the conservative surface-brightness threshold chosen such that reliable galaxy properties can still be recovered.
  • Figure 3: Distribution of GZ 1 selected spiral and elliptical sub-samples in the $f_{\rm smooth}$ versus $f_{\rm disk}$ (see text for the definition) diagram, according to GZ DECaLS results. The dashed black line shows the expected anti-correlation line.
  • Figure 4: Recovery of characteristic radii across size–flux space. Each panel shows a map of the average absolute difference (in arcsec) between the measured and reference values of a given radius, $R_{20}$ (a), $R_{50}$ (b), $R_{80}$ (c), and $R_{\rm P}$ (d), in the apparent magnitude vs. $R_{\rm e}$ plane. The dashed red lines define the approximate average surface brightness assuming a circular ($q = 1$) Sérsic profile. The hatched region above the $\langle \mu_{\rm 2Re}\rangle$ line marks the threshold adopted in this work. Galaxies with surface brightness fainter than 26 mag arcsec$^{-2}$ can yield unreliable shape parameters and characteristic radii. In particular, the hatched region overlaps significantly with the region in which the error in $R_{\rm P}$ exceeds 1 arcsec ($\sim 4$ pixels). This effect is most visible in $R_{\rm P}$ because it is the outermost of the radii and is therefore more prone to background contamination.
  • Figure 5: Distribution of spiral (blue curves) and elliptical (red curves) galaxies in 2D diagrams combining the different non-parametric indices. In each panel, we also include the overlap between the spiral and elliptical distributions, which is calculated using equations \ref{eq:OVL_1D} and \ref{eq:OVL_2D} for histograms and 2D diagrams, respectively.
  • ...and 13 more figures
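
The overlap between spiral and elliptical distributions mentioned in the Figure 5 caption can be illustrated with a generic overlap coefficient for 1D histograms. The paper's own equations are not reproduced in this summary, so the sketch below assumes the common definition, the sum over shared bins of the smaller of the two normalized histogram values; the function name `overlap_1d` and the default of 20 bins are hypothetical choices, not taken from the paper.

```python
import numpy as np


def overlap_1d(a, b, bins=20):
    """Overlap coefficient between two 1D samples.

    Assumed definition: sum over shared bins of min(p_i, q_i), where
    p and q are the histograms of a and b normalized to unit sum.
    Returns a value in [0, 1]; 0 means disjoint, 1 means identical.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Use a common set of bin edges spanning both samples.
    lo = min(a.min(), b.min())
    hi = max(a.max(), b.max())
    edges = np.linspace(lo, hi, bins + 1)
    p, _ = np.histogram(a, bins=edges)
    q, _ = np.histogram(b, bins=edges)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.minimum(p, q).sum())
```

Under this definition, an index that separates the two classes well gives a small overlap, so the value can be used to rank indices by discriminatory power.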