Danger Zone: Establishing Buffers for Enhanced Classification in BPT Diagrams

Changhyun Cho, Ahmad Nemer, Ivan Yu. Katkov, Joseph D. Gelfand

TL;DR

We tackle cross-diagram misclassification in optical emission-line diagnostics by applying Uniform Manifold Approximation and Projection (UMAP) to a feature set of four line ratios: $[\mathrm{O\,III}]/\mathrm{H}\beta$, $[\mathrm{N\,II}]/\mathrm{H}\alpha$, $[\mathrm{S\,II}]/\mathrm{H}\alpha$, and $[\mathrm{O\,I}]/\mathrm{H}\alpha$. We use $\sim$1.3 million spaxels from 6,439 MaNGA galaxies and train on a clean subset in which all three BPT diagrams agree. The resulting embedding shows cluster structure that mirrors the traditional BPT regions, and it exposes buffer zones along the boundaries where ambiguous classifications arise. We also test alternative demarcations based on velocity dispersion and find that cross-diagram inconsistencies persist, which reinforces the value of a data-driven, boundary-aware approach. The framework offers a scalable, unsupervised way to classify ionization sources robustly, to identify physically interesting subpopulations among ambiguous spectra, and to guide the future inclusion of additional observables, such as velocity dispersions, to improve accuracy.

Abstract

This study uses unsupervised machine learning, specifically the uniform manifold approximation and projection (UMAP) algorithm, to classify optical spectra originating from star-forming regions, Seyferts, and low-ionization (nuclear) emission-line regions (LI(N)ERs) based on their line ratios. Typically, the ionization source of a region is determined from the intensity ratios of different pairs of spectral lines. However, with current boundary definitions, $\sim10\%$ of spectra change class from one diagnostic diagram to another. We apply the machine learning technique to $\sim$1.3 million optical spectra from 6,439 galaxies observed in the MaNGA survey. By training UMAP on consistently classified data, we can classify these "ambiguous" spectra and delineate the boundary zones where such ambiguities arise. Furthermore, we identify physically interesting subsets within the ambiguous spectra. Future work will incorporate additional parameters, such as alternative emission-line ratios and velocity dispersions, to enhance classification accuracy.
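The split between consistently classified and "ambiguous" spectra described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `split_by_agreement`, the label strings, and the toy labels are all hypothetical, and it assumes that each of the three diagnostic diagrams has already assigned every spectrum a label from one shared vocabulary.

```python
def split_by_agreement(labels_n2, labels_s2, labels_o1):
    """Partition spectrum indices by whether three diagrams agree.

    Each argument is a list of class labels, one per spectrum, from one
    diagnostic diagram (hypothetical inputs; a shared label vocabulary
    across diagrams is an assumption of this sketch).
    Returns (consistent_indices, ambiguous_indices).
    """
    if not (len(labels_n2) == len(labels_s2) == len(labels_o1)):
        raise ValueError("label lists must have equal length")

    consistent, ambiguous = [], []
    for index, labels in enumerate(zip(labels_n2, labels_s2, labels_o1)):
        # A spectrum is consistent only if all three diagrams give one class.
        if len(set(labels)) == 1:
            consistent.append(index)
        else:
            ambiguous.append(index)
    return consistent, ambiguous


# Toy labels for five spectra (made up for illustration, not MaNGA data).
labels_n2 = ["SF", "SF", "Seyfert", "LIER", "SF"]
labels_s2 = ["SF", "SF", "Seyfert", "LIER", "LIER"]
labels_o1 = ["SF", "Seyfert", "Seyfert", "LIER", "LIER"]

consistent, ambiguous = split_by_agreement(labels_n2, labels_s2, labels_o1)
ambiguous_fraction = len(ambiguous) / len(labels_n2)
print(consistent, ambiguous, ambiguous_fraction)
```

In a workflow of the kind the abstract describes, the consistent indices would select the training set for the embedding, and the ambiguous indices would be the spectra classified afterwards.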

Paper Structure

This paper contains 5 sections.