BALNet: Deep Learning-Based Detection and Measurement of Broad Absorption Lines in Quasar Spectra
Yangyang Li, Zhijian Luo, Shaohua Zhang, Du Wang, Jianzhen Chen, Zhu Chen, Hubing Xiao, Chenggang Shu
TL;DR
BALNet addresses the scalability challenge of BAL trough detection and velocity measurement in large quasar spectroscopic surveys by combining a 1D-CNN with a Bi-LSTM to detect BAL troughs and extract their kinematic properties directly from spectra. The model is trained on a large, carefully constructed mock dataset derived from SDSS DR16Q, enabling simultaneous BAL quasar classification and trough velocity estimation. On simulated data, BALNet achieves robust performance (BAL trough detection: ~83% completeness, ~90.7% purity; BAL quasar identification: ~90.8% completeness, ~94.4% purity; velocity metrics with f_out ~9%, σ_NMAD < 0.03, bias < 1e−5) and a high AU-PRC (~0.92). Applied to 446,839 DR16Q spectra (1.5 ≤ z ≤ 5.7), BALNet identifies 91,164 BAL quasars (20.4% of the sample), including 25,123 newly detected BAL quasars and 8.8% redshifted troughs, demonstrating significant gains in detection efficiency and the ability to map BAL populations across wide velocity ranges. The work also provides public code and catalogs, enabling broader studies of quasar outflows and their evolution.
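As a quick sanity check on the catalog numbers quoted above, the following minimal sketch recomputes the BAL fraction and the share of newly detected BAL quasars from the three counts. The variable names and the `percent` helper are illustrative assumptions and do not come from the BALNet code.

```python
# Counts as quoted in the TL;DR (DR16Q spectra, BAL quasars, newly detected BAL quasars).
n_spectra = 446_839
n_bal = 91_164
n_new = 25_123


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage (hypothetical helper)."""
    return 100.0 * part / whole


bal_fraction = percent(n_bal, n_spectra)  # share of spectra flagged as BAL quasars
new_fraction = percent(n_new, n_bal)      # share of BAL quasars that are newly detected

print(f"BAL fraction of sample: {bal_fraction:.1f}%")
print(f"Newly detected share of BAL quasars: {new_fraction:.1f}%")
```

The first value reproduces the quoted 20.4%, and the second comes out slightly above one quarter.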
Abstract
Broad absorption line (BAL) quasars serve as critical probes for understanding active galactic nucleus (AGN) outflows, black hole accretion, and cosmic evolution. To address the limitations of manual classification in large-scale spectroscopic surveys, where the number of quasar spectra is growing exponentially, we propose BALNet, a deep learning approach that combines a one-dimensional convolutional neural network (1D-CNN) with bidirectional long short-term memory (Bi-LSTM) networks to automatically detect BAL troughs in quasar spectra. BALNet enables both the identification of BAL quasars and the measurement of their BAL troughs. We construct a simulated dataset for training and testing by combining non-BAL quasar spectra and BAL troughs, both derived from SDSS DR16 observations. Results on the test set show that: (1) BAL trough detection achieves 83.0% completeness, 90.7% purity, and an F1-score of 86.7%; (2) BAL quasar classification achieves 90.8% completeness and 94.4% purity; (3) the predicted BAL velocities agree closely with the simulated ground-truth labels, confirming BALNet's robustness and accuracy. When applied to the SDSS DR16 data within the redshift range 1.5 < z < 5.7, at least one BAL trough is detected in 20.4% of spectra. Notably, more than a quarter of these are newly identified sources with significant absorption, 8.8% correspond to redshifted systems, and some narrow/weak absorption features were missed. BALNet greatly improves the efficiency of large-scale BAL trough detection and enables more effective scientific analysis of quasar spectra.
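The trough-detection F1-score quoted above can be checked against the reported completeness and purity. The sketch below assumes the standard definition of F1 as the harmonic mean of purity (precision) and completeness (recall); the abstract does not spell out the formula, and the function name is illustrative.

```python
def f1_score(completeness: float, purity: float) -> float:
    """Harmonic mean of completeness (recall) and purity (precision).

    Assumes the standard F1 definition; the abstract does not state it.
    """
    return 2.0 * completeness * purity / (completeness + purity)


# BAL trough detection values as reported in the abstract.
trough_f1 = f1_score(completeness=0.830, purity=0.907)

# BAL quasar classification: the abstract gives completeness and purity only,
# so this F1 is a derived figure, not a reported one.
quasar_f1 = f1_score(completeness=0.908, purity=0.944)

print(f"Trough detection F1: {100 * trough_f1:.1f}%")
print(f"Quasar classification F1 (derived): {100 * quasar_f1:.1f}%")
```

Under this assumption the trough-detection value agrees with the reported 86.7%.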
