Ameliorating transient noise bursts in gravitational-wave searches for intermediate-mass black holes
Melissa Lopez, Giada Caneva, Ana Martins, Stefano Schmidt, Jonno Schoppink, Wouter van Straalen, Collin Capano, Sarah Caudill
TL;DR
This work tackles the challenge of detecting intermediate-mass black hole mergers in gravitational-wave data by distinguishing genuine signals from glitches. It fuses a GstLAL-style time-domain matched-filtering pipeline with a multi-class multilayer perceptron (MLP) that operates on track-level feature vectors derived from triggers, enabling per-detector classification into astrophysical and glitch categories. Key innovations include a fixed-length, weighted-average track representation, strategies to handle class imbalance with cross-validation, and enforcing time coincidence across detectors to boost significance. The study reports high accuracy on O3a data and demonstrates substantial generalization to O3b when using time-coincident tracks. It also introduces an ML-informed significance statistic (Pinj) that improves false-alarm-rate performance, indicating a practical path toward more robust IMBH searches and a potential extension to other compact binary signals.
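To illustrate the fixed-length, weighted-average track representation, the following minimal sketch collapses a variable-length group of triggers into a single feature vector. The function name `track_to_feature_vector`, the choice of SNR as the weight, and the numerical values are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np


def track_to_feature_vector(trigger_features, weights):
    """Collapse a track of triggers into one fixed-length feature vector.

    trigger_features: shape (n_triggers, n_features), one row per trigger.
    weights: shape (n_triggers,); here assumed to be the trigger SNRs.
    """
    feats = np.asarray(trigger_features, dtype=float)
    w = np.asarray(weights, dtype=float)
    if feats.ndim != 2 or w.shape != (feats.shape[0],):
        raise ValueError("expected one weight per trigger row")
    # Weighted mean over the trigger axis: the output length equals
    # n_features regardless of how many triggers the track contains.
    return np.average(feats, axis=0, weights=w)


# Hypothetical track: three triggers, two features each (illustrative values).
track = [[10.0, 1.0], [20.0, 3.0], [30.0, 5.0]]
snr = [1.0, 1.0, 2.0]
vector = track_to_feature_vector(track, snr)
```

Because the output length depends only on the number of features, tracks with different numbers of triggers can all be passed to the same fixed-input MLP.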
Abstract
The direct observation of intermediate-mass black hole (IMBH) populations would not only strengthen the possible evolutionary link between stellar and supermassive black holes, but also unveil the details of the pair-instability mechanism and elucidate the influence of IMBHs on galaxy formation. Conclusive observation of IMBHs remained elusive until the detection of the gravitational-wave (GW) signal GW190521, which lies with high confidence in the mass gap predicted by the pair-instability mechanism. Although IMBH signals fall in the sensitivity band of current GW detectors, searching for them is challenging because they resemble transient bursts of detector noise, known as glitches. In this proof-of-concept work, we combine a matched-filter algorithm with a machine learning (ML) method to differentiate IMBH signals from glitches. In particular, we build a multi-layer perceptron network to perform a multi-class classification of the output triggers of the matched filter. In this way, we are able to distinguish simulated GW IMBH signals from different classes of glitches that occurred during the third observing run (O3) in single-detector data. We train, validate, and test our model on O3a data, reaching a true positive rate of over $90\%$ for simulated IMBH signals. To assess how well the model generalizes as the observing run evolves, we test on the unseen data of O3b, which yields a true positive rate of over $70\%$. We also combine data from multiple detectors to search for simulated IMBH signals in real detector noise, providing a significance measure for the output of our ML method.
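One common way to combine data from multiple detectors is to keep only candidates that appear at nearly the same time in more than one detector. The sketch below shows such a time-coincidence test for two lists of trigger times. The function name `coincident_pairs`, the 15 ms window, and the example times are hypothetical and are not values from the paper.

```python
from bisect import bisect_left, bisect_right


def coincident_pairs(times_a, times_b, window):
    """Return index pairs (i, j) with |times_a[i] - times_b[j]| <= window."""
    # Sort detector B's times once so each lookup is a binary search.
    sorted_b = sorted((t, j) for j, t in enumerate(times_b))
    b_times = [t for t, _ in sorted_b]
    pairs = []
    for i, t in enumerate(times_a):
        lo = bisect_left(b_times, t - window)
        hi = bisect_right(b_times, t + window)
        pairs.extend((i, sorted_b[k][1]) for k in range(lo, hi))
    return pairs


# Hypothetical trigger times in seconds for two detectors.
detector_a = [100.000, 250.500, 400.000]
detector_b = [100.008, 300.000, 399.950]
pairs = coincident_pairs(detector_a, detector_b, window=0.015)
```

Here only the first trigger in each list survives, since the other candidates are separated by more than the assumed window.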
