Interpretable Boosted Decision Tree Analysis for the Majorana Demonstrator
I. J. Arnquist, F. T. Avignone, A. S. Barabash, C. J. Barton, K. H. Bhimani, E. Blalock, B. Bos, M. Busch, M. Buuck, T. S. Caldwell, Y.-D. Chan, C. D. Christofferson, P.-H. Chu, M. L. Clark, C. Cuesta, J. A. Detwiler, Yu. Efremenko, S. R. Elliott, G. K. Giovanetti, M. P. Green, J. Gruszko, I. S. Guinn, V. E. Guiseppe, C. R. Haufe, R. Henning, D. Hervas Aguilar, E. W. Hoppe, A. Hostiuc, M. F. Kidd, I. Kim, R. T. Kouzes, T. E. Lannen, A. Li, J. M. Lopez-Castano, E. L. Martin, R. D. Martin, R. Massarczyk, S. J. Meijer, T. K. Oli, G. Othman, L. S. Paudel, W. Pettus, A. W. P. Poon, D. C. Radford, A. L. Reine, K. Rielage, N. W. Ruof, D. C. Schaper, D. Tedeschi, R. L. Varner, S. Vasilyev, J. F. Wilkerson, C. Wiseman, W. Xu, C.-H. Yu
TL;DR
The study tackles background suppression in $0νββ$ searches with HPGe detectors by training two gradient-boosted decision trees (MSBDT for multi-site events and αBDT for alpha events) on calibration-based DEP/SEP data, bolstered by data augmentation and distribution matching. A SHAP-based interpretability analysis reveals that AvsE, DCR, and drift-time corrections capture the bulk of the ML gains, and it identifies novel background categories that can inform and improve traditional analyses. The ML models achieve competitive or superior background rejection compared to the standard Majorana analyses across PPC and ICPC detectors, while maintaining signal efficiency, and demonstrate a reciprocal relationship where interpretability guides improvements to conventional cuts. The approach scales to LEGEND-1000 and lays groundwork for waveform-level models, enabling data-driven background suppression with transparent physical interpretation and detector-agnostic training capabilities.
Abstract
The Majorana Demonstrator is a leading experiment searching for neutrinoless double-beta decay with high-purity germanium (HPGe) detectors. Machine learning provides a new way to maximize the amount of information extracted from these detectors, but its data-driven nature makes it less interpretable than traditional analysis. An interpretability study reveals the machine's decision-making logic, allowing us to learn from the machine and feed that knowledge back into the traditional analysis. In this work, we present the first machine learning analysis of data from the Majorana Demonstrator; this is also the first interpretable machine learning analysis of any germanium detector experiment. Two gradient-boosted decision tree models are trained on the data, and a game-theory-based model interpretability study is conducted to understand the origin of their classification power. By learning from data, this analysis recognizes the correlations among reconstruction parameters and uses them to further enhance background rejection. By learning from the machine, this analysis reveals the importance of new background categories, which in turn benefits the standard Majorana analysis. The approach is highly compatible with next-generation germanium detector experiments such as LEGEND, since a single model can be trained simultaneously on a large number of detectors.
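To make the game-theory-based interpretability idea concrete, the following is a minimal sketch of exact Shapley value attribution on a toy scoring function. It is not the collaboration's model or the SHAP library: `toy_score`, its thresholds, the feature values, and the single-event baseline are all hypothetical, chosen only to show how a score is split among correlated reconstruction parameters. Real SHAP analyses average over a background dataset rather than one baseline event.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names, loosely modeled on pulse-shape parameters.
FEATURES = ("avse", "dcr", "drift_time")


def toy_score(event):
    """Hypothetical background score; thresholds are arbitrary illustrations."""
    score = 0.0
    if event["avse"] < -1.0:   # low AvsE: multi-site-like
        score += 0.5
    if event["dcr"] > 2.0:     # high DCR: alpha-like
        score += 0.3
    # Interaction term: drift time only matters together with AvsE.
    if event["avse"] < -1.0 and event["drift_time"] > 1.0:
        score += 0.2
    return score


def coalition_value(event, baseline, coalition):
    """Score with features in `coalition` taken from the event, the rest from the baseline."""
    mixed = {f: (event[f] if f in coalition else baseline[f]) for f in FEATURES}
    return toy_score(mixed)


def shapley_values(event, baseline):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(FEATURES)
    phi = {}
    for feature in FEATURES:
        others = [f for f in FEATURES if f != feature]
        total = 0.0
        for k in range(len(others) + 1):
            # Standard Shapley weight for a coalition of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_f = coalition_value(event, baseline, set(subset) | {feature})
                without_f = coalition_value(event, baseline, set(subset))
                total += weight * (with_f - without_f)
        phi[feature] = total
    return phi


# Hypothetical event and reference (signal-like) baseline.
EVENT = {"avse": -2.0, "dcr": 3.0, "drift_time": 1.5}
BASELINE = {"avse": 0.0, "dcr": 0.0, "drift_time": 0.0}

if __name__ == "__main__":
    for name, value in shapley_values(EVENT, BASELINE).items():
        print(f"{name}: {value:.3f}")
```

In this toy case the attributions sum to the difference between the event score and the baseline score, and the interaction term is shared equally between `avse` and `drift_time`. That sharing is the mechanism by which a Shapley-based study can expose correlations among parameters that independent cuts would miss.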
