Machine-Learning-Based Method for Goodness-of-Fit Test in Amplitude Analysis
Huoyi Hou, Beijiang Liu
TL;DR
This work addresses the challenge of assessing goodness-of-fit in high-dimensional amplitude analyses by reframing it as a two-sample test between the data and Monte Carlo (MC) samples generated from the fit result. It proposes a machine-learning anomaly-detection approach in which a probabilistic classifier (XGBoost) detects localized discrepancies in the multi-body phase space. The test statistics are based on the likelihood ratio test ($LRT$) and the area under the ROC curve ($AUC$), and a bootstrap procedure estimates their null distributions. Demonstrated on the $J/\psi\to\gamma 4\pi$ channel, the method is sensitive to missing subprocess contributions with signal strengths as small as $\lambda=0.01$ (1%), indicating practical utility for complex amplitude analyses. The approach offers an alternative to traditional goodness-of-fit tests in high-dimensional spaces and lays the groundwork for incorporating systematic uncertainties and extending to other multi-body decays.
Abstract
\textbf{Purpose:} Amplitude analysis is a pivotal tool in hadron spectroscopy, fundamentally involving a series of likelihood fits to multi-dimensional experimental distributions. While robust goodness-of-fit tests exist for low-dimensional scenarios, evaluating goodness-of-fit in amplitude analysis remains challenging. \textbf{Methods:} We propose a machine-learning approach using anomaly detection for goodness-of-fit assessment in amplitude analysis. Our method employs a classifier to identify discrepancies between data and fit results in multi-dimensional phase space. \textbf{Results and Conclusion:} Using Monte Carlo simulations of $J/\psi\to\gamma\pi^+\pi^-\pi^0\pi^0$ decays, we demonstrate that this method detects contributions from an additional resonance with a signal strength of 1\%. The detection power is sufficient for practical amplitude analyses, where contributions with fit fractions larger than 1\% are typically included in the nominal fit. This approach shows promise for amplitude analyses of multi-body processes.
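The classifier-based two-sample test described above can be illustrated with a minimal toy sketch. This is not the authors' implementation, and everything in it is an assumption made for illustration: a one-dimensional "phase space" on [0, 1], a flat fit model, a narrow Gaussian bump standing in for a missing resonance, and a simple histogram-based probabilistic classifier swapped in for XGBoost so that the example needs only the Python standard library. The null distribution of the $AUC$ is approximated by repeatedly comparing two independent samples drawn from the fit model; the function names (`fit_histogram_classifier`, `classifier_auc`, `gof_p_value`) are hypothetical.

```python
import random


def auc(pos_scores, neg_scores):
    """Area under the ROC curve via the Mann-Whitney rank sum (ties count 1/2)."""
    labelled = sorted([(s, 1) for s in pos_scores] + [(s, 0) for s in neg_scores])
    rank_sum, i, n = 0.0, 0, len(labelled)
    while i < n:
        j = i
        while j < n and labelled[j][0] == labelled[i][0]:
            j += 1
        mid_rank = (i + 1 + j) / 2.0  # average of the 1-based ranks i+1 .. j
        rank_sum += mid_rank * sum(label for _, label in labelled[i:j])
        i = j
    n_pos, n_neg = len(pos_scores), len(neg_scores)
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def fit_histogram_classifier(data, mc, n_bins=20):
    """Toy probabilistic classifier: per-bin estimate of P(data | x).

    Stand-in for a boosted-tree classifier; assumes x lies in [0, 1].
    """
    def bin_of(x):
        return min(max(int(x * n_bins), 0), n_bins - 1)

    counts_data = [1.0] * n_bins  # start at 1 to avoid empty bins
    counts_mc = [1.0] * n_bins
    for x in data:
        counts_data[bin_of(x)] += 1.0
    for x in mc:
        counts_mc[bin_of(x)] += 1.0

    def score(x):
        b = bin_of(x)
        return counts_data[b] / (counts_data[b] + counts_mc[b])

    return score


def classifier_auc(data, mc):
    """Train on the first half of each sample, evaluate the AUC on the second half."""
    half = min(len(data), len(mc)) // 2
    score = fit_histogram_classifier(data[:half], mc[:half])
    return auc([score(x) for x in data[half:]], [score(x) for x in mc[half:]])


def sample_model(n, rng):
    """Toy fit model: flat on [0, 1]."""
    return [rng.random() for _ in range(n)]


def sample_data(n, rng, lam):
    """Toy data: fit model plus a narrow bump with signal strength lam."""
    return [rng.gauss(0.5, 0.02) if rng.random() < lam else rng.random()
            for _ in range(n)]


def gof_p_value(data, rng, n_null=100):
    """Observed AUC and its p-value against a model-vs-model null distribution."""
    n = len(data)
    observed = classifier_auc(data, sample_model(n, rng))
    null = [classifier_auc(sample_model(n, rng), sample_model(n, rng))
            for _ in range(n_null)]
    p_value = (1 + sum(a >= observed for a in null)) / (n_null + 1)
    return observed, p_value


if __name__ == "__main__":
    rng = random.Random(0)
    observed, p_value = gof_p_value(sample_data(4000, rng, lam=0.1), rng)
    print(f"AUC = {observed:.3f}, p-value = {p_value:.3f}")
```

Evaluating the $AUC$ on a held-out half keeps the null distribution centred near 0.5, so an excess in the observed value points to a region where the fit model fails to describe the data. In a realistic analysis the toy samplers would be replaced by the measured events and MC generated from the fitted amplitude model, and the histogram classifier by a multi-dimensional one.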
