Machine-Learning-Based Method for Goodness-of-Fit Test in Amplitude Analysis

Huoyi Hou, Beijiang Liu

TL;DR

This work addresses the challenge of assessing goodness-of-fit in high-dimensional amplitude analyses by reframing it as a two-sample test between the data and Monte Carlo (MC) samples generated from the fit result. It proposes a machine-learning anomaly-detection approach in which a probabilistic classifier (XGBoost) detects localized discrepancies in the multi-body phase space. Two test statistics are used, one based on the likelihood ratio test ($LRT$) and one on the area under the ROC curve ($AUC$), and a bootstrap procedure estimates their null distributions. Demonstrated on the $J/\psi\to\gamma 4\pi$ channel, the method is sensitive to missing subprocess contributions with signal strengths as small as $\lambda=0.01$ (1%), which indicates practical utility for complex amplitude analyses. The approach offers an alternative to traditional goodness-of-fit tests in high-dimensional spaces and lays the groundwork for incorporating systematic uncertainties and extending to other multi-body decays.
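As a rough illustration of the two test statistics named above, the sketch below computes an $AUC$ and an $LRT$-style statistic from classifier scores. It assumes the classifier has already been trained and outputs, for each event, an estimated probability that the event is data rather than fit-generated MC. The function names, the balanced-sample assumption, and the specific form $2\sum\log[p/(1-p)]$ are illustrative assumptions and are not taken from the paper.

```python
import math


def auc_statistic(scores_data, scores_mc):
    """Fraction of (data, MC) pairs in which the data event scores higher.

    Ties count as one half. A value near 0.5 means the classifier cannot
    separate data from fit-generated MC.
    """
    wins = 0.0
    for s_d in scores_data:
        for s_m in scores_mc:
            if s_d > s_m:
                wins += 1.0
            elif s_d == s_m:
                wins += 0.5
    return wins / (len(scores_data) * len(scores_mc))


def lrt_statistic(scores_data, eps=1e-12):
    """Likelihood-ratio-style statistic from classifier probabilities.

    Assumption: with equally sized training samples, p / (1 - p)
    approximates the data-to-MC density ratio at each event.
    """
    total = 0.0
    for p in scores_data:
        p = min(max(p, eps), 1.0 - eps)  # guard against log(0)
        total += math.log(p / (1.0 - p))
    return 2.0 * total


# Hypothetical held-out classifier scores, for illustration only.
data_scores = [0.55, 0.48, 0.61, 0.50]
mc_scores = [0.49, 0.52, 0.47, 0.50]
print(auc_statistic(data_scores, mc_scores))
print(lrt_statistic(data_scores))
```

The pairwise loop is quadratic in the sample size; a rank-based computation would be preferable for realistic event counts.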

Abstract

Purpose: Amplitude analysis is a pivotal tool in hadron spectroscopy, fundamentally involving a series of likelihood fits to multi-dimensional experimental distributions. While robust goodness-of-fit tests exist for low-dimensional scenarios, evaluating goodness-of-fit in amplitude analysis remains challenging. Methods: We propose a machine-learning approach using anomaly detection for goodness-of-fit assessment in amplitude analysis. Our method employs a classifier to identify discrepancies between data and fit results in multi-dimensional phase space. Results and Conclusion: Using Monte Carlo simulations of $J/\psi\to\gamma\pi^+\pi^-\pi^0\pi^0$ decays, we demonstrate that this method detects contributions from an additional resonance with a signal strength of 1%. The detection power is sufficient for practical amplitude analyses, where contributions with fit fractions larger than 1% are typically included in the nominal fit. This approach shows promise for amplitude analyses of multi-body processes.

Paper Structure

This paper contains 7 sections, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Workflow for the goodness-of-fit test approach using anomaly detection.
  • Figure 2: Empirical p-value distributions under varying signal strengths for additional resonances in (a)(b) Case 1 and (c)(d) Case 2. (a)(c) Using AUC statistic; (b)(d) Using LRT statistic. The red line ($\lambda=0$) represents the null case.
  • Figure 3: Invariant mass distributions of $4\pi$, $3\pi$ and $2\pi$ for the data and the projections of the amplitude analysis with an additional resonance in $3\pi$ ($\lambda = 0.01$). Black dots with error bars represent "exp" (experimental data) and the blue lines represent "fit" (fit results). The dashed lines represent the intensity of each component in the PWA model of "exp". The green lines represent $J/\psi\rightarrow\gamma \eta(1760), \eta(1760)\rightarrow\rho\rho, \rho\rightarrow\pi\pi$. The brown lines represent $J/\psi\rightarrow\gamma f_2(2340), f_2(2340)\rightarrow\rho\rho, \rho\rightarrow\pi\pi$. The purple lines represent $J/\psi\rightarrow\gamma f_0(2100), f_0(2100)\rightarrow\sigma\sigma, \sigma\rightarrow\pi\pi$. The red lines represent the anomaly signal $J/\psi\rightarrow\gamma f_0(2100), f_0(2100)\rightarrow\pi a_1(1260), a_1(1260)\rightarrow\rho\pi, \rho\rightarrow\pi\pi$. The $3\pi$ mass distribution includes both $\pi^0$ combinations, so each event contributes two entries to the histogram.
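
The empirical p-values shown in Figure 2 compare an observed test statistic with a null distribution. A minimal sketch of that comparison follows, assuming the null values come from a bootstrap procedure and that larger values of the statistic indicate a worse fit. The function name and the "+1" correction (which keeps the p-value strictly positive) are assumptions for illustration; the paper may define the p-value differently.

```python
def empirical_p_value(t_obs, t_null):
    """Fraction of null statistics at least as extreme as the observed one.

    Assumption: larger statistic means stronger disagreement between data
    and fit. The +1 terms are a common convention, not the paper's stated
    definition.
    """
    n_extreme = sum(1 for t in t_null if t >= t_obs)
    return (1 + n_extreme) / (1 + len(t_null))


# Hypothetical bootstrap null values and observed statistic.
null_values = [1.0, 2.0, 3.0, 4.0]
print(empirical_p_value(3.0, null_values))
```

Under the null hypothesis ($\lambda=0$), p-values computed this way should be approximately uniformly distributed, while a missing contribution should push them toward zero.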