Multiple testing for signal-agnostic searches of new physics with machine learning

Gaia Grosso, Marco Letizia

TL;DR

This work tackles the bias introduced by model selection in ML-based signal-agnostic searches by introducing a multiple-testing framework over hyperparameters within the New Physics Learning Machine (NPLM), a kernel-based, Neyman-Pearson-inspired goodness-of-fit (GoF) test. Several tests are constructed with different kernel widths $\sigma$, and their outputs are combined through meta-tests (notably min-$p$). The approach yields a more uniform sensitivity across diverse new-physics signals while keeping performance close to that of the best single test. The study shows that min-$p$ is particularly robust for hard-to-detect signals, while other aggregation schemes offer benefits in specific scenarios. The price is increased computation, which can be mitigated by parallelization. This advances unbiased, model-agnostic anomaly detection in collider physics and suggests a path toward combining complementary signal families with principled statistical control of the look-elsewhere effect.

Abstract

In this work, we address the question of how to enhance signal-agnostic searches by leveraging multiple testing strategies. Specifically, we consider hypothesis tests relying on machine learning, where model selection can introduce a bias towards specific families of new physics signals. We show that it is beneficial to combine different tests, characterised by distinct choices of hyperparameters, and that performances comparable to the best available test are generally achieved while providing a more uniform response to various types of anomalies. Focusing on the New Physics Learning Machine, a methodology to perform a signal-agnostic likelihood-ratio test, we explore a number of approaches to multiple testing, such as combining p-values and aggregating test statistics.
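To make the idea of combining p-values concrete, here is a minimal sketch of a min-$p$ meta-test calibrated on background-only toys. It is an illustration under stated assumptions, not the authors' implementation: the function names `empirical_p_values` and `min_p_test` are hypothetical, and the test statistics are assumed to be already computed, one column per hyperparameter choice (for example, per kernel width).

```python
import numpy as np

def empirical_p_values(t_null, t):
    """Per-test empirical p-values of `t` against background-only toys.

    t_null: array (n_toys, n_tests) of test statistics under the null.
    t:      array (n_tests,) or (n_eval, n_tests) of statistics to evaluate.
    Larger statistics are assumed to be more anomalous.
    """
    t = np.atleast_2d(t)
    # For each evaluated row and each test, count null toys at least as extreme.
    exceed = (t_null[None, :, :] >= t[:, None, :]).sum(axis=1)
    # The +1 terms keep the estimate strictly positive.
    return (exceed + 1.0) / (t_null.shape[0] + 1.0)

def min_p_test(t_null, t_obs):
    """Return the per-test p-values and the calibrated min-p global p-value."""
    p_obs = empirical_p_values(t_null, t_obs)[0]
    min_p_obs = p_obs.min()
    # Null distribution of min-p: apply the same selection to every toy.
    # Simplification (assumption): each toy is ranked against the full toy
    # set, itself included, instead of using an independent calibration set.
    min_p_null = empirical_p_values(t_null, t_null).min(axis=1)
    global_p = (np.sum(min_p_null <= min_p_obs) + 1.0) / (len(min_p_null) + 1.0)
    return p_obs, global_p

# Stand-in data: chi-squared toys for three hypothetical tests.
rng = np.random.default_rng(0)
t_null = rng.chisquare(df=10, size=(500, 3))
p_obs, global_p = min_p_test(t_null, np.array([10.0, 10.0, 60.0]))
```

Because the minimum is taken over tests for every toy as well as for the observed data, the global p-value accounts for the selection of the most significant test, which is what keeps the look-elsewhere effect under control in this sketch.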

Paper Structure (20 sections, 21 equations, 4 figures, 5 tables)

This paper contains 20 sections, 21 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: EXPO-1D. Corner plots showing correlations between the p-values obtained from different tests under the background-only hypothesis. Pearson's correlation coefficient ($\rho$) is reported in the legend.
  • Figure 2: EXPO-1D. Illustrative examples of power curves for NPLM tests performed with different choices of $\sigma$. The left-hand panel shows the power curves for a narrow signal, corresponding to the first column in Table \ref{tab:1d-z3}; the right-hand panel shows the power curves for a signal in the tail (fourth column in Table \ref{tab:1d-z3}).
  • Figure 3: MUMU-5D. Power curves for different choices of $\sigma$.
  • Figure 4: LHCO-6D. Power curves for different choices of $\sigma$ (green lines of varying shade), compared with the one for the $\min$-$p$ aggregation (black line).