How to pick the best anomaly detector?

Marie Hein, Gregor Kasieczka, Michael Krämer, Louis Moureaux, Alexander Mück, David Shih

TL;DR

The paper addresses how to select the most sensitive anomaly detector for model-agnostic LHC searches. It introduces ARGOS, a fully data-driven metric that is monotonic with the standard significance improvement characteristic (SIC) in the limit of an ideal background template. The metric is defined as $\text{ARGOS} = \frac{\epsilon_{\text{SR}}}{\sqrt{\epsilon_{\text{BT}}}} - \sqrt{\epsilon_{\text{BT}}}$, where $\epsilon_{\text{SR}}$ and $\epsilon_{\text{BT}}$ are the efficiencies of a classifier cut on the signal-region data and on the background template, respectively. Because it needs only these two samples, ARGOS allows the working point to be optimized without labeled signal. In experiments on LHCO data with three weakly supervised methods (IAD, CWoLa Hunting, CATHODE) and three classifier families (NN, HGB, AdaBoost), ARGOS consistently outperforms selection based on the binary cross-entropy (BCE) loss for hyperparameter, architecture, and epoch choices, and can also guide feature selection. The metric is a practical, label-free tool for tuning detectors on real data and may apply beyond resonant anomaly detection, though the authors acknowledge limitations when the background template is imperfect.
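The definition above can be turned into a short numerical sketch. The following Python snippet evaluates ARGOS for given efficiencies and scans cut values to obtain a "max ARGOS" number from two arrays of classifier scores. The function names `argos` and `max_argos`, the parameters `min_eps_bt` and `n_thresholds`, and the quantile-based threshold scan are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def argos(eps_sr, eps_bt):
    """ARGOS = eps_SR / sqrt(eps_BT) - sqrt(eps_BT), as defined in the TL;DR."""
    eps_sr = np.asarray(eps_sr, dtype=float)
    eps_bt = np.asarray(eps_bt, dtype=float)
    return eps_sr / np.sqrt(eps_bt) - np.sqrt(eps_bt)


def max_argos(scores_sr, scores_bt, min_eps_bt=1e-3, n_thresholds=200):
    """Scan score cuts and return (best ARGOS value, cut that achieves it).

    scores_sr: classifier scores on signal-region data (unlabeled).
    scores_bt: classifier scores on the background template.
    min_eps_bt: hypothetical lower bound on the template efficiency, used here
        only to avoid cuts dominated by statistical noise (an assumption).
    """
    sr = np.sort(np.asarray(scores_sr, dtype=float))
    bt = np.sort(np.asarray(scores_bt, dtype=float))

    # Choose cuts at log-spaced target efficiencies on the background template.
    target_eps = np.geomspace(min_eps_bt, 1.0, n_thresholds)
    cuts = np.quantile(bt, 1.0 - target_eps)

    # Fraction of events with score >= cut, via binary search on sorted scores.
    eps_sr = 1.0 - np.searchsorted(sr, cuts, side="left") / sr.size
    eps_bt = 1.0 - np.searchsorted(bt, cuts, side="left") / bt.size

    valid = eps_bt > 0.0  # guard against division by zero
    values = argos(eps_sr[valid], eps_bt[valid])
    best = int(np.argmax(values))
    return float(values[best]), float(cuts[valid][best])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    template = rng.uniform(0.0, 1.0, 100_000)       # toy background template
    background = rng.uniform(0.0, 1.0, 100_000)     # toy SR background
    signal = rng.uniform(0.99, 1.0, 1_000)          # toy signal at high score
    data = np.concatenate([background, signal])
    print("max ARGOS, background only:", max_argos(background, template)[0])
    print("max ARGOS, with signal:    ", max_argos(data, template)[0])
```

In this toy setup, the signal-region sample that contains high-score events yields a larger max ARGOS than the background-only sample, which is the behavior a label-free selection metric needs. A real analysis would replace the uniform toy scores with classifier outputs evaluated on held-out data.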

Abstract

Anomaly detection has the potential to discover new physics in unexplored regions of the data. However, choosing the best anomaly detector for a given data set in a model-agnostic way is an important challenge which has hitherto largely been neglected. In this paper, we introduce the data-driven ARGOS metric, which has a sound theoretical foundation and is empirically shown to robustly select the most sensitive anomaly detection model given the data. Focusing on weakly-supervised, classifier-based anomaly detection methods, we show that the ARGOS metric outperforms other model selection metrics previously used in the literature, in particular the binary cross-entropy loss. We explore several realistic applications, including hyperparameter tuning as well as architecture and feature selection, and in all cases we demonstrate that ARGOS is robust to the noisy conditions of anomaly detection.
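The link between ARGOS and the SIC mentioned in the TL;DR can be made explicit with a short worked equation. This is a sketch under stated assumptions rather than a quotation from the paper: the signal region is assumed to contain a signal fraction $f_S$, the background template is assumed ideal so that $\epsilon_{\text{BT}} = \epsilon_B$, and the SIC is taken in its standard form $\text{SIC} = \epsilon_S / \sqrt{\epsilon_B}$, with $\epsilon_S$ and $\epsilon_B$ the signal and background efficiencies of the cut.

$$
\begin{aligned}
\epsilon_{\text{SR}} &= f_S\,\epsilon_S + (1 - f_S)\,\epsilon_B, \\
\text{ARGOS} &= \frac{\epsilon_{\text{SR}}}{\sqrt{\epsilon_{\text{BT}}}} - \sqrt{\epsilon_{\text{BT}}}
= \frac{\epsilon_{\text{SR}} - \epsilon_B}{\sqrt{\epsilon_B}}
= f_S\,\frac{\epsilon_S - \epsilon_B}{\sqrt{\epsilon_B}}
= f_S\left(\text{SIC} - \sqrt{\epsilon_B}\right).
\end{aligned}
$$

Under these assumptions, at a fixed background efficiency ARGOS is an increasing linear function of the SIC whenever $f_S > 0$, even though $f_S$ itself is unknown in data.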

Paper Structure

This paper contains 25 sections, 5 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Example of metrics tracked throughout a NN training with $N_{sig}=400$ signal events using the default hyperparameters. We show the supervised max SIC metric (left) evaluated on the test set as well as BCE (middle) and max ARGOS (right), both evaluated on the validation set.
  • Figure 2: Anomaly detection performance (max SIC) as a function of the number of signal events $N_{sig}$ after epoch selection. The epoch selection is performed using the supervised benchmark metric max SIC and the two data-driven metrics (max ARGOS and BCE), shown for IAD (left), CWoLa Hunting (middle) and CATHODE (right).
  • Figure 3: Correlation between median anomaly detection performance max SIC and two data-driven metrics max ARGOS (left) and BCE (right) for all 100 hyperparameter sets for CATHODE using the NN classifier at three example signal injections.
  • Figure 4: Anomaly detection performance (max SIC) after hyperparameter optimization for the NN (left), HGB (middle) and AdaBoost (right) classifiers. The optimization is performed using the supervised benchmark metric max SIC and the two data-driven metrics (max ARGOS and BCE), shown for IAD (top), CWoLa Hunting (middle), and CATHODE (bottom).
  • Figure 5: Anomaly detection performance (max SIC) after architecture selection based on the supervised metric max SIC as a benchmark and the two data-driven metrics max ARGOS and BCE for IAD (left), CWoLa Hunting (middle) and CATHODE (right).
  • ...and 2 more figures