
High-dimensional and Permutation Invariant Anomaly Detection

Vinicius Mikuni, Benjamin Nachman

TL;DR

This paper introduces a permutation-invariant density estimator for particle physics data, based on diffusion models and specifically designed to handle variable-length inputs. The ratio of learned densities is then compared with the ratio obtained from a supervised classification algorithm.

Abstract

Methods for anomaly detection of new physics processes are often limited to low-dimensional spaces due to the difficulty of learning high-dimensional probability densities. Particularly at the constituent level, incorporating desirable properties such as permutation invariance and variable-length inputs becomes difficult within popular density estimation methods. In this work, we introduce a permutation-invariant density estimator for particle physics data based on diffusion models, specifically designed to handle variable-length inputs. We demonstrate the efficacy of our methodology by utilizing the learned density as a permutation-invariant anomaly detection score, effectively identifying jets with low likelihood under the background-only hypothesis. To validate our density estimation method, we investigate the ratio of learned densities and compare to those obtained by a supervised classification algorithm.
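The validation step described in the abstract rests on a standard identity: for balanced classes, an ideal classifier output s(x) = p_1(x) / (p_0(x) + p_1(x)) satisfies log(s / (1 - s)) = log p_1(x) - log p_0(x), so a classifier and a pair of density estimators can be compared on the same quantity. The following is a minimal sketch of that identity using toy 1D Gaussians in place of learned jet densities; the function names, means, and widths are illustrative assumptions and are not taken from the paper.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log-density of a 1D Gaussian; a toy stand-in for a learned log-density."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def log_density_ratio(x):
    """log p_top(x) - log p_qcd(x) from two separately specified densities.

    The two Gaussians are hypothetical placeholders, not jet data.
    """
    return gaussian_logpdf(x, 1.0, 1.0) - gaussian_logpdf(x, 0.0, 1.0)


def ideal_classifier(x):
    """Bayes-optimal output for balanced classes: p_top / (p_top + p_qcd)."""
    p_top = math.exp(gaussian_logpdf(x, 1.0, 1.0))
    p_qcd = math.exp(gaussian_logpdf(x, 0.0, 1.0))
    return p_top / (p_top + p_qcd)


def classifier_log_ratio(x):
    """Recover the log-density ratio from the classifier output s via log(s / (1 - s))."""
    s = ideal_classifier(x)
    return math.log(s / (1.0 - s))


if __name__ == "__main__":
    for x in (-2.0, 0.0, 0.5, 3.0):
        print(x, log_density_ratio(x), classifier_log_ratio(x))
```

In practice neither the densities nor the classifier are exact, so the two estimates agree only approximately; the size of the disagreement is what such a comparison probes.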

Paper Structure

This paper contains 7 sections, 7 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Estimated negative log-likelihood in the model trained exclusively on QCD jets, evaluated on a single jet under multiple permutations of the input particles.
  • Figure 2: Anomaly score for QCD, top quark, and $Z'$ jets evaluated on the model trained exclusively on QCD jet events.
  • Figure 3: Significance improvement characteristic curve for different classes of anomalies investigated in this work.
  • Figure 4: Anomaly score for QCD, top quark, and $Z'$ jets evaluated on the model trained exclusively on top quark jet events.
  • Figure 5: Receiver operating characteristic curve obtained from the unsupervised anomaly detection strategy (left), direct density estimation, and a supervised classifier (right). The density ratio uses the estimated densities from individual diffusion models trained either with QCD or top quark jets. The classifier is trained to separate QCD from top quark jets.
  • ...and 1 more figure
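For readers unfamiliar with the quantity in Figure 3, the sketch below computes a significance improvement characteristic (SIC) curve from per-jet anomaly scores. It assumes the commonly used definition, signal efficiency divided by the square root of background efficiency, and assumes that larger scores (for example, larger negative log-likelihoods) mean more anomalous jets. The function name and the toy scores are hypothetical and are not taken from the paper.

```python
import math


def sic_curve(background_scores, signal_scores):
    """Scan a threshold over all observed scores.

    Returns a list of (signal_eff, background_eff, sic) tuples, where a jet
    passes the selection if its anomaly score is >= the threshold. Thresholds
    that reject every background jet are skipped to avoid dividing by zero.
    """
    thresholds = sorted(set(background_scores) | set(signal_scores))
    points = []
    for t in thresholds:
        sig_eff = sum(s >= t for s in signal_scores) / len(signal_scores)
        bkg_eff = sum(b >= t for b in background_scores) / len(background_scores)
        if bkg_eff > 0:
            points.append((sig_eff, bkg_eff, sig_eff / math.sqrt(bkg_eff)))
    return points


if __name__ == "__main__":
    # Toy anomaly scores, purely illustrative.
    background = [0.1, 0.2, 0.3, 0.4]
    signal = [0.35, 0.5, 0.6, 0.7]
    for sig_eff, bkg_eff, sic in sic_curve(background, signal):
        print(f"sig_eff={sig_eff:.2f} bkg_eff={bkg_eff:.2f} SIC={sic:.3f}")
```

A SIC value above 1 at some threshold indicates that cutting on the anomaly score would increase the nominal signal significance relative to applying no cut.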