SIGMA: Single Interpolated Generative Model for Anomalies
Ranit Das, David Shih
TL;DR
The paper tackles the computational bottleneck of resonant anomaly detection by proposing SIGMA, which trains a single flow on all of the data and interpolates its parameters into the signal region (SR), thereby avoiding per-SR retraining. It builds on conditional flow matching to approximate the background distribution, incorporating a frequency embedding to capture high-frequency components associated with localized signals. The key contributions are a data-driven background template trained on all of the data with fast SR interpolation, a comparative analysis of CR-based and SIGMA interpolation in terms of SIC performance, and substantial training-time gains over prior methods. This approach enables scalable, model-agnostic bump-hunt analyses on large datasets while preserving sensitivity to localized anomalies. The detection framework is guided by the likelihood ratio $R_{\rm optimal}(x)=\frac{p_{\rm data}(x)}{p_{\rm background}(x)}$.
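To make the ratio $R_{\rm optimal}(x)$ concrete, here is a minimal one-dimensional sketch. It is purely illustrative: the Gaussian shapes, the 5% signal fraction, and the function names are assumptions chosen for the example and do not come from the paper. It shows that the ratio stays just below 1 where only background is present and rises well above 1 where a localized signal sits.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def likelihood_ratio(x, signal_fraction=0.05):
    """R(x) = p_data(x) / p_background(x) for a toy mixture (assumed shapes)."""
    p_bg = gaussian_pdf(x, 0.0, 1.0)    # hypothetical smooth background
    p_sig = gaussian_pdf(x, 2.0, 0.2)   # hypothetical narrow, localized signal
    p_data = (1.0 - signal_fraction) * p_bg + signal_fraction * p_sig
    return p_data / p_bg

x = np.array([-1.0, 0.0, 2.0])
ratios = likelihood_ratio(x)
print(ratios)  # close to 1 - signal_fraction away from the signal, larger at x = 2
```

In a real search neither density is known in closed form, which is why the quality of the background model matters: any mismodeling of $p_{\rm background}$ shows up directly in the ratio.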
Abstract
A key step in any resonant anomaly detection search is accurate modeling of the background distribution in each signal region. Data-driven methods like CATHODE accomplish this by training separate generative models on the complement of each signal region, and interpolating them into their corresponding signal regions. Having to retrain the generative model on essentially the entire dataset for each signal region is a major computational cost in a typical sliding window search with many signal regions. Here, we present SIGMA, a new, fully data-driven, computationally efficient method for estimating background distributions. The idea is to train a single generative model on all of the data and interpolate its parameters in sideband regions in order to obtain a model for the background in the signal region. The SIGMA method significantly reduces the computational cost compared to previous approaches, while retaining a similar high quality of background modeling and sensitivity to anomalous signals.
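The idea of fitting once on all of the data and then interpolating parameters from the sidebands can be illustrated with a deliberately simple toy. This is a sketch under strong assumptions, not the SIGMA implementation: the "model" is just the per-mass-bin mean of one feature (a stand-in for the parameters of a generative model), the interpolation is a linear fit, and all names, ranges, and sample sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy background: feature x whose mean drifts linearly with the mass m.
m_bg = rng.uniform(0.0, 10.0, size=20000)
x_bg = rng.normal(loc=0.5 * m_bg, scale=1.0)

# Toy localized signal inside an assumed signal region 4 < m < 6.
m_sig = rng.normal(5.0, 0.2, size=600)
x_sig = rng.normal(6.0, 0.3, size=600)

m_all = np.concatenate([m_bg, m_sig])
x_all = np.concatenate([x_bg, x_sig])

# Step 1: fit the "model" once on ALL data (here: the mean of x in each mass bin).
edges = np.linspace(0.0, 10.0, 21)
centers = 0.5 * (edges[:-1] + edges[1:])
idx = np.clip(np.digitize(m_all, edges) - 1, 0, len(centers) - 1)
bin_means = np.array([x_all[idx == i].mean() for i in range(len(centers))])

# Step 2: for the signal region, discard the fitted parameters and replace them
# with values interpolated from the sideband bins only.
in_sr = (centers > 4.0) & (centers < 6.0)
coeffs = np.polyfit(centers[~in_sr], bin_means[~in_sr], deg=1)
interp_means = np.where(in_sr, np.polyval(coeffs, centers), bin_means)

true_bg_means = 0.5 * centers
print(np.abs(bin_means - true_bg_means)[in_sr])     # biased by the signal
print(np.abs(interp_means - true_bg_means)[in_sr])  # close to the true background
```

Step 1 is run only once regardless of how many signal regions are scanned; only the cheap Step 2 would be repeated per window, which is the source of the computational saving described above.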
