Robust Anomaly Detection for Particle Physics Using Multi-Background Representation Learning
Abhijith Gandrakota, Lily Zhang, Aahlad Puli, Kyle Cranmer, Jennifer Ngadiuba, Rajesh Ranganath, Nhan Tran
TL;DR
This work tackles anomaly detection in high-energy physics with robust multi-background representation learning. The method trains representations that distinguish multiple background processes and uses a NuRD-based objective to decorrelate them from a search variable $z$, so that the resulting anomaly scores are less sensitive to background-specific biases. The authors implement two scores, max logit (ML) and Mahalanobis distance (MD), and report improvements over a single-background VAE baseline on LHC jet data, including higher AUROC and reduced mass sculpting. By using richer background information and enforcing decorrelation across all backgrounds, the approach aims to increase discovery potential for new physics, with practical implications for bump-hunt analyses in particle experiments.
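To make the two scores named above concrete, here is a minimal NumPy sketch of how a max-logit score and a Mahalanobis-distance score are commonly computed from a background classifier's outputs. It is an illustration under stated assumptions, not the authors' implementation: the function names, the sign convention (higher means more anomalous), the covariance shared across background classes, and taking the minimum distance over classes are all choices made here for the example.

```python
import numpy as np


def max_logit_score(logits):
    """Max-logit anomaly score.

    Assumed convention: return the negative of the largest background
    logit, so that higher values mean "more anomalous".
    """
    logits = np.asarray(logits, dtype=float)
    return -logits.max(axis=1)


def fit_mahalanobis(features, labels):
    """Estimate per-class means and a shared precision matrix.

    features: (n, d) representations of background events.
    labels:   (n,) background-class label per event.
    The shared covariance is an assumption of this sketch.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)  # sorted unique labels
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    # Center each event on the mean of its own class.
    centered = features - means[np.searchsorted(classes, labels)]
    cov = centered.T @ centered / len(features)
    precision = np.linalg.pinv(cov)  # pseudo-inverse for numerical safety
    return means, precision


def mahalanobis_score(features, means, precision):
    """Squared Mahalanobis distance to the closest background-class mean."""
    features = np.asarray(features, dtype=float)
    diffs = features[:, None, :] - means[None, :, :]  # shape (n, k, d)
    d2 = np.einsum("nkd,de,nke->nk", diffs, precision, diffs)
    return d2.min(axis=1)
```

In this sketch, an event is flagged as anomalous when no background class claims it confidently (max logit) or when its representation lies far from every background-class mean (Mahalanobis distance).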
Abstract
Anomaly, or out-of-distribution, detection is a promising tool for aiding discoveries of new particles or processes in particle physics. In this work, we identify and address two overlooked opportunities to improve anomaly detection for high-energy physics. First, rather than training a generative model on the single most dominant background process, we build detection algorithms using representation learning from multiple background types, thus taking advantage of more information to improve estimation of what is relevant for detection. Second, we generalize decorrelation to the multi-background setting, thus directly enforcing a more complete definition of robustness for anomaly detection. We demonstrate the benefit of the proposed robust multi-background anomaly detection algorithms on a high-dimensional dataset of particle decays at the Large Hadron Collider.
