Strengthening Anomaly Awareness
Adam Banda, Charanjit K. Khosa, Veronica Sanz
TL;DR
Anomaly Awareness (AA) strengthens unsupervised anomaly detection by adding minimal supervision to a Variational Autoencoder (VAE) through two-stage training: unsupervised background modeling, followed by fine-tuning on a small labeled anomaly set with a loss that penalizes accurate reconstruction of anomalies. The loss is $\mathcal{L}(x)=\mathcal{L}_{\text{VAE}}(x)$ for background samples and $\mathcal{L}(x)=\mathcal{L}_{\text{VAE}}(x)-\lambda_{\text{AA}}\mathcal{E}(x)$ for anomalies, where $\mathcal{E}(x)$ is the reconstruction error and $\lambda_{\text{AA}}$ is ramped up to $\lambda_{\max}$. The approach is validated on MNIST with synthetic anomalies, CICIDS 2017 cyberattack data, LHCO2020 collider data, and SMEFT Higgs production, and improves discrimination between normal and anomalous samples (e.g., with AUC reaching up to ~0.90 on LHCO2020 and 0.71 on SMEFT). These results show that a small amount of anomaly information, integrated during training, substantially improves the generalization of unsupervised anomaly detectors across diverse domains, including high-energy physics scenarios where subtle kinematic deviations may signal new physics.
Abstract
We present a refined version of the Anomaly Awareness framework for enhancing unsupervised anomaly detection. Our approach introduces minimal supervision into Variational Autoencoders (VAEs) through a two-stage training strategy: the model is first trained in an unsupervised manner on background data, and then fine-tuned using a small sample of labeled anomalies to encourage larger reconstruction errors for anomalous samples. We validate the method across diverse domains, including the MNIST dataset with synthetic anomalies, network intrusion data from the CICIDS benchmark, collider physics data from the LHCO2020 dataset, and simulated events from the Standard Model Effective Field Theory (SMEFT). The latter provides a realistic example of subtle kinematic deviations in Higgs boson production. In all cases, the model demonstrates improved sensitivity to unseen anomalies, achieving better separation between normal and anomalous samples. These results indicate that even limited anomaly information, when incorporated through targeted fine-tuning, can substantially improve the generalization and performance of unsupervised models for anomaly detection.
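The fine-tuning objective stated in the TL;DR can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The mean-squared-error form of $\mathcal{E}(x)$, the linear ramp for $\lambda_{\text{AA}}$, the KL weight `beta`, and all function names are assumptions made for illustration.

```python
import numpy as np

def lambda_aa(step, ramp_steps, lambda_max):
    """Assumed linear ramp of lambda_AA from 0 to lambda_max over ramp_steps."""
    frac = min(max(step / ramp_steps, 0.0), 1.0)
    return lambda_max * frac

def reconstruction_error(x, x_hat):
    """Per-sample mean squared error, standing in for E(x) (assumed form)."""
    return np.mean((x - x_hat) ** 2, axis=1)

def kl_divergence(mu, log_var):
    """Per-sample KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=1)

def aa_loss(x, x_hat, mu, log_var, is_anomaly, lam, beta=1.0):
    """Per-sample loss: L_VAE for background, L_VAE - lam * E(x) for anomalies."""
    err = reconstruction_error(x, x_hat)
    vae = err + beta * kl_divergence(mu, log_var)
    # The penalty term is applied only to samples labeled as anomalies.
    return vae - lam * is_anomaly.astype(float) * err

# Toy batch: one background sample and one labeled anomaly.
x = np.array([[0.0, 0.0], [1.0, 1.0]])
x_hat = np.zeros((2, 2))        # hypothetical decoder output
mu = np.zeros((2, 3))           # hypothetical encoder mean
log_var = np.zeros((2, 3))      # hypothetical encoder log-variance
labels = np.array([False, True])

lam = lambda_aa(step=5, ramp_steps=10, lambda_max=1.0)
per_sample_loss = aa_loss(x, x_hat, mu, log_var, labels, lam)
```

In this sketch $\mathcal{L}_{\text{VAE}}$ already contains $\mathcal{E}(x)$, so the anomaly loss equals $(1-\lambda_{\text{AA}})\mathcal{E}(x)$ plus the KL term. The reconstruction term therefore changes sign only once $\lambda_{\text{AA}}$ exceeds 1.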
