Enhancing anomaly detection with topology-aware autoencoders
Vishal S. Ngairangbam, Błażej Rozwoda, Kazuki Sakurai, Michael Spannowsky
TL;DR
The paper tackles anomaly detection in collider data by addressing a fundamental limitation: standard autoencoders with Euclidean latent spaces struggle to faithfully represent momentum-space manifolds with non-trivial topology. It introduces topology-aware autoencoders that embed phase-space distributions onto compact manifolds, notably $S^n$, $S^n \otimes S^m$, and $\mathbb{RP}^2$, and provides explicit constructions that realize these topologies in the latent space. Through toy experiments and a realistic hadronic top-quark decay scenario, the authors show that matching the latent-space topology to the data manifold preserves global structure and improves anomaly separation, with four-dimensional latent spaces of non-trivial topology delivering the best performance in many cases. The work establishes a principled framework for incorporating physical priors into unsupervised learning, enabling robust, topology-consistent anomaly detection in high-energy physics data.
Abstract
Anomaly detection in high-energy physics is essential for identifying new physics beyond the Standard Model. Autoencoders provide a signal-agnostic approach but are limited by the topology of their latent space. This work explores topology-aware autoencoders, embedding phase-space distributions onto compact manifolds that reflect energy-momentum conservation. We construct autoencoders with spherical ($S^n$), product ($S^2 \otimes S^2$), and projective ($\mathbb{RP}^2$) latent spaces and compare their anomaly detection performance against conventional Euclidean embeddings. Our results show that autoencoders with topological priors significantly improve anomaly separation by preserving the global structure of the data manifold and reducing spurious reconstruction errors. Applying our approach to simulated hadronic top-quark decays, we show that latent spaces with appropriate topological constraints enhance sensitivity and robustness in detecting anomalous events. This study establishes topology-aware autoencoders as a powerful tool for unsupervised searches for new physics in particle-collision data.
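To make the idea of a constrained latent space concrete, the following is a minimal NumPy sketch of one way an encoder output could be forced onto each of the three manifold types named above. It is an illustration under stated assumptions, not the authors' construction: the function names `to_sphere`, `to_product_of_spheres`, and `to_projective_plane` are hypothetical, and the $\mathbb{RP}^2$ code uses the sign-invariant outer product $u u^T$, which is a standard way to identify antipodal points but may differ from the embedding used in the paper.

```python
import numpy as np


def to_sphere(z, eps=1e-12):
    """Project rows of z (shape [batch, n+1]) onto the unit sphere S^n.

    Assumption: plain L2 normalization of the raw encoder output.
    """
    norm = np.linalg.norm(z, axis=-1, keepdims=True)
    return z / np.maximum(norm, eps)


def to_product_of_spheres(z, split):
    """Map rows of z onto a product of two spheres.

    The first `split` components are normalized to one sphere and the
    remaining components to the other (split=3 with 6 inputs gives two
    copies of S^2).
    """
    first, second = z[..., :split], z[..., split:]
    return np.concatenate([to_sphere(first), to_sphere(second)], axis=-1)


def to_projective_plane(z):
    """Map rows of z (shape [batch, 3]) to a sign-invariant code for RP^2.

    Assumption: antipodal points u and -u on S^2 are identified by using
    the six independent entries of the symmetric matrix u u^T.
    """
    u = to_sphere(z)
    outer = u[..., :, None] * u[..., None, :]  # shape [batch, 3, 3]
    rows, cols = np.triu_indices(3)            # upper-triangular entries
    return outer[..., rows, cols]              # shape [batch, 6]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    raw = rng.normal(size=(4, 3))  # stand-in for raw encoder outputs
    print(np.linalg.norm(to_sphere(raw), axis=-1))
    print(np.allclose(to_projective_plane(raw), to_projective_plane(-raw)))
```

In a full autoencoder, a layer like one of these would sit between the encoder and the decoder, so that the decoder only ever sees points on the chosen manifold; the anomaly score would then typically be the reconstruction error of each event.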
