Distribution Prototype Diffusion Learning for Open-set Supervised Anomaly Detection
Fuyun Wang, Tong Zhang, Yuanzhi Wang, Yide Qiu, Xin Liu, Xu Guo, Zhen Cui
TL;DR
This work addresses open-set supervised anomaly detection (OSAD) where anomalies are scarce and normal data are abundant, making robust generalization to unseen anomalies essential. It introduces Distribution Prototype Diffusion Learning (DPDL), which jointly learns multiple Gaussian prototypes and a Schrödinger bridge to map normal features toward a prototype space while diffusing away from anomalies, effectively tightening normal boundaries. To enhance cross-domain generalization, Dispersion Feature Learning (DFL) projects features into a hyperspherical space with a von Mises–Fisher–style dispersion objective, increasing inter-sample separability and improving out-of-distribution detection. Empirical results on nine public datasets show state-of-the-art AUC under both general and hard settings, demonstrating strong generalization to unseen anomalies and robustness in few-shot anomaly scenarios, with the framework achieving notable improvements on several challenging industrial and medical datasets.
Abstract
In Open-set Supervised Anomaly Detection (OSAD), existing methods typically generate pseudo anomalies to compensate for the scarcity of observed anomaly samples, while overlooking critical priors of normal samples, which leads to less effective discriminative boundaries. To address this issue, we propose a Distribution Prototype Diffusion Learning (DPDL) method that encloses normal samples within a compact and discriminative distribution space. Specifically, we construct multiple learnable Gaussian prototypes to create a latent representation space for abundant and diverse normal samples, and we learn a Schrödinger bridge that drives a diffusive transition toward these prototypes for normal samples while steering anomaly samples away. Moreover, to enhance inter-sample separation, we design a dispersion feature learning scheme in hyperspherical space, which benefits the identification of out-of-distribution anomalies. Experimental results demonstrate the effectiveness and superiority of the proposed DPDL, which achieves state-of-the-art performance on 9 public datasets.
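To make the idea of dispersion in hyperspherical space more concrete, the following is a minimal sketch, not the authors' objective. It assumes features are L2-normalized onto the unit sphere and penalizes high pairwise cosine similarity through a log-mean-exp term. The function names `l2_normalize` and `dispersion_loss` and the temperature `tau` are hypothetical choices made for illustration only.

```python
import math


def l2_normalize(v):
    """Project a feature vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def dispersion_loss(features, tau=0.1):
    """Log of the mean exp(cosine similarity / tau) over all ordered pairs i != j.

    Lower values mean the normalized features are spread further apart.
    This form is an illustrative assumption, not the paper's exact loss.
    """
    z = [l2_normalize(f) for f in features]
    n = len(z)
    if n < 2:
        raise ValueError("need at least two feature vectors")
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                cos = sum(a * b for a, b in zip(z[i], z[j]))
                total += math.exp(cos / tau)
    return math.log(total / (n * (n - 1)))


# Toy 2-D features: one tightly clustered set, one spread around the circle.
clustered = [[1.0, 0.0], [1.0, 0.01], [1.0, -0.01]]
spread = [[1.0, 0.0], [-0.5, 0.866], [-0.5, -0.866]]
loss_clustered = dispersion_loss(clustered)
loss_spread = dispersion_loss(spread)
```

In this toy setup the clustered set yields a larger loss than the spread set, so minimizing the term pushes normalized features apart. A trainable version would compute the same quantity on batched tensors in an autodiff framework.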
