Deep Positive-Unlabeled Anomaly Detection for Contaminated Unlabeled Data
Hiroshi Takahashi, Tomoharu Iwata, Atsutoshi Kumagai, Yuuki Yamanaka
TL;DR
The paper addresses contaminated unlabeled data in semi-supervised anomaly detection by proposing a deep positive-unlabeled (PU) framework that combines unbiased PU learning with deep detectors such as autoencoders and DeepSVDD. It develops two instantiations, PUAE and PUSVDD, deriving PU-based empirical risks and stabilization mechanisms so that the detectors can be trained without labeled normal data. Experiments on eight image datasets show that PUAE and especially PUSVDD consistently outperform unsupervised and semi-supervised baselines, and that both remain robust to the proportion of anomalies in the unlabeled data and to unseen anomaly types. The framework extends to other detectors and is relevant to real-world anomaly detection tasks, where unlabeled data are often contaminated. Future work includes extending the approach to time-series data and other modalities.
Abstract
Semi-supervised anomaly detection, which aims to improve the anomaly detection performance by using a small amount of labeled anomaly data in addition to unlabeled data, has attracted attention. Existing semi-supervised approaches assume that most unlabeled data are normal, and train anomaly detectors by minimizing the anomaly scores for the unlabeled data while maximizing those for the labeled anomaly data. However, in practice, the unlabeled data are often contaminated with anomalies. This weakens the effect of maximizing the anomaly scores for anomalies, and prevents us from improving the detection performance. To solve this problem, we propose the deep positive-unlabeled anomaly detection framework, which integrates positive-unlabeled learning with deep anomaly detection models such as autoencoders and deep support vector data descriptions. Our approach enables the approximation of anomaly scores for normal data using the unlabeled data and the labeled anomaly data. Therefore, without labeled normal data, our approach can train anomaly detectors by minimizing the anomaly scores for normal data while maximizing those for the labeled anomaly data. Experiments on various datasets show that our approach achieves better detection performance than existing approaches.
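The approximation described in the abstract can be illustrated with a small numerical sketch. It assumes the standard PU mixture model, in which the unlabeled distribution is p_U = alpha * p_A + (1 - alpha) * p_N, so that the expected anomaly score over normal data is (E_U[s] - alpha * E_A[s]) / (1 - alpha). The function name `pu_normal_score`, the assumption that the contamination ratio `alpha` is known, and the clipping at zero (a non-negative correction commonly used in PU learning) are illustrative choices here, not details confirmed by the abstract.

```python
import numpy as np


def pu_normal_score(scores_unlabeled, scores_anomaly, alpha, clip=True):
    """Estimate the mean anomaly score of normal data without normal labels.

    Assumes (hypothetically) that unlabeled data follow the mixture
    p_U = alpha * p_A + (1 - alpha) * p_N with a known ratio alpha.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    mean_u = np.mean(scores_unlabeled)  # average score over unlabeled data
    mean_a = np.mean(scores_anomaly)    # average score over labeled anomalies
    estimate = (mean_u - alpha * mean_a) / (1.0 - alpha)
    # Anomaly scores such as reconstruction errors are non-negative, so a
    # negative estimate signals estimation noise; clipping is an assumed
    # stabilization choice, not necessarily the one used in the paper.
    if clip:
        estimate = max(estimate, 0.0)
    return float(estimate)


if __name__ == "__main__":
    # Toy scores: 8 normal points (score 1.0) and 2 hidden anomalies
    # (score 5.0) in the unlabeled set, so alpha = 0.2.
    unlabeled = np.array([1.0] * 8 + [5.0] * 2)
    labeled_anomalies = np.array([5.0, 5.0])

    naive = float(np.mean(unlabeled))  # biased upward by the contamination
    corrected = pu_normal_score(unlabeled, labeled_anomalies, alpha=0.2)
    print("naive unlabeled mean:", naive)
    print("PU-corrected normal mean:", corrected)
```

In this toy case the naive unlabeled mean is pulled above the true normal score of 1.0 by the hidden anomalies, while the PU-corrected estimate recovers it up to floating-point error. A detector trained to minimize the corrected quantity is therefore not pushed to assign low scores to the anomalies hidden in the unlabeled set.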
