Deep evolving semi-supervised anomaly detection
Jack Belham, Aryan Bhosale, Samrat Mukherjee, Biplab Banerjee, Fabio Cuzzolin
TL;DR
This work formalizes Continual Semi-Supervised Anomaly Detection (CSAD) and proposes a VAE-based CSAD framework that uses deep generative replay with outlier rejection to mitigate catastrophic forgetting under evolving data streams. By combining semi-supervised learning with continual learning and anomaly detection, it shows that EVT-inspired latent-space outlier rejection can yield competitive or superior AUC performance on MNIST and Fashion-MNIST, and remains robust on CIFAR-10. The approach uses a jointly trained encoder, classifier, and decoder with KL, ELBO, and reconstruction losses, augmented by a per-class Weibull model to filter replay samples. Extensive ablations explore how the distribution of labelled and unlabelled data, anomaly labelling, and the proportion of unlabelled anomalies affect performance, highlighting practical considerations for real-world deployment. Overall, the study lays groundwork for CSAD and points to architecture improvements and multi-modal extensions as promising directions for scalable, robust anomaly detection in dynamic environments.
Abstract
This paper formalises the task of continual semi-supervised anomaly detection (CSAD), highlighting the importance of a problem formulation that assumes conditions as close to the real world as possible. After an overview of the relevant definitions of continual semi-supervised learning, its components, its anomaly detection extension, and the training protocols, the paper introduces a baseline model, a variational autoencoder (VAE) that works with semi-supervised data, together with a continual learning method of deep generative replay with outlier rejection. The results show that extreme value theory (EVT) applied to anomaly detection in this way can provide promising results even in comparison to an upper baseline of joint training. The experiments explore the effects of how much labelled and unlabelled data is present, of which class, and where it is located in the data stream. Outlier rejection shows promising initial results, often surpassing a baseline method of Elastic Weight Consolidation (EWC). A baseline for CSAD is put forward along with the specific dataset setups used, for reproducibility and testability by other practitioners. Future research directions include other CSAD settings and further research into efficient continual hyperparameter tuning.
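The summary above names a per-class Weibull model that filters replay samples in latent space, but gives no implementation details. The following is a minimal sketch of one way such EVT-style outlier rejection could look, not the authors' implementation. It assumes that each class is summarised by the mean of its latent vectors, that a two-parameter Weibull is fit by maximum likelihood to the largest distances from that mean, and that a generated sample is rejected when the Weibull CDF of its distance exceeds a threshold. The function names, `tail_size`, and `threshold` are all hypothetical choices made for illustration.

```python
import math
import random


def fit_weibull(samples, iters=100):
    """Maximum-likelihood fit of a two-parameter Weibull (shape k, scale lam).

    Assumes all samples are strictly positive.
    """
    m = max(samples)
    y = [s / m for s in samples]  # rescale into (0, 1] to avoid overflow
    mean_log = sum(math.log(v) for v in y) / len(y)

    def g(k):
        # MLE condition for the shape parameter; g is increasing in k
        # and its root is the maximum-likelihood shape.
        w = [v ** k for v in y]
        weighted = sum(wi * math.log(v) for wi, v in zip(w, y)) / sum(w)
        return weighted - 1.0 / k - mean_log

    lo, hi = 1e-3, 1e3
    for _ in range(iters):  # bisection on the shape parameter
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    k = 0.5 * (lo + hi)
    lam = m * (sum(v ** k for v in y) / len(y)) ** (1.0 / k)
    return k, lam


def weibull_cdf(x, k, lam):
    """Weibull CDF, evaluated in the log domain to avoid overflow."""
    if x <= 0:
        return 0.0
    t = k * math.log(x / lam)
    return 1.0 if t > 700 else 1.0 - math.exp(-math.exp(t))


def fit_class_model(latents, tail_size=20):
    """Assumed per-class model: latent mean plus Weibull fit on the distance tail."""
    dim = len(latents[0])
    mean = [sum(z[d] for z in latents) / len(latents) for d in range(dim)]
    dists = sorted(math.dist(z, mean) for z in latents)
    k, lam = fit_weibull(dists[-tail_size:])
    return mean, k, lam


def outlier_probability(z, model):
    """Weibull CDF of the distance between z and the class mean."""
    mean, k, lam = model
    return weibull_cdf(math.dist(z, mean), k, lam)


def filter_replay(samples, models, threshold=0.95):
    """Keep (latent, label) pairs whose outlier probability is below threshold."""
    return [(z, c) for z, c in samples
            if outlier_probability(z, models[c]) < threshold]


# Toy usage: one class whose latents follow a 2-D standard normal.
random.seed(0)
latents = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(200)]
models = {0: fit_class_model(latents)}
p_inlier = outlier_probability([0.0, 0.0], models[0])
p_outlier = outlier_probability([10.0, 10.0], models[0])
kept = filter_replay([([0.1, 0.0], 0), ([10.0, 10.0], 0)], models)
```

In this toy setup a latent close to the class mean receives an outlier probability near zero and is kept for replay, while a latent far from the mean is rejected. In a real pipeline the latents would come from the trained encoder, and the distance measure, tail size, and threshold would need tuning.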
