Robust and Explainable Detector of Time Series Anomaly via Augmenting Multiclass Pseudo-Anomalies

Kohei Obata, Yasuko Matsubara, Yasushi Sakurai

TL;DR

RedLamp addresses unsupervised time-series anomaly detection (TSAD) under anomaly contamination by adopting a multiclass anomaly assumption and using diverse time-series-specific augmentations to generate multiclass pseudo-anomalies. It learns a multiclass boundary with soft labels through a joint objective of masked reconstruction and cross-entropy losses, and combines reconstruction error with an adjusted anomaly-class score to form a robust anomaly score. Empirical results on five real-world datasets show superior performance and robustness to contamination, along with an explainable latent space that reveals how real anomalies relate to augmentation classes. The approach mitigates the diversity gap and the false anomalies inherent in binary-augmentation methods, yielding a practical and interpretable framework for TSAD.

Abstract

Unsupervised anomaly detection in time series has been a pivotal research area for decades. Current mainstream approaches focus on learning normality, on the assumption that all or most of the samples in the training set are normal. However, anomalies in the training set (i.e., anomaly contamination) can be misleading. Recent studies employ data augmentation to generate pseudo-anomalies and learn the boundary separating the training samples from the augmented samples. Although this approach mitigates anomaly contamination if augmented samples mimic unseen real anomalies, it suffers from several limitations. (1) Covering a wide range of time series anomalies is challenging. (2) It disregards augmented samples that resemble normal samples (i.e., false anomalies). (3) It places too much trust in the labels of training and augmented samples. In response, we propose RedLamp, which employs diverse data augmentations to generate multiclass pseudo-anomalies and learns the multiclass boundary. Such multiclass pseudo-anomalies cover a wide variety of time series anomalies. We conduct multiclass classification using soft labels, which prevents the model from being overconfident and ensures its robustness against contaminated/false anomalies. The learned latent space is inherently explainable as it is trained to separate pseudo-anomalies into multiple classes. Extensive experiments demonstrate the effectiveness of RedLamp in anomaly detection and its robustness against anomaly contamination.
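
The joint objective described above (masked reconstruction plus cross-entropy on soft labels) can be illustrated with a minimal sketch. This is not the authors' implementation. Several details are assumptions made only for illustration: uniform label smoothing stands in for the paper's soft labels, the mask convention (1 marks the inserted pseudo-anomaly, and those positions are excluded from the reconstruction error) is assumed, and the names `soft_labels`, `joint_loss`, and `ce_weight` are hypothetical.

```python
import math

def soft_labels(true_class, num_classes, smoothing=0.1):
    # Uniform label smoothing: an assumed stand-in for the paper's soft labels.
    off_value = smoothing / (num_classes - 1)
    return [1.0 - smoothing if k == true_class else off_value
            for k in range(num_classes)]

def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_cross_entropy(logits, target):
    # Cross-entropy between a soft target distribution and softmax(logits).
    return -sum(t * math.log(p) for t, p in zip(target, softmax(logits)))

def masked_reconstruction_error(reconstruction, window, anomaly_mask):
    # Mean squared error over positions where mask == 0.
    # Assumed convention: mask == 1 marks the inserted pseudo-anomaly.
    kept = [(r - x) ** 2
            for r, x, m in zip(reconstruction, window, anomaly_mask) if m == 0]
    return sum(kept) / len(kept) if kept else 0.0

def joint_loss(reconstruction, window, anomaly_mask, logits, target,
               ce_weight=1.0):
    # ce_weight is a hypothetical balancing factor between the two terms.
    return (masked_reconstruction_error(reconstruction, window, anomaly_mask)
            + ce_weight * soft_cross_entropy(logits, target))
```

With a soft target, even a sample whose label is wrong (a contaminated anomaly labeled normal, or a false anomaly labeled anomalous) contributes a bounded push toward its nominal class, which is the intuition behind the robustness claim in the abstract.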

Paper Structure

This paper contains 36 sections, 7 equations, 9 figures, 5 tables, and 1 algorithm.

Figures (9)

  • Figure 1: Comparison of three assumptions. The true boundary represents the border between normal and anomaly samples in the test set that we want to predict. (a) Contaminated anomalies affect the learned boundary. (b) Pseudo-anomalies are generated by a single data augmentation; the diversity gap and false anomalies degrade detection performance. (c) Diverse data augmentations are employed to fill the diversity gap. Some data augmentations (purple) may be useless and form a pseudo-normal class.
  • Figure 2: (a) Augmentation includes one normal and 11 different types of anomalies. A gray dotted line indicates a sequence before augmentation, and a red background indicates the range of the inserted anomaly. (b) RedLamp trains the model with the augmented training set consisting of augmented instances, one-hot labels, and anomaly masks. The model predicts anomaly-free reconstruction and multiclass labels from the same embedding.
  • Figure 3: Ablation study results.
  • Figure 4: Comparison of multiclass and binary classification.
  • Figure 5: Robustness w.r.t. anomaly contamination.
  • ...and 4 more figures
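
As a rough illustration of the augmented training set described in the Figure 2 caption (augmented instance, one-hot label, anomaly mask), the following sketch builds one such triple. It is a sketch under assumptions, not the paper's augmentation code: the spike augmentation, its `magnitude` parameter, and the class index passed in are hypothetical, and only the class count (one normal class plus 11 anomaly types) comes from the caption.

```python
import random

NUM_CLASSES = 12  # one normal class plus 11 anomaly types, per Figure 2(a)

def keep_normal(window):
    # Class 0 is assumed to be "normal": unchanged window, all-zero mask.
    label = [1] + [0] * (NUM_CLASSES - 1)
    return list(window), label, [0] * len(window)

def augment_spike(window, class_id, rng, magnitude=3.0):
    # Hypothetical spike pseudo-anomaly: add `magnitude` at one random position.
    # Returns (augmented window, one-hot label, anomaly mask).
    position = rng.randrange(len(window))
    augmented = list(window)  # copy so the original window is untouched
    augmented[position] += magnitude
    label = [1 if k == class_id else 0 for k in range(NUM_CLASSES)]
    mask = [1 if t == position else 0 for t in range(len(window))]
    return augmented, label, mask
```

The mask records exactly where the anomaly was inserted, which is what lets a model be trained to reconstruct the anomaly-free sequence while classifying the augmentation type from the same embedding.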