Position: Quo Vadis, Unsupervised Time Series Anomaly Detection?

M. Saquib Sarfraz; Mei-Yen Chen; Lukas Layer; Kunyu Peng; Marios Koulakis

TL;DR

This paper critically assesses unsupervised time-series anomaly detection (TAD), arguing that current state-of-the-art deep learning approaches offer limited gains and that flawed evaluation protocols and weak benchmarking make progress look larger than it is. It introduces simple baselines, both non-neural and basic neural ones, demonstrating that these can match or surpass complex models, and shows that many deep models behave like linear detectors when distilled. Through ablations (normalization, PCA dimension) and analysis of the learned functions, the work highlights that current datasets and metrics may overstate progress and that simpler, interpretable methods deserve more attention. The authors advocate for richer datasets and rigorous, multi-faceted benchmarking to drive meaningful advances in TAD tooling and practice.

Abstract

Machine learning scholarship in Time Series Anomaly Detection (TAD) is plagued by the persistent use of flawed evaluation metrics, inconsistent benchmarking practices, and a lack of proper justification for the design choices made in novel deep learning-based models. Our paper presents a critical analysis of the status quo in TAD, revealing the misleading track of current research and highlighting problematic methods and evaluation practices. Our position advocates for a shift in focus from solely pursuing novel model designs to improving benchmarking practices, creating non-trivial datasets, and critically evaluating the utility of complex methods against simpler baselines. Our findings demonstrate the need for rigorous evaluation protocols and for simple baselines, and reveal that state-of-the-art deep anomaly detection models effectively learn linear mappings. These findings suggest the need for more exploration and development of simple and interpretable TAD methods. The increased model complexity of state-of-the-art deep learning-based models unfortunately offers very little improvement. We offer insights and suggestions for the field to move forward. Code: https://github.com/ssarfraz/QuoVadisTAD
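To make the idea of a "simple baseline" concrete, the sketch below shows one common reading of a PCA reconstruction-error detector: fit principal components on training data assumed to be anomaly-free, then score each test timestamp by how far it falls from that subspace. This is an illustrative assumption, not necessarily the authors' exact method; the function name `pca_error_scores`, the synthetic three-sensor data, and the choice of one component are all hypothetical. The authors' own code is in the repository linked above.

```python
import numpy as np


def pca_error_scores(train, test, n_components):
    """Score each test timestamp by its PCA reconstruction error.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    # Center with statistics from the (assumed anomaly-free) training data.
    mean = train.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    components = vt[:n_components]  # shape: (n_components, n_features)
    centered = test - mean
    # Project onto the retained subspace and map back to sensor space.
    reconstructed = centered @ components.T @ components
    # Squared L2 residual per timestamp: large when a point leaves the subspace.
    return np.sum((centered - reconstructed) ** 2, axis=1)


rng = np.random.default_rng(0)
t = np.linspace(0, 20, 500)
latent = np.sin(t)
# Three correlated synthetic "sensors" driven by one latent signal, plus noise.
train = np.stack([latent, 2 * latent, -latent], axis=1)
train = train + 0.01 * rng.standard_normal((500, 3))
test = train.copy()
# Synthetic anomaly that breaks the correlation between sensors.
test[250] += np.array([1.0, -1.0, 1.0])

scores = pca_error_scores(train, test, n_components=1)
print("highest-scoring timestamp:", int(np.argmax(scores)))
```

The number of retained components is the "PCA dimension" that the paper ablates; in this toy setup a single component suffices because all three sensors follow one latent signal.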

Paper Structure (22 sections, 5 equations, 5 figures, 12 tables)

This paper contains 22 sections, 5 equations, 5 figures, 12 tables.

Introduction
Related Work
Methods
Preliminaries
Proposed simple and effective baselines
Proposed neural network blocks baselines
Univariate time series representation
Evaluation metrics
Analysis
Model setup
Model performance overview
Analysis of the deep models learned function
Ablation: Impact of normalization
Ablation: PCA Error projection dimension
Quo vadis
...and 7 more sections

Figures (5)

Figure 1: Proposed simple neural-network baselines
Figure 2: Visual comparison: The gray shaded areas denote the ground truth anomalies. (a) UCR/IB-18 dataset with a series of sine waves added as anomaly. (b) UCR/IB-19 dataset with random numbers added as anomaly.
Figure 3: Point-wise F1 score as a function of the PCA dimension for the PCA Error method, evaluated on the SWAT and WADI_127 datasets.
Figure 4: Analysis of model agreement on the detected anomalies
Figure 5: Impact of the sliding window size used to generate the univariate data representation, shown on the two UCR dataset traces UCR/IB-17 and UCR/IB-18.
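Figure 3 reports a point-wise F1 score. As a rough illustration of what that metric measures, the sketch below computes F1 by comparing binary labels and binary predictions one timestamp at a time, with no adjustment for anomaly segments. The function name `pointwise_f1` and the toy label sequences are assumptions made for this example; how anomaly scores are thresholded into binary predictions is outside its scope.

```python
def pointwise_f1(y_true, y_pred):
    """Point-wise F1: every timestamp counts as one independent decision.

    Hypothetical helper for illustration; inputs are 0/1 sequences.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("labels and predictions must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: one anomalous segment of three points (indices 2 to 4).
labels = [0, 0, 1, 1, 1, 0, 0, 0]
# The detector hits one point inside the segment and raises one false alarm.
predictions = [0, 0, 0, 1, 0, 0, 1, 0]

f1 = pointwise_f1(labels, predictions)
print(f"point-wise F1 = {f1:.2f}")
```

Here one of three anomalous points is found and one of two alarms is correct, so recall is 1/3 and precision is 1/2. Because each timestamp is scored separately, a detector gets no extra credit for merely touching an anomalous segment.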
