Labels Matter More Than Models: Quantifying the Benefit of Supervised Time Series Anomaly Detection
Zhijie Zhong, Zhiwen Yu, Kaixiang Yang, C. L. Philip Chen
TL;DR
The paper tackles time-series anomaly detection under limited labeling, arguing that model complexity is less critical than leveraging anomaly labels. It introduces STAND, a simple supervised baseline, and conducts the first dedicated benchmark comparing supervised versus unsupervised TSAD methods. Across five datasets, results show that even minimal supervision yields substantial gains over state-of-the-art unsupervised approaches, with STAND delivering better prediction consistency and anomaly localization. The work advocates a data-centric shift in TSAD research and provides open-source code to support broader evaluation and deployment.
Abstract
Time series anomaly detection (TSAD) is a critical data mining task often constrained by label scarcity. Consequently, current research predominantly focuses on Unsupervised Time-series Anomaly Detection (UTAD), relying on complex architectures to model normal data distributions. However, this approach often overlooks the significant performance gains available from the limited anomaly labels that can be obtained in practical scenarios. This paper challenges the premise that architectural complexity is the optimal path for TSAD. We conduct the first methodical comparison between supervised and unsupervised paradigms and introduce STAND, a streamlined supervised baseline. Extensive experiments on five public datasets demonstrate that: (1) Labels matter more than models: under a limited labeling budget, simple supervised models significantly outperform complex state-of-the-art unsupervised methods; (2) Supervision yields higher returns: the performance gain from minimal supervision far exceeds that from architectural innovations; and (3) Practicality: STAND exhibits superior prediction consistency and anomaly localization compared to unsupervised counterparts. These findings advocate for a data-centric shift in TSAD research, emphasizing label utilization over purely algorithmic complexity. The code is publicly available at https://github.com/EmorZz1G/STAND.
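To make the notion of a limited labeling budget concrete, the following is a minimal sketch of supervised window-level detection on a synthetic series. It is not STAND and does not reproduce the paper's protocol: the synthetic data, the logistic-regression scorer, the window width, and names such as `make_windows`, `fit_logreg`, and `budget` are all assumptions made for illustration only.

```python
import numpy as np

def make_windows(series, labels, width):
    """Slice a 1-D series into overlapping windows.
    Each window takes the label of its last time step."""
    n = len(series) - width + 1
    X = np.stack([series[i:i + width] for i in range(n)])
    y = labels[width - 1:]
    return X, y

def predict_proba(X, w, b):
    """Logistic score in [0, 1] for each window (logits clipped for stability)."""
    z = np.clip(X @ w + b, -30.0, 30.0)
    return 1.0 / (1.0 + np.exp(-z))

def fit_logreg(X, y, lr=0.1, steps=500):
    """Plain gradient descent on the logistic loss (illustrative scorer only)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        grad = predict_proba(X, w, b) - y
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

# Synthetic series (assumption): noisy sine wave with injected point anomalies.
rng = np.random.default_rng(0)
T, width = 2000, 16
series = np.sin(np.arange(T) * 0.05) + 0.1 * rng.standard_normal(T)
labels = np.zeros(T)
anom_idx = rng.choice(np.arange(50, T), size=40, replace=False)
series[anom_idx] += 3.0
labels[anom_idx] = 1.0

X, y = make_windows(series, labels, width)

# Labeling budget (assumption): only 10% of the training windows carry labels.
train_end, budget = 1000, 0.1
labeled = rng.choice(train_end, size=int(budget * train_end), replace=False)

w, b = fit_logreg(X[labeled], y[labeled])
scores = predict_proba(X[train_end:], w, b)  # anomaly scores on held-out windows
```

Varying `budget` in such a setup is one way to probe how detection quality responds to the amount of supervision; the sketch itself makes no claim about the resulting scores.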
