OIPR: Evaluation for Time-series Anomaly Detection Inspired by Operator Interest

Yuhan Jing, Jingyu Wang, Lei Zhang, Haifeng Sun, Bo He, Zirui Zhuang, Chengsen Wang, Qi Qi, Jianxin Liao

TL;DR

This work introduces OIPR, an operator-interest-based, area-under-curve evaluator for time-series anomaly detection (TAD) that balances the detection of long, continuous anomalies against the detection of numerous short events. It constructs an operator interest curve that models how operators respond to detector alarms across the discovery, duration, and observation phases, and it computes precision and recall from the overlapping areas of the ground-truth and predicted interest curves, which enables fragment merging and an existence reward. On a specially designed scenario dataset and five real-world datasets, OIPR is robust to extreme cases and yields more reliable detector rankings than traditional point-based or event-based evaluators. The approach subsumes point-wise (PW) and event-based evaluators as special configurations, offering a unified, practical framework for evaluating TAD detectors in diverse settings.

Abstract

With the growing adoption of time-series anomaly detection (TAD) technology, numerous studies have employed deep learning-based detectors to analyze time-series data in the fields of Internet services, industrial systems, and sensors. The selection and optimization of anomaly detectors rely strongly on an effective evaluation of TAD performance. Since anomalies in time-series data often manifest as a sequence of points, conventional metrics that consider only the detection of individual points are inadequate. Existing TAD evaluators typically employ point-based or event-based metrics to capture the temporal context. However, point-based evaluators tend to overestimate detectors that excel only in detecting long anomalies, while event-based evaluators are susceptible to being misled by fragmented detection results. To address these limitations, we propose OIPR (Operator Interest-based Precision and Recall metrics), a novel TAD evaluator with area-based metrics. It models the process of operators receiving detector alarms and handling anomalies, using the area under the operator interest curve to evaluate TAD performance. Furthermore, we build a special scenario dataset to compare the characteristics of different evaluators. Through experiments conducted on the special scenario dataset and five real-world datasets, we demonstrate the remarkable performance of OIPR in extreme and complex scenarios. It achieves a balance between point and event perspectives, overcoming their primary limitations and offering applicability to broader situations.
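The tension between the point and event perspectives described in the abstract can be seen on a toy example. The sketch below is not OIPR itself; it only computes plain point-wise recall and a simple event-wise recall (an event counts as detected if any of its points is flagged). The helper names (`segments`, `point_recall`, `event_recall`) and the label sequences are hypothetical, chosen only for illustration.

```python
def segments(labels):
    """Return (start, end) pairs, end exclusive, for each run of 1s."""
    runs, start = [], None
    for i, value in enumerate(labels):
        if value and start is None:
            start = i
        elif not value and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(labels)))
    return runs


def point_recall(truth, pred):
    """Fraction of anomalous points that the detector flagged."""
    total = sum(truth)
    hits = sum(1 for t, p in zip(truth, pred) if t and p)
    return hits / total if total else 0.0


def event_recall(truth, pred):
    """Fraction of anomaly events with at least one flagged point."""
    events = segments(truth)
    hits = sum(1 for s, e in events if any(pred[s:e]))
    return hits / len(events) if events else 0.0


def make_labels(length, spans):
    """Build a 0/1 label list with 1s on each (start, end) span."""
    labels = [0] * length
    for s, e in spans:
        for i in range(s, e):
            labels[i] = 1
    return labels


# Hypothetical ground truth: one long anomaly (20 points), three short ones (2 points each).
truth = make_labels(60, [(5, 25), (30, 32), (40, 42), (50, 52)])

# Detector A flags only the long anomaly, but covers it completely.
long_only = make_labels(60, [(5, 25)])

# Detector B flags just the first point of every anomaly.
first_points = make_labels(60, [(5, 6), (30, 31), (40, 41), (50, 51)])

for name, pred in [("long_only", long_only), ("first_points", first_points)]:
    print(name, round(point_recall(truth, pred), 3), event_recall(truth, pred))
```

Under these assumptions, the point-wise view favors the detector that covers only the long anomaly, while the event-wise view gives a perfect score to the detector that flags a single point per event. This is the kind of disagreement an evaluator that balances both perspectives is meant to resolve.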

Paper Structure

This paper contains 28 sections, 16 equations, 15 figures, 12 tables, 1 algorithm.

Figures (15)

  • Figure 1: An example highlighting the distinction between point-based and event-based perspectives, comparing three different detectors $d_1$, $d_2$, and $d_3$ for the same ground truth.
  • Figure 2: A demonstration scenario which displays 5 adversary detectors, including the first point detector $d_{fp}$, the long anomaly detector $d_l$, the dispersed disturbance detector $d_{disp}$, the aggregated disturbance detector $d_{aggr}$, and the continuous disturbance detector $d_{cont}$.
  • Figure 3: An example of the operator interest curve for an individual continuous anomaly event.
  • Figure 4: An example of the operator interest curve for fragmented anomaly events.
  • Figure 5: Visualization of the overlapping area of operator curves, corresponding to $TP_{oi}$, $FP_{oi}$, and $FN_{oi}$.
  • ...and 10 more figures