Open Challenges in Time Series Anomaly Detection: An Industry Perspective

Andreas Mueller

TL;DR

This paper argues that practical time-series anomaly detection (TAD) in industry differs substantially from academic benchmarks, centering on two core tenets: alerting and application-specific needs. It advocates for formalizing TAD problems, integrating streaming, side information, and human-in-the-loop feedback, and accounting for RCA and signal-processing considerations through a holistic framework. By combining an illustrative temperature-sensor use-case with analyses of preprocessing, thresholding, and evaluation, the authors highlight gaps in current benchmarks and propose directions for cohorts, conditional anomalies, and online learning with censored feedback. The work stresses the importance of realistic datasets, integrated evaluation, and end-to-end system design to enable reliable, actionable alerts in large-scale cloud environments, with practical impact on reliability, maintenance, and operations.

Abstract

Current research in time-series anomaly detection uses definitions that miss critical aspects of how anomaly detection is commonly used in practice. We list several areas that are of practical relevance and that we believe are either under-investigated or missing entirely from the current discourse. Based on an investigation of systems deployed in a cloud environment, we motivate the areas of streaming algorithms, human-in-the-loop scenarios, point processes, conditional anomalies, and population analysis of time series. This paper serves as a motivation and call for action, including opportunities for theoretical and applied research, as well as for building new datasets and benchmarks.

Paper Structure

This paper contains 28 sections, 10 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: Temperature data for an ultra-low temperature freezer (Unit Haier 810545 from huang2023labelled). Anomaly scores are computed with spectral residual (ren2019time) and thresholded at the 99.9th percentile. The two detected anomalies have very different patterns and real-life consequences.
  • Figure 2: A synthetic time series (top) whose periodicity can easily be read from the autocorrelation function (bottom), but on which current methods fail (see the markers on the autocorrelation function).
  • Figure 3: New York City Citi Bike rental data from the fall of 2015 for two different stations. The data is recorded as a point process and needs to be resampled to a time series for anomaly detection. Different downsampling rates might be appropriate for different stations, and it is not obvious how to determine them automatically.
  • Figure 4: Three synthetic time series, together with their autocorrelation function. Detection results and ground truth are labeled in the ACF.
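
The resampling step mentioned in the Figure 3 caption can be illustrated with a minimal sketch. It is not the paper's procedure: the function name `bin_events`, the toy timestamps, and the two bin widths are all assumptions made for illustration. It only shows how the same point-process data yields quite different time series depending on the chosen rate.

```python
from collections import Counter


def bin_events(timestamps, bin_width):
    """Count events per fixed-width bin, starting at the earliest event.

    Hypothetical helper: returns one count per bin, including empty bins,
    up to the bin that contains the last event.
    """
    if not timestamps:
        return []
    start = min(timestamps)
    # Map each event to the index of the bin it falls into.
    counts = Counter(int((t - start) // bin_width) for t in timestamps)
    n_bins = max(counts) + 1
    return [counts.get(i, 0) for i in range(n_bins)]


# Made-up rental timestamps, in minutes since the start of observation.
events = [0, 3, 4, 15, 16, 17, 18, 42, 59]

fine = bin_events(events, 10)    # 10-minute bins
coarse = bin_events(events, 30)  # 30-minute bins

print(fine)    # [3, 4, 0, 0, 1, 1]
print(coarse)  # [7, 2]
```

With 10-minute bins the quiet period shows up as two empty bins, while with 30-minute bins it disappears into the aggregate. A busy station might tolerate the fine rate, whereas a sparse station would produce mostly zeros at that rate, which is the kind of per-series choice the caption describes as hard to automate.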