Matrix Profile for Time-Series Anomaly Detection: A Reproducible Open-Source Benchmark on TSB-AD

Chin-Chia Michael Yeh

Abstract

Matrix Profile (MP) methods form an interpretable and scalable family of distance-based approaches to time-series anomaly detection, but strong benchmark performance still depends on design choices beyond a vanilla nearest-neighbor profile. This technical report documents an open-source Matrix Profile for Anomaly Detection (MMPAD) submission to TSB-AD, a benchmark that covers both univariate and multivariate time series. The submitted system combines pre-sorted multidimensional aggregation, efficient exclusion-zone-aware k-nearest-neighbor (kNN) retrieval for repeated anomalies, and moving-average post-processing. To serve as a reproducible reference for MP-based anomaly detection on TSB-AD, we detail the released implementation, the hyperparameter settings for the univariate and multivariate tracks, and the corresponding benchmark results. We further analyze how the system performs on the aggregate leaderboard and across specific dataset characteristics. The open-source implementation is available at https://github.com/mcyeh/mmpad_tsb.

Paper Structure

This paper contains 14 sections, 1 equation, 3 figures, 3 tables, and 2 algorithms.

Figures (3)

  • Figure 1: Toy illustration of the K-of-N anomaly problem. The time series contains an anomaly in only one of the eight dimensions. The naive all-dimension matrix profile reaches its maximum away from the anomalous interval, whereas the pre-sorting matrix profile reaches its maximum near that interval. The orange X marks the maximum value of each profile.
  • Figure 2: The pre-sorting strategy sorts dimension-wise pairwise distances before nearest-neighbor selection in the multidimensional matrix profile. This ordering lets the sorted dimension-wise distances influence neighbor selection itself, which helps preserve anomalies that appear in only a subset of dimensions.
  • Figure 3: The post-sorting strategy sorts dimension-wise matrix-profile values after nearest-neighbor selection. As a result, dimensional ordering is applied only after the neighbors have already been chosen.
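
The difference between the two strategies in Figures 2 and 3 can be illustrated with a small brute-force sketch. It is not the released MMPAD implementation: the function names, the brute-force distance computation, and the choice to aggregate by averaging the k largest dimension-wise values are all assumptions made for illustration. The only point it is meant to show is where the sort sits relative to the nearest-neighbor minimum.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def dimension_wise_distances(ts, m):
    """Brute-force z-normalized distances per dimension.

    ts has shape (n, d); the result has shape (d, n_sub, n_sub),
    where n_sub = n - m + 1. Hypothetical helper, O(d * n_sub^2 * m).
    """
    windows = sliding_window_view(ts, m, axis=0)       # (n_sub, d, m)
    mu = windows.mean(axis=-1, keepdims=True)
    sigma = windows.std(axis=-1, keepdims=True)
    z = (windows - mu) / np.where(sigma > 1e-12, sigma, 1.0)
    z = np.moveaxis(z, 1, 0)                           # (d, n_sub, m)
    diff = z[:, :, None, :] - z[:, None, :, :]         # (d, n_sub, n_sub, m)
    return np.sqrt((diff ** 2).sum(axis=-1))


def mask_trivial_matches(dist, excl):
    """Set pairs within the exclusion zone |i - j| <= excl to infinity."""
    idx = np.arange(dist.shape[1])
    trivial = np.abs(idx[:, None] - idx[None, :]) <= excl
    out = dist.copy()
    out[:, trivial] = np.inf
    return out


def pre_sort_profile(dist, k):
    """Sort across dimensions for every pair, THEN pick the neighbor.

    Assumption: each pair is scored by the mean of its k largest
    dimension-wise distances, and the neighbor minimizes that score.
    """
    ordered = -np.sort(-dist, axis=0)          # descending over dimensions
    pair_score = ordered[:k].mean(axis=0)      # (n_sub, n_sub)
    return pair_score.min(axis=1)


def post_sort_profile(dist, k):
    """Pick a neighbor per dimension first, THEN sort across dimensions."""
    per_dim_profile = dist.min(axis=2)         # (d, n_sub)
    ordered = -np.sort(-per_dim_profile, axis=0)
    return ordered[:k].mean(axis=0)


rng = np.random.default_rng(0)
ts = rng.standard_normal((120, 4))             # synthetic data, illustration only
dist = mask_trivial_matches(dimension_wise_distances(ts, m=8), excl=4)
pre = pre_sort_profile(dist, k=2)
post = post_sort_profile(dist, k=2)
```

Under these assumptions the pre-sorting profile is never smaller than the post-sorting one, because post-sorting lets each dimension choose its own neighbor independently, while pre-sorting requires a single neighbor to be judged on the sorted dimension-wise distances jointly. With k equal to the number of dimensions, the pre-sorting sketch reduces to the naive all-dimension profile.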