
VISTA: Unsupervised 2D Temporal Dependency Representations for Time Series Anomaly Detection

Sinchee Chin, Fan Zhang, Xiaochen Yang, Jing-Hao Xue, Wenming Yang, Peng Jia, Guijin Wang, Luo Yingqun

TL;DR

VISTA introduces a training-free approach to multivariate time series anomaly detection by integrating STL-based decomposition, a 2D Temporal Correlation Matrix through Temporal Self-Attention, and memory-efficient multivariate aggregation with a greedy coreset memory bank. The method preserves temporal structure across trend, seasonal, and residual components, enabling a visualizable representation that can be processed by pretrained CNNs for robust feature extraction. Empirical results across five public TSAD datasets show state-of-the-art performance in both F1 and ROC-AUC, with ablations confirming the effectiveness of the 2D representation, full decomposition, and Layer3/Layer4 features. The work offers practical deployment benefits due to its training-free nature and provides a bridge between TSAD and image-based anomaly detection through interpretable 2D representations.

Abstract

Time Series Anomaly Detection (TSAD) is essential for uncovering rare and potentially harmful events in unlabeled time series data. Existing methods are highly dependent on clean, high-quality inputs, making them susceptible to noise and real-world imperfections. Additionally, intricate temporal relationships in time series data are often inadequately captured in traditional 1D representations, leading to suboptimal modeling of dependencies. We introduce VISTA, a training-free, unsupervised TSAD algorithm designed to overcome these challenges. VISTA features three core modules: 1) Time Series Decomposition using Seasonal and Trend Decomposition via Loess (STL) to decompose noisy time series into trend, seasonal, and residual components; 2) Temporal Self-Attention, which transforms 1D time series into 2D temporal correlation matrices for richer dependency modeling and anomaly detection; and 3) Multivariate Temporal Aggregation, which uses a pretrained feature extractor to integrate cross-variable information into a unified, memory-efficient representation. VISTA's training-free approach enables rapid deployment and easy hyperparameter tuning, making it suitable for industrial applications. It achieves state-of-the-art performance on five multivariate TSAD benchmarks.
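As a rough illustration of the second module, the sketch below turns one window of a univariate series into a square matrix of pairwise time-step affinities. The abstract does not give the formula, so everything here is an assumption: the window is standardised, each time step serves as both query and key, and a row-wise softmax is applied to the pairwise products. The function names `sliding_windows` and `temporal_correlation_matrix` are hypothetical, and the STL step is omitted.

```python
import math


def sliding_windows(series, window_size, stride):
    """Split a 1D sequence into fixed-size windows; a trailing remainder is dropped."""
    return [series[i:i + window_size]
            for i in range(0, len(series) - window_size + 1, stride)]


def temporal_correlation_matrix(window):
    """Return a w x w matrix of softmax-normalised products between time steps.

    Assumption: this self-attention-style form is illustrative only and is
    not taken from the paper.
    """
    n = len(window)
    mean = sum(window) / n
    # Fall back to 1.0 so a constant window does not divide by zero.
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / n) or 1.0
    z = [(v - mean) / std for v in window]

    matrix = []
    for zi in z:
        scores = [zi * zj for zj in z]
        peak = max(scores)  # subtract the row maximum for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        matrix.append([e / total for e in exps])
    return matrix


# Usage: one matrix per window; each could then be rendered as an image.
windows = sliding_windows([float(t % 5) for t in range(20)], window_size=8, stride=4)
matrices = [temporal_correlation_matrix(w) for w in windows]
```

Each row of the resulting matrix sums to one, so it can be treated as a single-channel image and passed to an image feature extractor, which is the role the abstract assigns to the 2D representation.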

Paper Structure

This paper contains 25 sections, 10 equations, 4 figures, 5 tables, 1 algorithm.

Figures (4)

  • Figure 1: Overall architecture of VISTA. The proposed architecture processes long time series data in several stages. Initially, the data are partitioned into fixed-size windows of length $w_s$. Each window is then decomposed using STL into trend, seasonal, and residual components, each capturing a distinct aspect of the time series. The Temporal Self-Attention Module computes a temporal correlation matrix for each time series variable, enhancing the interpretability of temporal dependencies for each decomposed component. The temporal correlation matrices are then processed by a pretrained CNN to extract intermediate features. The Multivariate Temporal Aggregation Module aggregates features across the time series variables, yielding a unified feature representation. A greedy sampling algorithm is then applied to select the most representative features, which are stored in a memory bank for efficient retrieval. During inference, query features are compared against the memory bank constructed during training via $L_2$-distance. To further enhance detection accuracy, the anomaly scores are rescaled using the method proposed in roth2022towards.
  • Figure 2: Qualitative comparison of anomaly detection results for TimesNet, MultiRC, and VISTA. The first row shows the raw time series data with anomalies highlighted in red. The second row depicts anomaly scores predicted by TimesNet, with the dotted horizontal line representing the threshold calculated by Optimal F1 and predicted anomalies highlighted in red. The third row shows anomaly scores from DCDetector. The final row presents VISTA’s patch-wise predictions, which result in square-like anomaly regions compared to previous methods. VISTA demonstrates the closest alignment with the ground truth.
  • Figure 3: Comparison of different 2D temporal representations across datasets. VISTA consistently outperforms the other representations in F1 score.
  • Figure 4: Effect of the seasonal ratio used in STL decomposition on ROC-AUC. VISTA shows minimal sensitivity to the seasonal decomposition period across all five datasets.
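The memory-bank stage described in the Figure 1 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the greedy selection is written as farthest-point (k-center) sampling, which is one common way to build a coreset, and the anomaly score is the $L_2$-distance to the nearest stored feature. The names `greedy_coreset` and `anomaly_score` are hypothetical, the features are toy 2D vectors rather than CNN activations, and the score rescaling step is left out.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def greedy_coreset(features, k):
    """Pick up to k feature indices by farthest-point sampling (requires k >= 1).

    Assumption: starts from index 0 and repeatedly adds the feature that is
    farthest from everything selected so far.
    """
    selected = [0]
    min_dist = [l2(f, features[0]) for f in features]
    while len(selected) < min(k, len(features)):
        idx = max(range(len(features)), key=min_dist.__getitem__)
        selected.append(idx)
        # Keep, for every feature, its distance to the closest selected one.
        min_dist = [min(d, l2(f, features[idx]))
                    for d, f in zip(min_dist, features)]
    return selected


def anomaly_score(query, memory_bank):
    """Distance from a query feature to its nearest neighbour in the bank."""
    return min(l2(query, m) for m in memory_bank)


# Usage with toy "normal" features forming three loose clusters.
train = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.1], [10.0, 0.0]]
bank = [train[i] for i in greedy_coreset(train, k=3)]
normal_score = anomaly_score([0.0, 0.05], bank)
outlier_score = anomaly_score([20.0, 20.0], bank)
```

Keeping only a coreset bounds the cost of the nearest-neighbour search at inference time, which is the memory-efficiency argument made for the memory bank.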