GDformer: Going Beyond Subsequence Isolation for Multivariate Time Series Anomaly Detection

Qingxiang Liu, Chenghao Liu, Sheng Sun, Di Yao, Yuxuan Liang

TL;DR

GDformer tackles unsupervised anomaly detection in multivariate time series by learning global normal representations through a compact global dictionary of Key/Value vectors and prototype-guided similarity, moving beyond horizon-limited subsequence analysis. It replaces standard self-attention with dictionary-based cross-attention to capture global correlations and uses prototypes to define a compact similarity-based decision boundary. Training combines a reconstruction objective with a similarity loss against prototypes, producing per-point anomaly scores via aggregated cross-attention similarity. The approach achieves state-of-the-art results on five real-world benchmarks and demonstrates transferability of the global dictionary across datasets, suggesting strong potential for foundation-model-style anomaly detection in time series.

Abstract

Unsupervised anomaly detection in multivariate time series is challenging because a compact detection criterion must be derived without access to any anomalous points. Existing methods are mainly based on reconstruction error or association divergence, both of which are confined to isolated subsequences with limited horizons and can hardly provide a unified series-level criterion. In this paper, we propose the Global Dictionary-enhanced Transformer (GDformer), which uses a renovated dictionary-based cross-attention mechanism to cultivate global representations shared by all normal points in the entire series. Accordingly, the cross-attention maps reflect the correlation weights between each point and the global representations, which naturally leads to a representation-wise, similarity-based detection criterion. To foster a more compact detection boundary, prototypes are introduced to capture the distribution of normal point-global correlation weights. GDformer consistently achieves state-of-the-art unsupervised anomaly detection performance on five real-world benchmark datasets. Further experiments validate that the global dictionary transfers well across datasets. The code is available at https://github.com/yuppielqx/GDformer.
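
The mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`dictionary_cross_attention`, `anomaly_scores`), the toy dimensions, the single-head layout, and in particular the scoring rule (one minus the best cosine similarity to any prototype) are assumptions made for illustration only. The sketch shows each time point acting as a query against a small dictionary of Key/Value vectors shared by the whole series, and the resulting attention weights being compared with prototypes of normal behaviour.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dictionary_cross_attention(queries, dict_keys, dict_values):
    """Attend every time point to a shared dictionary instead of to other points.

    queries:     L x d per-point embeddings
    dict_keys:   N x d dictionary Key vectors (learned in a real model)
    dict_values: N x d dictionary Value vectors (learned in a real model)
    Returns (outputs, attn): L x d outputs and L x N attention weights.
    """
    d = len(dict_keys[0])
    scale = 1.0 / math.sqrt(d)
    outputs, attn = [], []
    for q in queries:
        logits = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in dict_keys]
        w = softmax(logits)
        attn.append(w)
        outputs.append(
            [sum(w[n] * dict_values[n][j] for n in range(len(w))) for j in range(d)]
        )
    return outputs, attn


def cosine(a, b):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb + 1e-12)


def anomaly_scores(attn, prototypes):
    """Hypothetical scoring rule: low similarity to every prototype = anomalous."""
    return [1.0 - max(cosine(w, p) for p in prototypes) for w in attn]


# Toy data: two dictionary entries, three time points in a 2-d embedding space.
dict_keys = [[1.0, 0.0], [0.0, 1.0]]
dict_values = [[1.0, 0.0], [0.0, 1.0]]
queries = [[4.0, 0.0], [3.5, 0.2], [0.0, 4.0]]

outputs, attn = dictionary_cross_attention(queries, dict_keys, dict_values)

# Assumption: the first point is "normal", so its weights serve as the prototype.
prototypes = [attn[0]]
scores = anomaly_scores(attn, prototypes)
```

In this toy setting the third point attends mostly to the other dictionary entry, so its weights differ from the prototype and it receives the largest score. Because the dictionary is shared by all points, the score does not depend on which subsequence a point happens to fall in, which is the property the abstract contrasts with subsequence-level criteria.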

Paper Structure

This paper contains 25 sections, 12 equations, 7 figures, 9 tables, 1 algorithm.

Figures (7)

  • Figure 1: How to derive the detection criterion. Left: AnomalyTrans and DCdetector learn intra-subsequence point-wise associations and derive the detection criterion by combining subsequence-level anomaly scores. Right: Our proposal cultivates global normal representations, manifested as a dictionary and prototypes, for evaluating similarity discrepancy to provide a series-level criterion.
  • Figure 2: Anomaly scores vs. detection criterion for different subsequences in AnomalyTrans.
  • Figure 3: The framework of $\mathtt{GDformer}$ (left). In Dictionary-Based Cross Attention (right), the global dictionary of Key and Value vectors (in cross-attention module) learns global representations shared by all normal points in the entire series. The prototypes (in similarity evaluation module) capture the normal distribution of cross-attention weights.
  • Figure 4: Model efficiency comparison. Larger bubble size indicates higher memory requirements.
  • Figure 5: Detection results visualization of AnomalyTrans, DCdetector, and $\mathtt{GDformer}$ on the PSM dataset. Point and segment anomalies are marked with red circles and red segments. We plot the detection scores and the corresponding detection criterion (red dashed lines) for each method. FP (false positive), FN (false negative), and TP (true positive) are highlighted in blue, yellow, and red, respectively.
  • ...and 2 more figures