GADformer: A Transparent Transformer Model for Group Anomaly Detection on Trajectories
Andreas Lohrer, Darpan Malik, Claudius Zelenka, Peer Kröger
TL;DR
GADformer is a transparent, BERT-based transformer model for Group Anomaly Detection (GAD) on trajectories. It supports unsupervised and semi-supervised learning with attention-driven representations of trajectory segments. The paper introduces the Block-Attention-anomaly-Score (BAS), which scores attention patterns to show how they separate normal from abnormal groups and thereby makes the model more transparent. On synthetic data and three real-world trajectory datasets, GADformer achieves competitive or superior AUROC, and often higher AUPRC, compared with GRU-based and MTGAD baselines, while remaining robust to noise and novelties. The work demonstrates the feasibility of using a transformer encoder for GAD on coordinate-based trajectories and opens avenues for cross-domain extensions and improved probability calibration.
Abstract
Group Anomaly Detection (GAD) identifies unusual patterns in groups whose individual members might not be anomalous. This task is of major importance across multiple disciplines, in which sequences such as trajectories can also be considered groups. As groups become more diverse in heterogeneity and size, detecting group anomalies becomes challenging, especially without supervision. Although Recurrent Neural Networks are well-established deep sequence models, their performance can decrease with increasing sequence length. Hence, this paper introduces GADformer, a BERT-based model for attention-driven GAD on trajectories in unsupervised and semi-supervised settings. We demonstrate how group anomalies can be detected by attention-based GAD. We also introduce the Block-Attention-anomaly-Score (BAS) to enhance model transparency by scoring attention patterns. In addition, synthetic trajectory generation enables various ablation studies. In extensive experiments, we compare our approach with related work in terms of robustness to trajectory noise and novelties on synthetic data and three real-world datasets.
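To make the idea of treating a trajectory as a group more concrete, the following minimal sketch generates a synthetic 2D trajectory with optional Gaussian noise and splits it into fixed-length segments that act as group members. It is an illustration only and not the paper's actual generator or preprocessing: the function names `make_trajectory` and `to_group`, the straight-line path, the noise level, and the segment length are all assumptions made here.

```python
import numpy as np

def make_trajectory(n_points=60, noise_std=0.0, seed=0):
    """Hypothetical generator: a straight-line 2D path with optional Gaussian noise."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_points)
    path = np.stack([t, 0.5 * t], axis=1)  # (x, y) coordinates, shape (n_points, 2)
    return path + rng.normal(0.0, noise_std, size=path.shape)

def to_group(trajectory, segment_len=10):
    """Split a (T, 2) trajectory into non-overlapping segments, dropping any remainder."""
    n_segments = len(trajectory) // segment_len
    trimmed = trajectory[: n_segments * segment_len]
    return trimmed.reshape(n_segments, segment_len, trajectory.shape[1])

clean = make_trajectory(noise_std=0.0)   # noise-free reference path
noisy = make_trajectory(noise_std=0.05)  # assumed noise level, for illustration
group = to_group(noisy, segment_len=10)  # group of 6 segments, 10 points each
print(group.shape)  # (6, 10, 2)
```

Under these assumptions, each row of `group` is one group member (a trajectory segment), which is the kind of input a sequence encoder could attend over. Varying `noise_std` gives a simple way to probe robustness to trajectory noise.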
