Trajectory Anomaly Detection with Language Models

Jonathan Mbuya, Dieter Pfoser, Antonios Anastasopoulos

TL;DR

LM-TAD is an autoregressive causal-attention model that treats trajectories as token sequences for anomaly detection. It supports various trajectory representations, including GPS coordinates, staypoints, and activity types, and it reduces computational latency in online detection by caching the key-value states of the attention mechanism, which avoids repeated computations.

Abstract

This paper presents a novel approach for trajectory anomaly detection using an autoregressive causal-attention model, termed LM-TAD. This method leverages the similarities between language statements and trajectories, both of which consist of ordered elements requiring coherence through external rules and contextual variations. By treating trajectories as sequences of tokens, our model learns the probability distributions over trajectories, enabling the identification of anomalous locations with high precision. We incorporate user-specific tokens to account for individual behavior patterns, enhancing anomaly detection tailored to user context. Our experiments demonstrate the effectiveness of LM-TAD on both synthetic and real-world datasets. In particular, the model outperforms existing methods on the Pattern of Life (PoL) dataset by detecting user-contextual anomalies and achieves competitive results on the Porto taxi dataset, highlighting its adaptability and robustness. Additionally, we introduce the use of perplexity and surprisal rate metrics for detecting outliers and pinpointing specific anomalous locations within trajectories. The LM-TAD framework supports various trajectory representations, including GPS coordinates, staypoints, and activity types, proving its versatility in handling diverse trajectory data. Moreover, our approach is well-suited for online trajectory anomaly detection, significantly reducing computational latency by caching key-value states of the attention mechanism, thereby avoiding repeated computations.
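
The abstract names perplexity and surprisal as the scores used to flag outlier trajectories and to pinpoint anomalous locations. The following is a minimal sketch of how such scores can be computed from per-token probabilities, not the authors' implementation. It assumes a trained model has already produced the probability of each observed token given its prefix; the probabilities, the threshold, and the function names are illustrative, and the paper's exact definition of the surprisal rate is not given in this excerpt, so plain per-token surprisal is used instead.

```python
import math

def surprisal(prob):
    """Surprisal (negative log-probability, in nats) of one observed token."""
    return -math.log(prob)

def perplexity(token_probs):
    """Perplexity of a sequence: exp of the mean per-token surprisal."""
    total = sum(surprisal(p) for p in token_probs)
    return math.exp(total / len(token_probs))

def flag_locations(token_probs, threshold):
    """Indices of tokens whose surprisal exceeds a (hypothetical) threshold."""
    return [i for i, p in enumerate(token_probs) if surprisal(p) > threshold]

# Made-up probabilities a model might assign to each visited location.
normal = [0.5, 0.5, 0.5, 0.5]
odd = [0.5, 0.5, 0.001, 0.5]   # third location is very unlikely

print(perplexity(normal), perplexity(odd))
print(flag_locations(odd, threshold=3.0))
```

A trajectory-level decision would compare the perplexity against a threshold chosen on validation data, while the per-token surprisal indicates which location inside the trajectory is responsible for the high score.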

Paper Structure (23 sections, 8 equations, 7 figures, 3 tables)

Figures (7)

  • Figure 1: A conceptual visualization of trajectories as natural language statements. Language statements and trajectories share similarities: both consist of ordered elements from a finite set (words vs. GPS points) and require connections by semantic or spatiotemporal relationships to be coherent. They are governed by external rules (grammar for the language, road networks for trajectories) and vary by user or context (writing style vs. movement behavior).
  • Figure 2: Architecture of LM-TAD, our trajectory model.
  • Figure 3: Example location configurations. Locations can be (a) discretized GPS coordinates, (b) staypoints, (c) staypoints enhanced with dwell time, or (d) activities.
  • Figure 4: Example of generated anomalies with $\alpha=0.3$ and $\beta = 3$ for both types of anomalies on the Porto dataset.
  • Figure 5: Anomaly results for all methods trained on all trajectories of the Pattern of Life (PoL) dataset. Each dot represents the perplexity of one trajectory belonging to one of the ten agents that have both normal and anomalous trajectories. Unlike the other methods, our method (d) separates anomalous from normal trajectories by assigning high perplexity to most anomalous trajectories.
  • ...and 2 more figures
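
The caption of Figure 3 lists discretized GPS coordinates as one possible location configuration. A minimal sketch of one common way to discretize coordinates into tokens is shown below, using a uniform grid. The grid origin, cell size, number of columns, and function names are assumptions chosen for illustration; this excerpt does not specify how the paper performs the discretization.

```python
import math

def gps_to_token(lat, lon, lat_min, lon_min, cell_deg, n_cols):
    """Map a GPS point to the integer id of the grid cell containing it.

    All grid parameters are hypothetical; cells are numbered row by row.
    """
    row = math.floor((lat - lat_min) / cell_deg)
    col = math.floor((lon - lon_min) / cell_deg)
    return row * n_cols + col

def trajectory_to_tokens(points, **grid):
    """Convert a list of (lat, lon) points into a sequence of cell tokens."""
    return [gps_to_token(lat, lon, **grid) for lat, lon in points]

# Illustrative grid: 0.01-degree cells, 20 columns, arbitrary origin.
grid = dict(lat_min=41.10, lon_min=-8.70, cell_deg=0.01, n_cols=20)
points = [(41.155, -8.625), (41.156, -8.624), (41.165, -8.625)]
print(trajectory_to_tokens(points, **grid))
```

The resulting integer sequence plays the role of a sentence of words: nearby points fall into the same cell and share a token, and the vocabulary size equals the number of grid cells.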