Uncertainty-aware Human Mobility Modeling and Anomaly Detection

Haomin Wen, Shurui Cao, Zeeshan Rasheed, Khurram Hassan Shafique, Leman Akoglu

TL;DR

The paper tackles unsupervised anomaly detection in complex human mobility by framing GPS data as irregular stay-point events connected by trips, and by modeling both aleatoric and epistemic uncertainty within a Dual Transformer architecture. USTAD tokenizes events into feature tokens, applies feature- and event-level attention, and uses an uncertainty-aware decoder to enable robust training; anomaly scoring combines AU-attenuated prediction loss with a kNN-based out-of-distribution measure while excluding epistemic uncertainty from scoring. Empirically, USTAD achieves 3-15% AUROC gains over strong baselines on industry-scale datasets and provides insights into the distinct roles of uncertainty types in prediction and anomaly detection. The work demonstrates robust, unlabeled anomaly detection in heterogeneous, real-world mobility data and offers broad applicability to other domains involving uncertain user behavior sequences.
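
The scoring idea in the summary above can be illustrated with a minimal sketch. It assumes a Gaussian-style attenuation of the prediction error by a predicted log-variance, a Euclidean kNN distance in some embedding space, and a simple additive combination; the function names, the `weight` parameter, and the additive form are all hypothetical and are not taken from the paper.

```python
import numpy as np

def au_attenuated_error(y_true, y_pred, log_var):
    """Squared error scaled down by the predicted aleatoric variance,
    plus a log-variance penalty (Gaussian negative log-likelihood form).
    Assumption: the paper's exact attenuation may differ."""
    return 0.5 * np.exp(-log_var) * (y_true - y_pred) ** 2 + 0.5 * log_var

def knn_ood_score(train_emb, query_emb, k=5):
    """Mean Euclidean distance from each query embedding to its k nearest
    training embeddings; larger means further from the training data."""
    dists = np.linalg.norm(query_emb[:, None, :] - train_emb[None, :, :], axis=-1)
    nearest = np.sort(dists, axis=1)[:, :k]
    return nearest.mean(axis=1)

def anomaly_score(loss, ood, weight=1.0):
    """Hypothetical additive combination of the two signals.
    Epistemic uncertainty is deliberately not an input here."""
    return loss + weight * ood

# Toy data: 200 training embeddings around the origin, two query events.
rng = np.random.default_rng(0)
train_emb = rng.normal(size=(200, 8))
query_emb = np.vstack([np.zeros(8), np.full(8, 10.0)])  # typical vs. far away

ood = knn_ood_score(train_emb, query_emb, k=5)
loss = au_attenuated_error(np.array([1.0, 1.0]), np.array([1.0, 1.0]), np.array([0.0, 0.0]))
scores = anomaly_score(loss, ood)
```

In this toy setup both events are predicted perfectly, so the second event receives the higher score purely because its embedding lies far from the training data.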

Abstract

Given the temporal GPS coordinates from a large set of human agents, how can we model their mobility behavior toward effective anomaly (e.g. bad-actor or malicious behavior) detection without any labeled data? Human mobility and trajectory modeling have been extensively studied, showcasing varying abilities to manage complex inputs and balance performance-efficiency trade-offs. In this work, we formulate anomaly detection in complex human behavior by modeling raw GPS data as a sequence of stay-point events, each characterized by spatio-temporal features, along with trips (i.e. commute) between the stay-points. Our problem formulation allows us to leverage modern sequence models for unsupervised training and anomaly detection. Notably, we equip our proposed model USTAD (for Uncertainty-aware Spatio-Temporal Anomaly Detection) with aleatoric (i.e. data) uncertainty estimation to account for inherent stochasticity in certain individuals' behavior, as well as epistemic (i.e. model) uncertainty to handle data sparsity under a large variety of human behaviors. Together, aleatoric and epistemic uncertainties unlock a robust loss function as well as uncertainty-aware decision-making in anomaly scoring. Extensive experiments show that USTAD improves anomaly detection AUROC by 3%-15% over baselines in industry-scale data.
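
To see why an aleatoric term can make training robust to inherently stochastic individuals, the following sketch uses the standard heteroscedastic Gaussian negative log-likelihood as a stand-in; the abstract does not state the exact loss, so this form, the helper names, and the synthetic residuals are assumptions for illustration only.

```python
import numpy as np

def gaussian_nll(residuals, log_var):
    """Mean heteroscedastic Gaussian NLL (up to a constant) for one
    shared log-variance. Assumed form, not the paper's exact loss."""
    return np.mean(0.5 * np.exp(-log_var) * residuals ** 2 + 0.5 * log_var)

def best_log_var(residuals, grid=np.linspace(-6.0, 6.0, 1201)):
    """Grid-search the log-variance that minimizes the loss."""
    losses = [gaussian_nll(residuals, s) for s in grid]
    return grid[int(np.argmin(losses))]

# Synthetic prediction residuals for two hypothetical individuals.
rng = np.random.default_rng(0)
regular = rng.normal(scale=0.1, size=5000)  # highly predictable routine
erratic = rng.normal(scale=2.0, size=5000)  # inherently noisy behavior

s_regular = best_log_var(regular)
s_erratic = best_log_var(erratic)
```

The loss is minimized when the predicted variance matches the mean squared residual, so the noisy individual is assigned a larger variance and its large errors are down-weighted rather than dominating training.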

Paper Structure (42 sections, 18 equations, 16 figures, 7 tables)


Figures (16)

  • Figure 1: Spatiotemporal event sequence with various features (a.k.a. markers) per event.
  • Figure 2: Proposed model architecture for uncertainty-aware human mobility behavior modeling. Raw GPS data is represented as an (ordered) sequence of stay-point events, each with an (unordered) sequence of spatiotemporal markers. The Dual Transformer models each individual's sequence-of-sequences via both feature- and event-level attention, together with uncertainty estimation that simultaneously enables robust training and informs anomaly scoring at inference.
  • Figure 3: Loss (AU-attenuated PE) vs. EU for features x (left) and POI (right). Color depicts the log of event counts, with most points near the origin. Two regions of interest are: $i$) predictable anomalies with low EU and high loss; $ii$) OOD events with high EU.
  • Figure 4: The relationship between MAE (accuracy) and uncertainty follows an increasing (decreasing) trend. The uncertainty estimates are well aligned with prediction performance.
  • Figure 5: Relationship between EU and accuracy of each POI type (only a few are annotated for better illustration). USTAD accurately captures uncertainty: POI prediction performance tends to increase when EU is lower. The size of each point/POI is proportional to its frequency in the dataset.
  • ...and 11 more figures