
Back to Bayesics: Uncovering Human Mobility Distributions and Anomalies with an Integrated Statistical and Neural Framework

Minxuan Duan, Yinlong Qian, Lingyi Zhao, Zihao Zhou, Zeeshan Rasheed, Rose Yu, Khurram Shafique

TL;DR

DeepBayesic is a framework that integrates Bayesian principles with deep neural networks to model the underlying multivariate distributions of sparse, complex datasets, giving a more comprehensive understanding of mobility patterns.

Abstract

Existing methods for anomaly detection often fall short due to their inability to handle the complexity, heterogeneity, and high dimensionality inherent in real-world mobility data. In this paper, we propose DeepBayesic, a novel framework that integrates Bayesian principles with deep neural networks to model the underlying multivariate distributions from sparse and complex datasets. Unlike traditional models, DeepBayesic is designed to manage heterogeneous inputs, accommodating both continuous and categorical data to provide a more comprehensive understanding of mobility patterns. The framework features customized neural density estimators and hybrid architectures, allowing for flexibility in modeling diverse feature distributions and enabling the use of specialized neural networks tailored to different data types. Our approach also leverages agent embeddings for personalized anomaly detection, enhancing its ability to distinguish between normal and anomalous behaviors for individual agents. We evaluate our approach on several mobility datasets, demonstrating significant improvements over state-of-the-art anomaly detection methods. Our results indicate that incorporating personalization and advanced sequence modeling techniques can substantially enhance the ability to detect subtle and complex anomalies in spatiotemporal event sequences.

Paper Structure (29 sections, 15 equations, 4 figures, 3 tables)


Figures (4)

  • Figure 1: Agent embedding auto-encoder. A Transformer Encoder is trained to project an input feature sequence $\boldsymbol{X}$, encoded by a Sequence Encoder and prefixed by a learnable token $E_{0}$, into its latent representation $h$. Simultaneously, a Transformer Decoder is trained to reconstruct the encoded sequence from the latent representation $h$ and the standard positional encoding $\boldsymbol{PE}$.
  • Figure 2: DeepBayesic pipeline. The agent embedding $h$ is fed into three modules: Arrival Time Estimation, POI Type Estimation, and Stay Duration Estimation. Each module uses $h$ along with other relevant inputs to estimate the corresponding conditional probability distribution, and the three estimates are then integrated into a joint probability model. Finally, the agent embedding $h$ and an observation $x$ are input into the joint probability model to calculate the anomaly score $s$.
  • Figure 3: ROC curves for anomaly detection performance across different datasets and levels of granularity. The top row (a-c) shows the ROC curves for agent-level anomaly detection, while the bottom row (d-f) shows the ROC curves for staypoint-level anomaly detection. Our method, DeepBayesic, is represented in red.
  • Figure 4: Visualization of (a) the predicted arrival time distribution across multiple agents and (b) the predicted duration distribution conditioned on agent embedding, arrival time, and POI types (blue for school, orange for recreation) for a student agent and a non-student agent.
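
The pipeline in Figure 2 can be illustrated with a toy sketch. It assumes, as one plausible reading of the captions, that the joint model follows a chain-rule factorization such as $p(t, c, d \mid h) = p(t \mid h)\, p(c \mid h, t)\, p(d \mid h, t, c)$ over arrival time $t$, POI type $c$, and stay duration $d$, and that the anomaly score $s$ is the negative joint log-likelihood. Neither assumption is confirmed by the text above. The names `AGENT`, `gaussian_logpdf`, and `anomaly_score` are hypothetical, fixed Gaussian and categorical parameters stand in for the neural density estimators conditioned on $h$, and for brevity the toy drops the dependence of POI type and duration on arrival time.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log-density of a univariate Gaussian (stand-in for a neural density estimator)."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


# Hypothetical parameters for one agent. In the framework described above these
# would be produced by networks conditioned on the agent embedding h; here they
# are fixed constants chosen purely for illustration.
AGENT = {
    "arrival": (8.5, 0.5),                                   # mean, std in hours
    "poi_probs": {"school": 0.8, "recreation": 0.2},         # categorical over POI types
    "duration": {"school": (6.0, 1.0), "recreation": (2.0, 0.5)},  # per-POI mean, std
}


def anomaly_score(agent, arrival_hour, poi_type, duration_hours):
    """Negative joint log-likelihood under the assumed chain-rule factorization."""
    log_p_arrival = gaussian_logpdf(arrival_hour, *agent["arrival"])
    log_p_poi = math.log(agent["poi_probs"][poi_type])
    log_p_duration = gaussian_logpdf(duration_hours, *agent["duration"][poi_type])
    return -(log_p_arrival + log_p_poi + log_p_duration)


# A staypoint matching the agent's routine versus one that deviates from it.
typical = anomaly_score(AGENT, 8.5, "school", 6.0)
unusual = anomaly_score(AGENT, 3.0, "recreation", 6.0)
print(typical < unusual)
```

Summing log-probabilities keeps each conditional estimator separate, so each factor can in principle use a model suited to its data type (continuous for times and durations, categorical for POI types), which is in the spirit of the heterogeneous-input design the abstract describes.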