An Exploratory Study on Human-Centric Video Anomaly Detection through Variational Autoencoders and Trajectory Prediction

Ghazal Alinezhad Noghre; Armin Danesh Pazho; Hamed Tabkhi

TL;DR

TSGAD is a novel human-centric Two-Stream Graph-Improved Anomaly Detection method that combines Variational Autoencoders (VAEs) with trajectory prediction. It explores whether VAEs are a viable new approach to pose-based, human-centric VAD alongside the benefits of trajectory prediction.

Abstract

Video Anomaly Detection (VAD) is a challenging and prominent research task within computer vision. In recent years, Pose-based Video Anomaly Detection (PAD) has drawn considerable attention from the research community because it offers several inherent advantages over pixel-based approaches, despite occasionally suboptimal performance. Specifically, PAD is characterized by reduced computational complexity, intrinsic privacy preservation, and the mitigation of concerns related to discrimination and bias against specific demographic groups. This paper introduces TSGAD, a novel human-centric Two-Stream Graph-Improved Anomaly Detection method leveraging Variational Autoencoders (VAEs) and trajectory prediction. TSGAD aims to explore the possibility of utilizing VAEs as a new approach for pose-based human-centric VAD alongside the benefits of trajectory prediction. We demonstrate TSGAD's effectiveness through comprehensive experimentation on benchmark datasets. TSGAD achieves results comparable to state-of-the-art methods, showcasing the potential of adopting variational autoencoders and suggesting a promising direction for future research. The code base for this work is available at https://github.com/TeCSAR-UNCC/TSGAD.

Paper Structure (29 sections, 11 equations, 3 figures, 3 tables)


Introduction
Related Works
Pixel-based Approaches
Pose-based Approaches
Preliminaries
Variational Autoencoders
Trajectory Prediction
TSGAD
Problem Formulation
Architecture
GA-VAE
Trajectory Prediction for Anomaly Detection
Experimental Setup
Datasets
ShanghaiTech Campus (SHT)
...and 14 more sections

Figures (3)

Figure 1: TSGAD architecture. The upper branch uses a Graph Attentive Variational Autoencoder (GA-VAE) to learn the distribution of normal human behavior in an unsupervised manner. The lower branch leverages a SotA trajectory prediction method, namely Pishgu (Alinezhad Noghre et al., 2023), to learn how to predict normal trajectories. $P_t^i$ denotes the $i^{th}$ person at time $t$, and $D$, $\mu$, and $\sigma$ refer to the latent representation's dimension, mean, and variance, respectively. $z$ follows a normal distribution, $z \sim \mathcal{N}(0, I)$, where $I$ is the identity matrix.
Figure 2: Nine spatio-temporal graph convolution blocks are stacked to form the GA-VAE encoder. Each block consists of a spatial attention graph convolution followed by a temporal convolution, batch normalization, a residual connection, and a final activation function.
Figure 3: The inference phase. The deviation from the API in the latent space is used to calculate the pose score ($S_{Pose}$). The difference between the predicted trajectory and the actual trajectory, measured by MSE, forms the trajectory score ($S_{Traj}$). The weighted sum of these normalized scores forms the final anomaly score. $\mu_n$ and $\sigma_n$ refer to the mean and variance of the latent representation, and API refers to the Aggregated Parameter Index defined in Eq. \ref{eq:api}.
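
The score fusion described in the Figure 3 caption can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the min-max normalization, and the `pose_weight` parameter are all assumptions made for this example, since the caption only states that the two scores are normalized and combined by a weighted sum. The pose scores are taken as given inputs because the API definition is not reproduced in this summary.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def trajectory_mse(predicted: Sequence[Point], actual: Sequence[Point]) -> float:
    """Mean squared error between a predicted and an observed 2D trajectory."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("trajectories must be non-empty and of equal length")
    total = 0.0
    for (px, py), (ax, ay) in zip(predicted, actual):
        total += (px - ax) ** 2 + (py - ay) ** 2
    # Average over every coordinate (two per time step).
    return total / (2 * len(predicted))


def min_max_normalize(scores: Sequence[float]) -> List[float]:
    """Rescale scores to [0, 1]. Assumed normalization; the caption does not name one."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_scores(
    pose_scores: Sequence[float],
    traj_scores: Sequence[float],
    pose_weight: float = 0.5,  # hypothetical weight, not a value from the paper
) -> List[float]:
    """Weighted sum of normalized pose and trajectory scores."""
    if len(pose_scores) != len(traj_scores):
        raise ValueError("score sequences must have the same length")
    if not 0.0 <= pose_weight <= 1.0:
        raise ValueError("pose_weight must lie in [0, 1]")
    pose_n = min_max_normalize(pose_scores)
    traj_n = min_max_normalize(traj_scores)
    return [
        pose_weight * p + (1.0 - pose_weight) * t
        for p, t in zip(pose_n, traj_n)
    ]
```

In this sketch a higher fused value marks a more anomalous sample. In practice the trajectory scores passed to `fuse_scores` would come from calling `trajectory_mse` on each predicted and observed trajectory pair, and the weight would be tuned per dataset.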
