Video Anomaly Detection with Contours -- A Study
Mia Siemon, Ivan Nikolov, Thomas B. Moeslund, Kamal Nasrollahi
TL;DR
This study investigates pose-based video anomaly detection (VAD) that learns normal motion patterns from 2D human contours rather than skeletons, which leaves room to cover broader object categories while preserving privacy and keeping computation low through shallow networks. It introduces two contour representations (a radii-based feature descriptor and shape contexts) and evaluates both regression and classification pipelines, including VAE/LAE/TAE/R-RNN models and shape clustering/classification with novelty detection. Evaluations on six datasets show that linear auto-encoders (LAE/TAE) often outperform variational baselines, with TAE delivering strong VAD results and PAD results surpassing prior art in several settings. The findings suggest that contour-based approaches are a promising, privacy-friendly direction for VAD, with potential to extend to multi-class contour-based analysis of other object categories.
Abstract
In Pose-based Video Anomaly Detection, prior art is rooted in the assumption that abnormal events can mostly be regarded as a result of uncommon human behavior. Instead of relying on skeleton representations of humans, however, we investigate the potential of learning recurrent motion patterns of normal human behavior from 2D contours. While keeping all advantages of pose-based methods, such as increased object anonymization, the shift from human skeletons to contours is hypothesized to open the way for future research to cover more object categories. We propose formulating the problem as both a regression and a classification task, and additionally explore two distinct data representation techniques for contours. To further reduce the computational complexity of Pose-based Video Anomaly Detection solutions, all methods in this study are based on shallow Neural Networks from the field of Deep Learning. They are evaluated on the three most prominent benchmark datasets within Video Anomaly Detection and their human-related counterparts, totaling six datasets. Our results indicate that this novel perspective on Pose-based Video Anomaly Detection marks a promising direction for future research.
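To make the idea of a radii-based contour representation more concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes a contour is given as a list of 2D points, splits the plane around the contour centroid into equal angular bins, and records the largest centroid-to-contour distance per bin, normalized by the overall maximum. The function name `radii_descriptor`, the `num_bins` parameter, and the normalization step are all hypothetical choices made for illustration.

```python
import math


def radii_descriptor(contour, num_bins=8):
    """Illustrative radii-based descriptor (hypothetical, not the paper's exact method).

    contour: list of (x, y) points on an object's 2D contour.
    Returns num_bins values in [0, 1]: the largest centroid-to-contour
    distance in each angular bin, divided by the overall largest distance.
    """
    if not contour:
        raise ValueError("contour must contain at least one point")

    # Centroid of the contour points (assumed reference point).
    cx = sum(x for x, _ in contour) / len(contour)
    cy = sum(y for _, y in contour) / len(contour)

    radii = [0.0] * num_bins
    for x, y in contour:
        # Angle of the point around the centroid, mapped to [0, 2*pi).
        angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
        # Clamp guards against the angle rounding up to exactly 2*pi.
        bin_index = min(int(angle / (2 * math.pi) * num_bins), num_bins - 1)
        radii[bin_index] = max(radii[bin_index], math.hypot(x - cx, y - cy))

    largest = max(radii)
    if largest == 0:
        return radii  # Degenerate contour: all points coincide.
    # Normalizing by the largest radius makes the vector scale-invariant.
    return [r / largest for r in radii]


if __name__ == "__main__":
    square = [(1, 1), (-1, 1), (-1, -1), (1, -1)]
    print(radii_descriptor(square, num_bins=4))
```

A fixed-length vector of this kind could, under these assumptions, be computed per frame and passed to a shallow network; the bin count trades angular resolution against input dimensionality.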
