Towards Adaptive Human-centric Video Anomaly Detection: A Comprehensive Framework and A New Benchmark

Armin Danesh Pazho, Shanle Yao, Ghazal Alinezhad Noghre, Babak Rahimi Ardabili, Vinit Katariya, Hamed Tabkhi

TL;DR

The paper tackles the challenge of robust human-centric video anomaly detection in open, real-world settings by proposing HuVAD, the largest continuously recorded, privacy-enhanced dataset, and UCAL, an unsupervised continual anomaly learning framework that enables per-environment adaptation. It defines a standard HuVAD-S benchmark and introduces HuVAD-C for continual learning, demonstrating that UCAL-augmented models achieve state-of-the-art performance in a majority of cases and significantly improve adaptation over static training. The contributions include a rigorous privacy-preserving annotation pipeline, diverse real-world scenes, a comprehensive multi-metric evaluation (AUC-ROC, AUC-PR, EER, 10ER), and a first continual learning benchmark for human-centric VAD, with practical impact for deploying adaptive, privacy-conscious surveillance systems.

Abstract

Human-centric Video Anomaly Detection (VAD) aims to identify human behaviors that deviate from the norm. At its core, human-centric VAD faces substantial challenges, such as the complexity of diverse human behaviors, the rarity of anomalies, and ethical constraints. These challenges limit access to high-quality datasets and highlight the need for a dataset and framework supporting continual learning. Moving towards adaptive human-centric VAD, we introduce the HuVAD (Human-centric privacy-enhanced Video Anomaly Detection) dataset and a novel Unsupervised Continual Anomaly Learning (UCAL) framework. UCAL enables incremental learning, allowing models to adapt over time and bridging traditional training and real-world deployment. HuVAD prioritizes privacy by providing de-identified annotations and includes seven indoor/outdoor scenes, offering over 5x more pose-annotated frames than previous datasets. Our standard and continual benchmarks utilize a comprehensive set of metrics, demonstrating that UCAL-enhanced models achieve superior performance in 82.14% of cases, setting a new state-of-the-art (SOTA). The dataset can be accessed at https://github.com/TeCSAR-UNCC/HuVAD.

Paper Structure (15 sections, 7 figures, 3 tables, 1 algorithm)

Figures (7)

  • Figure 1: Sample anomalies and their annotations in the newly proposed benchmark, the HuVAD dataset. Cropped for visualization purposes. Segmentation is solely used for demonstration purposes.
  • Figure 2: The camera views excluding people. The ratio has been adjusted to fit the manuscript.
  • Figure 3: Samples from CSC Camera. Segmentation is solely used for demonstration purposes.
  • Figure 4: Pose counts per camera across major multi-camera datasets (SHT [liu2018future], CHAD [danesh2023chad], and NWPUC [Cao_2023_CVPR]).
  • Figure 5: IoU and crowd density across key datasets (SHT [liu2018future], IITB [rodrigues2020multi], CHAD [danesh2023chad], and NWPUC [Cao_2023_CVPR]) and HuVAD's camera views.
  • ...and 2 more figures