Weakly-Supervised Anomaly Detection in Surveillance Videos Based on Two-Stream I3D Convolution Network

Sareh Soltani Nejad, Anwar Haque

TL;DR

A weakly supervised anomaly detection system for urban surveillance video, built on Two-Stream Inflated 3D (I3D) Convolutional Networks and Multiple Instance Learning, that aims to be more adaptable, efficient, and context-aware than approaches based on traditional 3D Convolutional Networks (C3D).

Abstract

The widespread implementation of urban surveillance systems has necessitated more sophisticated techniques for anomaly detection to ensure enhanced public safety. This paper presents a significant advancement in the field of anomaly detection through the application of Two-Stream Inflated 3D (I3D) Convolutional Networks. These networks substantially outperform traditional 3D Convolutional Networks (C3D) by more effectively extracting spatial and temporal features from surveillance videos, thus improving the precision of anomaly detection. Our research advances the field by implementing a weakly supervised learning framework based on Multiple Instance Learning (MIL), which uniquely conceptualizes surveillance videos as collections of 'bags' that contain instances (video clips). Each instance is innovatively processed through a ranking mechanism that prioritizes clips based on their potential to display anomalies. This novel strategy not only enhances the accuracy and precision of anomaly detection but also significantly diminishes the dependency on extensive manual annotations. Moreover, through meticulous optimization of model settings, including the choice of optimizer, our approach not only establishes new benchmarks in the performance of anomaly detection systems but also offers a scalable and efficient solution for real-world surveillance applications. This paper contributes significantly to the field of computer vision by delivering a more adaptable, efficient, and context-aware anomaly detection system, which is poised to redefine practices in urban surveillance.
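
To make the bag-and-instance idea concrete, here is a minimal sketch of how a ranking objective over MIL bags is commonly written. It is not taken from the paper: the hinge form, the `margin` value, and the function names `mil_ranking_hinge_loss` and `rank_clips` are all assumptions for illustration. Each video is a bag of per-clip anomaly scores, and the loss pushes the top-scoring clip of an anomalous video above the top-scoring clip of a normal video.

```python
from typing import List, Sequence


def mil_ranking_hinge_loss(
    anomalous_bag_scores: Sequence[float],
    normal_bag_scores: Sequence[float],
    margin: float = 1.0,  # assumed value, not specified in the abstract
) -> float:
    """Hinge ranking loss between one anomalous bag and one normal bag.

    Only video-level labels are needed: the highest-scoring clip in each
    bag stands in for the whole bag (the usual MIL assumption).
    """
    if not anomalous_bag_scores or not normal_bag_scores:
        raise ValueError("each bag must contain at least one clip score")
    top_anomalous = max(anomalous_bag_scores)
    top_normal = max(normal_bag_scores)
    # Zero loss once the anomalous bag's top clip outranks the normal
    # bag's top clip by at least `margin`.
    return max(0.0, margin - top_anomalous + top_normal)


def rank_clips(scores: Sequence[float]) -> List[int]:
    """Return clip indices ordered from most to least likely anomalous."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    anomalous = [0.1, 0.9, 0.2]  # hypothetical per-clip scores
    normal = [0.05, 0.1]
    print(mil_ranking_hinge_loss(anomalous, normal))
    print(rank_clips(anomalous))
```

In a full system the scores would come from a model applied to I3D clip features; here they are hard-coded so the sketch stays self-contained.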

Paper Structure

This paper contains 15 sections, 5 equations, 8 figures, 1 table.

Figures (8)

  • Figure 1: High-Level Proposed System Architecture
  • Figure 2: Architecture of the two-stream Inflated 3D Convolutional Neural Networks
  • Figure 3: The architecture of the proposed anomaly detection model
  • Figure 4: Comparative AUC results across different feature extraction models on the UCF-Crime dataset, illustrating the performance enhancements achieved through I3D RGB, I3D Flow, and two-stream network configurations.
  • Figure 5: ROC Curve of the proposed anomaly detection method, illustrating the trade-off between the True Positive Rate (TPR) and False Positive Rate (FPR) across various threshold settings.
  • ...and 3 more figures