Detection of Object Throwing Behavior in Surveillance Videos

Ivo P. C. Kersten, Erkut Akdag, Egor Bondarev, Peter H. N. De With

TL;DR

Anomaly detection performance improves when the Adam optimizer replaces Adadelta and when the proposed mean normal loss function, which covers the multitude of normal situations in traffic, is applied.

Abstract

Anomalous behavior detection is a challenging research area within computer vision. Progress in this area enables automated detection of dangerous behavior in surveillance camera feeds. One dangerous behavior that is often overlooked in other research is the throwing action in traffic flow, whose detection is one of the unique requirements of our Smart City project to enhance public safety. This paper proposes a deep learning solution for throwing action detection in surveillance videos. At present, datasets for throwing actions are not publicly available. To address the use case of our Smart City project, we first generate the novel public Throwing-Action dataset, consisting of 271 videos of throwing actions performed by traffic participants, such as pedestrians, bicyclists, and car drivers, and 130 normal videos without throwing actions. Second, we compare the performance of different feature extractors for our anomaly detection method on the UCF-Crime and Throwing-Action datasets. The explored feature extractors are the Convolutional 3D (C3D) network, the Inflated 3D ConvNet (I3D) network, and the Multi-Fiber Network (MFNet). Finally, we improve the performance of the anomaly detection algorithm by applying the Adam optimizer instead of Adadelta and by proposing a mean normal loss function that covers the multitude of normal situations in traffic. Both changes yield better anomaly detection performance. In addition, the proposed mean normal loss function lowers the false alarm rate on the combined dataset. The experimental results reach an area under the ROC curve of 86.10 on the Throwing-Action dataset and 80.13 on the combined dataset.
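
The area under the ROC curve quoted in the abstract can be computed directly from anomaly scores and ground-truth labels. The sketch below is a minimal illustration, not the authors' evaluation code: the function name `roc_auc` and the toy inputs are hypothetical, and it uses the pairwise-ranking definition of AUC (the fraction of anomalous/normal pairs in which the anomalous item receives the higher score, with ties counted as half). The result lies in [0, 1]; the reported values appear to be on a 0 to 100 scale, which is an assumption here.

```python
import numpy as np


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking (hypothetical helper).

    labels: 1 for anomalous, 0 for normal.
    scores: predicted anomaly scores, higher means more anomalous.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]    # scores of anomalous items
    neg = scores[~labels]   # scores of normal items
    if pos.size == 0 or neg.size == 0:
        raise ValueError("AUC needs at least one anomalous and one normal item")
    # Compare every anomalous score with every normal score.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)
```

For example, `roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` compares four anomalous/normal pairs, three of which are ranked correctly. The pairwise form uses memory proportional to the product of the two group sizes, so a sort-based implementation is preferable for long frame-level score sequences.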

Paper Structure (24 sections, 3 equations, 7 figures, 5 tables)

Figures (7)

  • Figure 1: Example frame from one of the videos in the proposed Throwing-Action dataset, where a pedestrian is throwing an object at another pedestrian.
  • Figure 2: Flow diagram of the proposed methodology. Each video is split into 32 temporal segments. Each video is treated as a bag, and its segments as the instances within this bag. Features are extracted from each segment by the pre-trained C3D, I3D, or MFNet feature extraction network. Next, a fully connected neural network assigns an anomaly score to the features of each segment, resulting in one anomaly score for every instance in the bag. The network is trained with a multiple instance learning ranking loss.
  • Figure 3: Optimal ROC curves obtained on the testing set of the Throwing-Action dataset when the anomaly detection model is trained with the Adadelta and Adam optimizers.
  • Figure 4: Training loss curves of the anomaly detection model on the Throwing-Action dataset for Adadelta and Adam optimizers, showing the mean loss of every batch during training.
  • Figure 5: Visual examples of each type of image augmentation applied to the Throwing-Action dataset. From left to right and top to bottom: original (no augmentation), salt-and-pepper noise, inverted color, horizontal flip, independently scaled color channels, and random shearing.
  • ...and 2 more figures
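
The pipeline described in the Figure 2 caption can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the two-layer scorer, and the way clip features are averaged into segments are all hypothetical, and the feature extractor (C3D, I3D, or MFNet) is assumed to have already produced one feature vector per clip. The loss shows only the hinge-style ranking term that is common in multiple instance learning for anomaly detection; the optional smoothness and sparsity regularizers default to zero because this section does not state whether they are used, and the proposed mean normal loss is not shown because its form is not given here.

```python
import numpy as np

NUM_SEGMENTS = 32  # instances per bag, as stated in the Figure 2 caption


def segment_features(clip_features, num_segments=NUM_SEGMENTS):
    """Average per-clip features into a fixed number of temporal segments.

    clip_features: array of shape (num_clips, feature_dim).
    Returns an array of shape (num_segments, feature_dim).
    """
    n = clip_features.shape[0]
    # Evenly spaced segment boundaries over the clip index range.
    bounds = np.linspace(0, n, num_segments + 1).round().astype(int)
    out = np.empty((num_segments, clip_features.shape[1]))
    for i in range(num_segments):
        start = min(bounds[i], n - 1)
        stop = max(bounds[i + 1], start + 1)  # at least one clip per segment
        out[i] = clip_features[start:stop].mean(axis=0)
    return out


def score_segments(features, w1, b1, w2, b2):
    """Hypothetical two-layer fully connected scorer with a sigmoid output."""
    hidden = np.maximum(features @ w1 + b1, 0.0)  # ReLU
    logits = hidden @ w2 + b2
    return (1.0 / (1.0 + np.exp(-logits))).ravel()  # one score per segment


def mil_ranking_loss(anomalous_scores, normal_scores, margin=1.0,
                     lambda_smooth=0.0, lambda_sparse=0.0):
    """Hinge ranking loss between an anomalous bag and a normal bag.

    The highest-scoring instance of the anomalous bag should outrank the
    highest-scoring instance of the normal bag by at least `margin`.
    """
    hinge = max(0.0, margin - anomalous_scores.max() + normal_scores.max())
    # Optional regularizers (assumed, disabled by default).
    smooth = np.sum(np.diff(anomalous_scores) ** 2)  # temporal smoothness
    sparse = np.sum(anomalous_scores)                # few anomalous segments
    return hinge + lambda_smooth * smooth + lambda_sparse * sparse
```

In use, each training step would pair one anomalous video with one normal video, pass both through `segment_features` and `score_segments`, and minimize `mil_ranking_loss` over the scorer weights with an optimizer such as Adam. Only video-level labels are needed, since the loss looks at the maximum score per bag rather than at per-segment annotations.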