CKNN: Cleansed k-Nearest Neighbor for Unsupervised Video Anomaly Detection

Jihun Yi, Sungroh Yoon

TL;DR

This work tackles unsupervised video anomaly detection (UVAD) under the Anomaly Cluster challenge, where anomalies in the training data form dense clusters in feature space and mislead nearest-neighbor-based detectors. It introduces Cleansed k-Nearest Neighbor (CKNN), which explicitly cleanses training objects using appearance and motion pseudo-anomaly scores before performing k-NN scoring, and stores the cleansed objects in two feature banks (appearance and motion) for anomaly assessment. CKNN achieves state-of-the-art UVAD performance across benchmark datasets and approaches the performance of one-class VAD (OCVAD) methods trained on anomaly-free data, with a practical, real-time-capable pipeline. The method reduces the reliance on anomaly-free data and provides a principled guideline for hyperparameter selection, offering a viable path to deploying automated video anomaly detection in real-world surveillance systems.

Abstract

In this paper, we address the problem of unsupervised video anomaly detection (UVAD). The task aims to detect abnormal events in test video using unlabeled videos as training data. The presence of anomalies in the training data poses a significant challenge in this task, particularly because they form clusters in the feature space. We refer to this property as the "Anomaly Cluster" issue. The condensed nature of these anomalies makes it difficult to distinguish between normal and abnormal data in the training set. Consequently, training conventional anomaly detection techniques using an unlabeled dataset often leads to sub-optimal results. To tackle this difficulty, we propose a new method called Cleansed k-Nearest Neighbor (CKNN), which explicitly filters out the Anomaly Clusters by cleansing the training dataset. Following the k-nearest neighbor algorithm in the feature space provides powerful anomaly detection capability. Although the identified Anomaly Cluster issue presents a significant challenge to applying k-nearest neighbor in UVAD, our proposed cleansing scheme effectively addresses this problem. We evaluate the proposed method on various benchmark datasets and demonstrate that CKNN outperforms the previous state-of-the-art UVAD method by up to 8.5% (from 82.0 to 89.0) in terms of AUROC. Moreover, we emphasize that the performance of the proposed method is comparable to that of the state-of-the-art method trained using anomaly-free data.
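
The cleanse-then-k-NN idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the pseudo-anomaly scores are taken as given inputs (this excerpt does not say how they are computed), and `drop_ratio`, the Euclidean distance, the mean-of-k-distances score, and the toy 2-D data are all assumptions chosen for illustration.

```python
import numpy as np


def cleanse(features, pseudo_scores, drop_ratio=0.1):
    """Keep the training features with the lowest pseudo-anomaly scores.

    drop_ratio is a hypothetical hyperparameter: the fraction of training
    objects discarded as suspected anomalies.
    """
    n_keep = max(1, int(round(len(features) * (1.0 - drop_ratio))))
    keep_idx = np.argsort(pseudo_scores)[:n_keep]
    return features[keep_idx]


def knn_score(queries, bank, k=1):
    """Score each query by its mean Euclidean distance to its k nearest bank features."""
    dists = np.linalg.norm(queries[:, None, :] - bank[None, :, :], axis=-1)
    k = min(k, bank.shape[0])
    nearest = np.sort(dists, axis=1)[:, :k]
    return nearest.mean(axis=1)


def demo(seed=0):
    """Toy example: a normal cluster plus a dense anomaly cluster in the training set."""
    rng = np.random.default_rng(seed)
    normal = rng.normal(0.0, 0.5, size=(200, 2))
    anomalies = rng.normal(5.0, 0.1, size=(20, 2))
    train = np.concatenate([normal, anomalies])

    # Stand-in pseudo-anomaly scores (assumed): higher for the anomaly cluster.
    pseudo = np.concatenate([rng.uniform(0.0, 0.5, 200), rng.uniform(0.5, 1.0, 20)])

    query = np.array([[5.0, 5.0]])  # a test object that resembles the training anomalies
    raw = knn_score(query, train, k=1)[0]
    cleansed = knn_score(query, cleanse(train, pseudo, drop_ratio=0.1), k=1)[0]
    return raw, cleansed


if __name__ == "__main__":
    raw, cleansed = demo()
    print(f"score with uncleansed bank: {raw:.3f}")
    print(f"score with cleansed bank:   {cleansed:.3f}")
```

With the uncleansed bank, the abnormal query finds close neighbors inside the anomaly cluster and receives a low score. Once the cluster is removed, its nearest neighbors are normal objects far away, so the score rises.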

Paper Structure

This paper contains 44 sections, 7 equations, 17 figures, 9 tables, and 2 algorithms.

Figures (17)

  • Figure 1: Anomaly Cluster issue. (a) UVAD methods use unlabeled videos that contain some anomalies as training data. (b) In those videos, each abnormal event is captured in multiple frames. The captured abnormal objects in each frame are similar to each other, and they form a cluster in feature space, as normal objects do. We refer to this as the "Anomaly Cluster" issue.
  • Figure 2: t-SNE of training object features.
  • Figure 3: The effect of increasing $k$ in k-NN.
  • Figure 4: Overall flow of the proposed method. (a) In the training phase, an object detector extracts objects from the input frames, and CKNN cleanses the objects using two types of pseudo-anomaly scores: appearance and motion. The cleansed objects are stored as features in two feature banks, one per score type. (b) At inference, CKNN estimates appearance and motion anomaly scores for each object in the test frames using the k-NN algorithm. The sum of the two scores is the anomaly score of each object, and the maximum object score in a test frame becomes that frame's final anomaly score.
  • Figure 5: UVAD evaluation protocols using datasets having anomaly-free train splits.
  • ...and 12 more figures
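
The score aggregation described in the Figure 4(b) caption (sum the appearance and motion scores per object, then take the maximum over the objects in a frame) can be sketched as follows. The function name and the `empty_score` fallback for frames with no detected objects are assumptions; the caption does not say how such frames are handled.

```python
import numpy as np


def frame_anomaly_score(appearance_scores, motion_scores, empty_score=0.0):
    """Combine per-object scores into one frame-level anomaly score.

    Each object's score is appearance + motion; the frame's score is the
    maximum over its objects. empty_score (assumed) is returned when the
    frame contains no detected objects.
    """
    appearance = np.asarray(appearance_scores, dtype=float)
    motion = np.asarray(motion_scores, dtype=float)
    if appearance.shape != motion.shape:
        raise ValueError("appearance and motion scores must align per object")
    if appearance.size == 0:
        return empty_score
    return float(np.max(appearance + motion))


if __name__ == "__main__":
    # Three detected objects; the third has the largest combined score.
    print(frame_anomaly_score([0.1, 0.7, 0.2], [0.3, 0.1, 0.9]))
```

Taking the maximum means a single highly anomalous object is enough to flag the whole frame, regardless of how many normal objects appear alongside it.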