VideoPatchCore: An Effective Method to Memorize Normality for Video Anomaly Detection

Sunghyun Ahn, Youngwan Jo, Kijung Lee, Sanghyun Park

TL;DR

This work proposes an effective memory method for VAD that prioritizes memory optimization and configures three types of memory tailored to the characteristics of video data, achieving good performance comparable to state-of-the-art methods.

Abstract

Video anomaly detection (VAD) is a crucial task in video analysis and surveillance within computer vision. Currently, VAD is gaining attention with memory techniques that store the features of normal frames. The stored features are used for frame reconstruction, and an abnormality is identified when a significant difference exists between the reconstructed and input frames. However, this approach faces several challenges because the memory and the encoder-decoder model must be optimized simultaneously. These challenges include increased optimization difficulty, complexity of implementation, and performance variability depending on the memory size. To address these challenges, we propose an effective memory method for VAD, called VideoPatchCore. Inspired by PatchCore, our approach introduces a structure that prioritizes memory optimization and configures three types of memory tailored to the characteristics of video data. This method effectively addresses the limitations of existing memory-based methods, achieving good performance comparable to state-of-the-art methods. Furthermore, our method requires no training and is straightforward to implement, making VAD tasks more accessible. Our code is available online at github.com/SkiddieAhn/Paper-VideoPatchCore.
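To make the training-free, memory-first idea in the abstract concrete, the following is a minimal sketch of a PatchCore-style memory bank, not the authors' implementation. It assumes that features have already been extracted by some frozen encoder (random vectors stand in for them here), that the memory is compressed by greedy farthest-point (k-center) subsampling, and that the anomaly score is the distance to the nearest memory entry. The names `greedy_coreset` and `anomaly_score`, the feature dimension, and the memory size are all illustrative assumptions.

```python
import numpy as np


def greedy_coreset(features: np.ndarray, k: int) -> np.ndarray:
    """Select up to k rows by greedy farthest-point (k-center) sampling."""
    selected = [0]  # arbitrary starting point (assumption)
    # Distance from every feature to its closest already-selected feature.
    min_dist = np.linalg.norm(features - features[0], axis=1)
    while len(selected) < min(k, len(features)):
        idx = int(np.argmax(min_dist))  # feature least covered so far
        selected.append(idx)
        dist = np.linalg.norm(features - features[idx], axis=1)
        min_dist = np.minimum(min_dist, dist)
    return features[selected]


def anomaly_score(memory: np.ndarray, query: np.ndarray) -> np.ndarray:
    """Distance from each query feature to its nearest memory entry."""
    diff = query[:, None, :] - memory[None, :, :]
    return np.linalg.norm(diff, axis=2).min(axis=1)


# Stand-in data: random vectors in place of real encoder features.
rng = np.random.default_rng(0)
normal_features = rng.normal(0.0, 1.0, size=(500, 16))

# "Memory optimization" here is only the subsampling step; no model is trained.
memory = greedy_coreset(normal_features, k=50)

test_normal = rng.normal(0.0, 1.0, size=(5, 16))
test_abnormal = rng.normal(6.0, 1.0, size=(5, 16))  # shifted distribution
scores_normal = anomaly_score(memory, test_normal)
scores_abnormal = anomaly_score(memory, test_abnormal)
```

In this toy setting the shifted samples lie far from every stored entry, so their scores are larger than those of the in-distribution samples. The three memory types described in the abstract would each need their own feature extractor and bank; how their scores are combined is not covered by this sketch.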

Paper Structure (24 sections, 7 equations, 8 figures, 8 tables)

Figures (8)

  • Figure 1: (a) Conventional memory-augmented methods simultaneously optimize both the memory and the encoder-decoder model. (b) The proposed VideoPatchCore focuses on memory optimization.
  • Figure 2: Architecture of VideoPatchCore (VPC). It consists of two streams (local and global) and three memory banks (spatial, temporal, and high-level semantic). The Spatial Memory Bank deals with the appearance information of objects to identify inappropriate objects. The Temporal Memory Bank handles the motion information of objects to detect inappropriate actions. The High-level Semantic Memory Bank processes the global context of frames to identify anomalies related to multiple objects or scenes.
  • Figure 3: Patch Partition Method. 'objects' denotes the number of objects (n); objects are processed in parallel as a batch.
  • Figure 4: Comparison of the anomaly scores between VideoPatchCore and PatchCore in the Avenue dataset.
  • Figure S1: Comparison of anomaly scores between VPC and PC in the SHTech dataset.
  • ...and 3 more figures