Constricting Normal Latent Space for Anomaly Detection with Normal-only Training Data

Marcella Astrid; Muhammad Zaigham Zaheer; Seung-Ik Lee

TL;DR

The paper tackles anomaly detection with only normal training data by training an autoencoder and introducing a latent-space constriction loss to prevent the model from reconstructing anomalies. The authors formulate a total loss $L = L^R + \lambda L^C$, where $L^C$ constrains latent features $F_{j,i} \in \mathbb{R}^{T' \times C'}$ to lie inside or on a norm sphere, via two variants: inside-sphere and on-sphere. They evaluate on Ped2, Avenue, and ShanghaiTech, using PSNR-based normalcy scoring to derive frame-level anomaly scores, and demonstrate that both constriction strategies improve over a baseline while maintaining zero extra test-time cost. The results position the proposed method as competitive with memory-based approaches and favorable for practical deployment in video anomaly detection, given its simplicity and effectiveness without requiring pseudo anomalies or memory modules. The work contributes a direct training-time constraint on latent space that enhances discriminability between normal and anomalous inputs while preserving reconstruction-based decision making.
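
The total loss above can be illustrated with a minimal NumPy sketch. The exact form of $L^C$ is not given in this summary, so the hinge-style penalty for the inside-sphere variant and the absolute-deviation penalty for the on-sphere variant are assumptions, as are the function names, the latent layout `(H', W', T', C')`, and the use of a Frobenius norm over each $F_{j,i}$.

```python
import numpy as np

def feature_norms(latent):
    # latent: array of shape (H', W', T', C'); each latent[j, i] is one F_{j,i}.
    # Frobenius norm over the last two axes (T', C'); this choice is an assumption.
    return np.linalg.norm(latent, axis=(-2, -1))

def inside_sphere_loss(latent, alpha):
    # Assumed form: penalize only features whose norm exceeds the radius alpha.
    return float(np.mean(np.maximum(0.0, feature_norms(latent) - alpha)))

def on_sphere_loss(latent, alpha):
    # Assumed form: penalize any deviation of the norm from the radius alpha.
    return float(np.mean(np.abs(feature_norms(latent) - alpha)))

def total_loss(recon, target, latent, alpha, lam, variant="inside"):
    # L = L^R + lambda * L^C, with mean squared error standing in for L^R.
    rec = float(np.mean((recon - target) ** 2))
    if variant == "inside":
        con = inside_sphere_loss(latent, alpha)
    else:
        con = on_sphere_loss(latent, alpha)
    return rec + lam * con
```

Under these assumptions, the inside-sphere variant leaves features with norm below $\alpha$ unpenalized, while the on-sphere variant pulls every feature norm toward $\alpha$. Both terms act only during training, which is consistent with the claim of zero extra test-time cost.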

Abstract

To build an anomaly detection model using only normal training data, an autoencoder (AE) is typically trained to reconstruct the data. As a result, the AE learns to extract normal representations in its latent space. At test time, since the AE is not trained on real anomalies, it is expected to reconstruct anomalous data poorly. However, several researchers have observed that this is not the case. In this work, we propose to limit the reconstruction capability of the AE by introducing a novel latent constriction loss, which is added to the existing reconstruction loss. Our method adds no extra computational cost to the AE at test time. Evaluations on three video anomaly detection benchmark datasets, i.e., Ped2, Avenue, and ShanghaiTech, demonstrate the effectiveness of our method in limiting the reconstruction capability of the AE, which leads to a better anomaly detection model.
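
Because the method changes only training, test-time scoring still depends on how well each frame is reconstructed. The sketch below shows one common way to turn reconstruction quality into frame-level anomaly scores using PSNR. The per-video min-max normalization, the `max_val` pixel range, the small epsilon, and the function names are assumptions for illustration, not details confirmed by this summary.

```python
import numpy as np

def psnr(frame, recon, max_val=1.0):
    # Peak signal-to-noise ratio between a frame and its reconstruction.
    # The small epsilon avoids division by zero for a perfect reconstruction.
    mse = np.mean((frame - recon) ** 2)
    return 10.0 * np.log10(max_val ** 2 / (mse + 1e-12))

def anomaly_scores(frames, recons, max_val=1.0):
    # One PSNR value per frame of a single video.
    p = np.array([psnr(f, r, max_val) for f, r in zip(frames, recons)])
    span = p.max() - p.min()
    # Min-max normalize PSNR to a normalcy score in [0, 1] (assumed scheme).
    normalcy = (p - p.min()) / span if span > 0 else np.ones_like(p)
    # Poorly reconstructed frames have low PSNR, hence high anomaly score.
    return 1.0 - normalcy
```

With this scheme, the frame with the worst reconstruction in a video receives score 1 and the best-reconstructed frame receives score 0, so a more constricted latent space that degrades anomalous reconstructions widens the gap between the two.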

Paper Structure (15 sections, 5 equations, 4 figures, 1 table)

This paper contains 15 sections, 5 equations, 4 figures, 1 table.

Introduction
Related Works
Methodology
Training AE to represent normal data
Constricting the normal data latent space
Constricting inside the sphere
Constricting on the surface of the sphere
Inference
Experiments
Datasets
Experimental setup
Comparisons with the baseline
Comparisons with other methods
Hyperparameters evaluation
Conclusion

Figures (4)

Figure 1: Overall configuration of our proposed method, which consists of an autoencoder (AE) trained using a reconstruction loss and our proposed latent constriction loss. The latent constriction loss restrains the normal features into a smaller space in order to limit the reconstruction capability of the AE.
Figure 2: Two types of the proposed latent constriction losses: (a) constricting inside the norm sphere, (b) constricting on the surface of the norm sphere.
Figure 3: Qualitative comparisons of the baseline and our method on test samples from each dataset. The top and bottom rows show the AE outputs and the corresponding reconstruction error heatmaps. Each heatmap is the reconstruction error within a frame, min-max normalized over that frame. Red boxes mark the anomalous regions. Our method distorts the anomalous regions more than the baseline does.
Figure 4: Evaluation of the hyperparameters used in our method on the Ped2 dataset when constricting (a)-(b) inside the norm sphere and (c)-(d) on the surface of the norm sphere. Different values of the loss weight $\lambda$ (Equation \ref{eq:totalloss}) and the constricting norm $\alpha$ (Equations \ref{eq:insidesphere} and \ref{eq:onsphere}) are used. Since our method (red solid line) outperforms the baseline (green dotted line) for most settings, it is robust to the choice of hyperparameter values.
