Exploiting Autoencoder's Weakness to Generate Pseudo Anomalies

Marcella Astrid, Muhammad Zaigham Zaheer, Djamila Aouada, Seung-Ik Lee

TL;DR

The paper tackles autoencoder-based anomaly detection by addressing a known weakness: AEs often reconstruct anomalies well. It introduces a cooperative two-network system in which a noise generator G learns adaptive noise ΔX = G(X^N) and adds it to normal data to form pseudo anomalies X^P = X^N + ΔX that stay within the AE's current reconstruction boundary, while the autoencoder F learns to reconstruct normal data well and pseudo anomalies poorly. Experiments on Ped2, Avenue, ShanghaiTech, CIFAR-10, and KDDCUP99 show improved discriminability and performance competitive with state-of-the-art methods, without relying on strong inductive biases. The approach is shown to apply across video, image, and network intrusion tasks, with favorable test-time efficiency and robust hyperparameter behavior.
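As a rough illustration of the quantities named above, the following sketch builds a pseudo anomaly X^P = X^N + ΔX and evaluates two toy objectives. Only the additive construction and the use of reconstruction error come from the summary. The exact loss forms, the weight `lam`, the choice of normal data as the target for pseudo-anomaly inputs, and all function names are assumptions made for illustration, not the authors' implementation.

```python
from typing import Callable, List

Vector = List[float]
Autoencoder = Callable[[Vector], Vector]


def mse(a: Vector, b: Vector) -> float:
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def make_pseudo_anomaly(x_normal: Vector, delta_x: Vector) -> Vector:
    """X^P = X^N + dX, the additive construction stated in the summary."""
    return [x + d for x, d in zip(x_normal, delta_x)]


def anomaly_score(x: Vector, f: Autoencoder) -> float:
    """Reconstruction error of F on x; a larger value suggests an anomaly."""
    return mse(f(x), x)


def generator_loss(x_normal: Vector, delta_x: Vector,
                   f: Autoencoder, lam: float = 0.1) -> float:
    """ASSUMED objective for G: keep F's reconstruction error on X^P low
    (stay inside the reconstruction boundary) while rewarding larger noise.
    `lam` is a hypothetical trade-off weight."""
    x_pseudo = make_pseudo_anomaly(x_normal, delta_x)
    noise_size = sum(abs(d) for d in delta_x) / len(delta_x)
    return mse(f(x_pseudo), x_pseudo) - lam * noise_size


def autoencoder_loss(x_normal: Vector, delta_x: Vector,
                     f: Autoencoder, use_pseudo: bool) -> float:
    """ASSUMED objective for F: the target is always the normal data, so a
    pseudo-anomaly input is penalized when F reproduces the added noise."""
    x_in = make_pseudo_anomaly(x_normal, delta_x) if use_pseudo else x_normal
    return mse(f(x_in), x_normal)
```

Under these assumptions, an autoencoder that copies its input perfectly (the weakness the paper targets) has zero loss on normal inputs but a nonzero loss on pseudo-anomaly inputs, which is the signal that would push F away from reconstructing the added noise.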

Abstract

Because anomalous events are rare, a typical approach to anomaly detection is to train an autoencoder (AE) on normal data only, so that it learns the patterns or representations of the normal training data. At test time, the trained AE is expected to reconstruct normal data well and anomalous data poorly. Contrary to this expectation, however, anomalous data is often reconstructed well too. To further separate the reconstruction quality of normal and anomalous data, we propose creating pseudo anomalies from learned adaptive noise by exploiting this weakness of the AE, i.e., reconstructing anomalies too well. The generated noise is added to the normal data to create pseudo anomalies. Extensive experiments on the Ped2, Avenue, ShanghaiTech, CIFAR-10, and KDDCUP datasets demonstrate the effectiveness and generic applicability of our approach in improving the discriminative capability of AEs for anomaly detection.

Paper Structure (46 sections, 11 equations, 10 figures, 8 tables)

Figures (10)

  • Figure 1: Illustration of how our method limits the reconstruction capability of an AE across training iterations: (a) the AE can reconstruct both normal and anomalous data, (b) the noise generator generates noise $\Delta X$ to produce pseudo anomalies within the reconstruction boundary of the AE, (c) the AE learns to poorly reconstruct the pseudo anomalies, (d) pseudo anomalies are generated to adapt to the new reconstruction boundary, and (e) the AE learns to poorly reconstruct the new pseudo anomalies.
  • Figure 2: Comparison of pseudo anomaly generation using (a) skipping frames [astrid2021learning, astrid2021synthetic], (b) patching [astrid2021learning], and (c) our method. Our method is learnable and does not impose any strong inductive bias.
  • Figure 3: Our method consists of a main autoencoder $\mathcal{F}$ and a noise generator $\mathcal{G}$ that are trained alternately: (a) A pseudo anomaly instance is constructed by adding noise to the normal data, where the noise is generated by $\mathcal{G}$. $\mathcal{G}$ learns to generate as much noise as possible while $\mathcal{F}$ can still reconstruct the resulting pseudo anomalies. In other words, $\mathcal{G}$ is trained to generate anomalies (maximizing noise) that are within the reconstruction boundary of $\mathcal{F}$ (minimizing reconstruction loss). (b) $\mathcal{F}$ is trained not to reconstruct anomalies when the inputs are generated pseudo anomalies and to reconstruct normal data when the inputs are normal. At test time, only $\mathcal{F}$ is used. The contrast of $\Delta X$ has been adjusted for visualization clarity.
  • Figure 4: Visualizations of (a) pseudo anomalies constructed from a normal frame by adding Gaussian noise with various $\sigma$ values, where the random noise amplitude is affected by $\sigma$; and (b) noise generated by $\mathcal{G}$ and the respective pseudo anomalies generated using our proposed mechanism for learning to generate pseudo anomalies, across different training iterations. Compared to the random noise in (a), the noise generated by our proposed mechanism changes with training iterations as $\mathcal{G}$ adapts to the reconstruction boundary of $\mathcal{F}$.
  • Figure 5: The distribution of reconstruction errors of the baseline and our model for normal (blue) and anomalous (red) data in several videos. It is evident that the reconstruction error distribution becomes more discriminative with our model compared to the baseline.
  • ...and 5 more figures