Anomaly Detection with Variance Stabilized Density Estimation

Amit Rozner; Barak Battash; Henry Li; Lior Wolf; Ofir Lindenbaum

TL;DR

The paper addresses the challenge of effective anomaly detection in tabular data by reframing density estimation through variance stabilization. It introduces variance-stabilized density estimation (VSDE), formulated as a regularized likelihood objective that penalizes the variance of the log-density around normal samples, and implements it with autoregressive probabilistic normalized networks (PNNs) plus a spectral ensemble over feature permutations. Empirical evidence on 52 public datasets demonstrates state-of-the-art performance and robustness to hyperparameter choices, with ablations confirming the importance of variance regularization and permutation-based ensemble components. The approach reduces the need for dataset-specific tuning and provides a scalable, principled framework for density-based anomaly detection with strong practical impact on real-world tabular data.

Abstract

We propose a modified density estimation problem that is highly effective for detecting anomalies in tabular data. Our approach assumes that the density function is relatively stable (with lower variance) around normal samples. We have verified this hypothesis empirically using a wide range of real-world data. Then, we present a variance-stabilized density estimation problem for maximizing the likelihood of the observed samples while minimizing the variance of the density around normal samples. To obtain a reliable anomaly detector, we introduce a spectral ensemble of autoregressive models for learning the variance-stabilized distribution. We have conducted an extensive benchmark with 52 datasets, demonstrating that our method leads to state-of-the-art results while alleviating the need for data-specific hyperparameter tuning. Finally, we have used an ablation study to demonstrate the importance of each of the proposed components, followed by a stability analysis evaluating the robustness of our model.
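The trade-off described in the abstract, maximizing likelihood while keeping the density stable around normal samples, can be illustrated with a small sketch. The paper's exact objective is not reproduced in this summary, so the form below (mean negative log-likelihood plus `lam` times the variance of the per-sample log-density) is an assumption, and the names `vsde_loss` and `lam` are hypothetical.

```python
import numpy as np

def vsde_loss(log_probs, lam=1.0):
    """Assumed form of a variance-stabilized objective (not the paper's exact equation).

    log_probs: per-sample log-density values of normal training samples.
    lam: hypothetical weight of the variance penalty; lam=0 recovers plain NLL.
    """
    log_probs = np.asarray(log_probs, dtype=float)
    nll = -log_probs.mean()          # standard negative log-likelihood term
    var_penalty = log_probs.var()    # variance of the log-density over the samples
    return nll + lam * var_penalty

# Two candidate density estimates with the same mean log-likelihood:
flat = [-1.0, -1.0, -1.0, -1.0]      # stable log-density around the samples
peaked = [-0.5, -1.5, -0.5, -1.5]    # same mean, higher variance

loss_flat = vsde_loss(flat, lam=1.0)
loss_peaked = vsde_loss(peaked, lam=1.0)
```

Under this assumed form, plain maximum likelihood (`lam=0`) cannot distinguish the two estimates, while any positive `lam` prefers the flatter one.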

Paper Structure (26 sections, 7 equations, 8 figures, 6 tables)

Introduction
Related work
Method
Problem Definition
Intuition
Empirical Evidence
Regularized density estimation
Feature permutation ensemble
Experiments
Synthetic Evaluation
Real Data
Baseline methods
Results
Ablation Study
Variance stabilization
...and 11 more sections

Figures (8)

Figure 1: (a) The proposed framework for anomaly detection. Our method involves using multiple versions of permuted tabular data, which are fed into a Probabilistic Normalized Network (PNN). The PNN is designed to model the density of normal samples as uniform in a compact domain. Each PNN is trained to minimize a regularized negative log-likelihood loss (see Eq. \ref{eq:full_objective}). Since our PNN is implemented using an autoregressive model, we use a spectral ensemble of the learned log-likelihood functions as an anomaly score for unseen samples. (b) Illustration of the proposed variance-stabilized density estimation (VSDE) vs. standard (un-regularized) maximum likelihood estimation (MLE) for one-dimensional data. During training, the VSDE learns a more "stable" density estimate around normal samples. This results in a better likelihood estimate for distinguishing between normal and abnormal samples at test time. These findings are supported empirically by our experimental results.
Figure 2: Evaluation of our "stable" density assumption. We plot the mean log-likelihood variance ratio between normal and anomalous samples (see definition in the Intuition section below) for 52 publicly available tabular datasets. Values above the dashed line are greater than 1. Our results indicate that in most datasets, the density is more stable (with lower variance) around normal samples than anomalies. This corroborates our assumptions and motivates our proposed modified density estimation problem for anomaly detection.
Figure 3: Synthetic example demonstrating the effect of density stabilization. White dots represent normal samples $x_n\in X_N$, while yellow represents anomalies $x_a\in X_A$. (a): scaled unregularized log-likelihood estimation. (b): the proposed scaled regularized log-likelihood estimate. Using the proposed stabilized density estimate (right) improved the AUC of anomaly detection from 79.8 to 98.3 in this example.
Figure 4: (a) A Dolan-Moré performance profile \cite{dolan2002benchmarking} comparing AUC scores of 8 algorithms applied to 52 tabular datasets. For each method and each value of $\theta$ ($x$-axis), we calculate the fraction of datasets on which the method's AUC is at least $\theta$ times the best AUC for the corresponding dataset. Specifically, for a given method we calculate $\frac{1}{N_{data}}\sum_{j} \mathbb{1}\left[\text{AUC}_{j}\geq \theta \cdot \text{AUC}^{\text{best}}_{j}\right]$, where $\text{AUC}^{\text{best}}_{j}$ is the best AUC for dataset $j$ and $N_{data}$ is the number of datasets. The ideal algorithm would achieve the best score on all datasets and thus reach the top-left corner of the plot for $\theta=1$. Our algorithm yields better results than all baselines, surpassing ICL on values between $\theta=0.95$ and $\theta=0.82$. Furthermore, our method covers all datasets (ratio equals 1) for $\theta=0.82$ and outperforms the second best, ICL \cite{shenkar2022anomaly}, which achieves the same at $\theta=0.69$. This suggests that using our method on all datasets will never be worse than the leading method by more than 18%. (b) Box plots comparing the results of all methods on the 52 evaluated datasets. Each box presents the mean (red) and median (black) as well as other statistics (Q1, Q3, etc.).
Figure 5: Stability analysis for the regularization parameter $\lambda$, which balances the likelihood and the variance loss. $\lambda=0$ indicates that no variance loss is applied. The numbers present the ratio between the AUC and the AUC obtained without regularization ($\lambda=0$). This heatmap indicates the advantage of the proposed regularization for anomaly detection. Furthermore, observe the stability of the AUC for different values of $\lambda$.
...and 3 more figures
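The performance profile described in the Figure 4 caption can be computed with a few lines of NumPy. This is a minimal sketch, not the authors' code: the function name `performance_profile` and the toy AUC values are hypothetical, and the best AUC per dataset is assumed to be taken over the compared methods.

```python
import numpy as np

def performance_profile(auc, thetas):
    """Fraction of datasets on which each method reaches theta * best AUC.

    auc: array of shape (n_methods, n_datasets) with AUC scores.
    thetas: 1-D array of thresholds in (0, 1].
    Returns an array of shape (n_methods, n_thetas).
    """
    auc = np.asarray(auc, dtype=float)
    thetas = np.asarray(thetas, dtype=float)
    # Assumption: "best" means the best score among the compared methods.
    best = auc.max(axis=0)
    # covered[m, j, t] is True when method m reaches thetas[t] * best[j] on dataset j.
    covered = auc[:, :, None] >= thetas[None, None, :] * best[None, :, None]
    # Averaging over datasets gives the ratio plotted on the y-axis.
    return covered.mean(axis=1)

# Toy example (made-up numbers): 2 methods, 3 datasets.
toy_auc = [[0.9, 0.8, 0.7],
           [0.8, 0.8, 1.0]]
toy_thetas = [1.0, 0.9, 0.7]
profile = performance_profile(toy_auc, toy_thetas)
```

Each row of `profile` is one curve of the plot. The smallest $\theta$ at which a row reaches 1 bounds how far that method can fall below the best score on any dataset.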
