A Latent Space Correlation-Aware Autoencoder for Anomaly Detection in Skewed Data

Padmaksha Roy

TL;DR

This work tackles unsupervised anomaly detection in skewed, high-dimensional sensor data by pairing a kernelized autoencoder with a robust Mahalanobis distance (MD) in the latent space and a mutual-information-based regularizer that preserves input-space correlations. The resulting DLSCA-AE optimizes a multi-objective loss that balances latent-space correlation (measured by the robust MD) against reconstruction fidelity, so that it can detect both near and far anomalies. Empirical results on cybersecurity and medical datasets show improved reconstruction metrics (MSE/MAE) and better separation of anomalies than several baselines, along with insights into hyperparameter settings and training stability. Overall, the method advances robust latent-space anomaly detection in skewed, correlated data and suggests avenues for improving out-of-distribution (OOD) generalization through cross-domain correlation bounding.

Abstract

Unsupervised anomaly detection in latent space has gained importance because discriminating anomalies from normal data becomes difficult in high-dimensional space. Both density-estimation and distance-based methods for detecting anomalies in latent space have been explored in the past. These methods show that retaining valuable properties of the input data in latent space helps in better reconstruction of test data. Moreover, real-world sensor data is skewed and non-Gaussian in nature, which makes mean-based estimators unreliable. In addition, anomaly detection methods based on reconstruction error rely on Euclidean distance, which ignores useful correlation information in the feature space and fails to reconstruct the data accurately when it deviates from the training distribution. In this work, we address the limitations of reconstruction-error-based autoencoders and propose a kernelized autoencoder that leverages a robust form of the Mahalanobis distance (MD) to measure latent-dimension correlation and effectively detect both near and far anomalies. This hybrid loss is aided by the principle of maximizing the mutual information gain between the latent space and the high-dimensional prior data space: the entropy of the latent space is maximized while useful correlation information of the original data is preserved in the low-dimensional latent space. The multi-objective function has two goals: it measures correlation information in the latent feature space in the form of the robust MD, and it simultaneously tries to preserve useful correlation information from the original data space in the latent space by maximizing the mutual information between the prior and the latent space.
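To make the idea of a robust Mahalanobis distance in latent space concrete, here is a minimal NumPy sketch. The abstract does not say which robust estimator the paper uses, so the choices below are assumptions made only for illustration: the coordinate-wise median serves as the location estimate, and the scatter matrix is computed around that median rather than the mean. The function name `robust_mahalanobis`, the `eps` regularizer, and the synthetic "latent codes" are all hypothetical and not taken from the paper.

```python
import numpy as np

def robust_mahalanobis(latent, reference, eps=1e-6):
    """Distance of each row of `latent` from the median of `reference`.

    Assumption: median as location and scatter around the median;
    the paper's actual robust estimator may differ.
    """
    center = np.median(reference, axis=0)          # robust location
    centered = reference - center
    scatter = centered.T @ centered / (len(reference) - 1)
    scatter += eps * np.eye(scatter.shape[0])      # keep the matrix invertible
    precision = np.linalg.inv(scatter)
    diff = latent - center
    # Row-wise quadratic form diff^T * precision * diff.
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, precision, diff))

rng = np.random.default_rng(0)
# Skewed, correlated stand-in for the latent codes of normal data.
base = rng.exponential(scale=1.0, size=(500, 2))
normal_codes = base @ np.array([[1.0, 0.6], [0.0, 1.0]])
far_point = np.array([[25.0, 25.0]])               # synthetic far anomaly

scores_normal = robust_mahalanobis(normal_codes, normal_codes)
score_far = robust_mahalanobis(far_point, normal_codes)
```

Unlike a plain Euclidean distance, the quadratic form accounts for the correlation between latent dimensions, so a point is scored by how unusual it is relative to the joint spread of the normal codes.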

Paper Structure (14 sections, 2 theorems, 37 equations, 10 figures, 3 tables)

Introduction
Related Work
Problem Formulation
Robust Hybrid Error with MD in Latent Space
Objective function
Experiments
Datasets
Baseline Methods
Ablation Study
Hyperparameter Sensitivity
Conclusion
Proof of Theorem 1
Proof of Theorem 2
Histograms of Features and Hyperparameter Sensitivity

Key Result

Theorem

The sample median is $2/n$ times more efficient than the sample mean for the exponential distribution. This is consistent with the fact that a skewed distribution has a heavy right tail, which makes the mean sensitive to outliers and skewness. Here, $n$ denotes the sample size.
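The sensitivity of the mean to outliers in skewed data can be illustrated with a small simulation. This sketch does not verify the efficiency factor stated in the theorem; it only shows, under assumed settings (an exponential sample of size 1000 with 1% of the values replaced by a gross outlier of 1000.0), how much each estimator moves when the sample is contaminated. All names and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)

# Clean, right-skewed sample from an exponential distribution.
clean = rng.exponential(scale=1.0, size=1000)

# Contaminate 1% of the sample with a gross outlier (assumed value).
contaminated = clean.copy()
contaminated[:10] = 1000.0

# How far does each location estimate move under contamination?
mean_shift = abs(contaminated.mean() - clean.mean())
median_shift = abs(np.median(contaminated) - np.median(clean))

print(f"shift of the mean:   {mean_shift:.3f}")
print(f"shift of the median: {median_shift:.3f}")
```

The mean moves by roughly the outlier magnitude times the contamination fraction, while the median moves only by a few order statistics, which is the motivation for median-based estimators in skewed data.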

Figures (10)

Figure 1: Deep Latent Space Correlation-Aware Autoencoder (DLSCA-AE).
Figure 2: (a), (b) show the reconstructed space when the robust MD is used as a regularizer, and (c), (d) show the reconstructed space when the standard MD with mean and covariance is used as a regularizer.
Figure 3: (a), (b), (c), (d) show the validation error during training with sample sizes 50, 100, 200, and 300, respectively. We see that the training is more stable when the batch size is higher during each epoch of training.
Figure 4: Histogram of normal data features of the CSE-CIC-IDS dataset.
Figure 5: Histogram of near-anomaly features (less skewed) in the CSE-CIC-IDS dataset.
...and 5 more figures

Theorems & Definitions (4)

Theorem 1
Theorem 2
Proof of Theorem 1
Proof of Theorem 2
