Real-Time Anomaly Detection with Synthetic Anomaly Monitoring (SAM)

Emanuele Luzio, Moacir Antonelli Ponti

TL;DR

This work tackles real-time anomaly detection by introducing Synthetic Anomaly Monitoring (SAM), a causal-inference–driven method that treats each feature as a separate control unit and detects anomalies as deviations from feature-wise counterfactuals. The framework uses synthetic control concepts to estimate per-feature counterfactuals $S_i$ and computes residuals $\Delta_i = X_i - S_i$ (with optional normalization) to form interpretable anomaly scores, enabling online inference via a linear dot-product structure. SAM is evaluated against standard baselines (Isolation Forest, LOF, kNN, One-Class SVM) across five real datasets, with multiple variants incorporating RANSAC and normalization; results show robust ROC AUC and PR AUC performance, particularly on high-signal datasets like http and kdd. The study highlights SAM’s balance of accuracy and interpretability, suggesting strong potential for real-time monitoring in dynamic environments, and points to future work on explainability and incremental online learning. Overall, SAM provides a principled, causality-aware alternative to traditional anomaly detectors with demonstrated cross-domain effectiveness.
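As a rough illustration of the residual computation described above, the following is a minimal sketch, not the authors' implementation. It assumes each counterfactual $S_i$ is an ordinary least-squares combination of the other features, fitted on data presumed to be mostly normal. The function names (`fit_sam`, `sam_residuals`, `sam_score`), the intercept term, normalization by the training residual standard deviation, and the sum-of-squares aggregation are all illustrative assumptions; the RANSAC variants mentioned in the summary are not shown.

```python
import numpy as np

def fit_sam(X_train):
    """Fit one least-squares model per feature, using the other features as donors.

    Assumption: plain OLS with an intercept stands in for the paper's estimator.
    """
    n, d = X_train.shape
    W = np.zeros((d, d))  # W[j, i]: weight of feature j in the counterfactual of feature i
    b = np.zeros(d)       # per-feature intercepts
    for i in range(d):
        donors = [j for j in range(d) if j != i]  # feature i never predicts itself
        A = np.column_stack([X_train[:, donors], np.ones(n)])
        coef, *_ = np.linalg.lstsq(A, X_train[:, i], rcond=None)
        W[donors, i] = coef[:-1]
        b[i] = coef[-1]
    # Training residual spread, used for the optional normalization (assumed form).
    resid = X_train - (X_train @ W + b)
    sigma = resid.std(axis=0) + 1e-12
    return W, b, sigma

def sam_residuals(x, W, b, sigma=None):
    """Return Delta_i = X_i - S_i for one sample (or a batch), optionally normalized."""
    s = x @ W + b  # all counterfactuals at once: one dot product per feature
    delta = x - s
    return delta if sigma is None else delta / sigma

def sam_score(x, W, b, sigma):
    """Aggregate normalized residuals into one score (sum of squares is an assumption)."""
    z = sam_residuals(x, W, b, sigma)
    return float(np.sum(z ** 2))
```

Because scoring a new sample reduces to `x @ W + b`, it needs only a matrix-vector product at inference time, which is what makes the online setting cheap. The per-feature residuals returned by `sam_residuals` can be inspected directly to see which features deviate from their counterfactuals.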

Abstract

Anomaly detection is essential for identifying rare and significant events across diverse domains such as finance, cybersecurity, and network monitoring. This paper presents Synthetic Anomaly Monitoring (SAM), an innovative approach that applies synthetic control methods from causal inference to improve both the accuracy and interpretability of anomaly detection processes. By modeling normal behavior through the treatment of each feature as a control unit, SAM identifies anomalies as deviations within this causal framework. We conducted extensive experiments comparing SAM with established benchmark models, including Isolation Forest, Local Outlier Factor (LOF), k-Nearest Neighbors (kNN), and One-Class Support Vector Machine (SVM), across five diverse datasets, including Credit Card Fraud, HTTP Dataset CSIC 2010, and KDD Cup 1999, among others. Our results demonstrate that SAM consistently delivers robust performance, highlighting its potential as a powerful tool for real-time anomaly detection in dynamic and complex environments.

Paper Structure

This paper contains 13 sections, 13 equations, 3 figures, 2 tables, 2 algorithms.

Figures (3)

  • Figure 1: Critical difference plots for each method across all datasets, considering ROC AUC and PR AUC metrics. SAM-best considers the best scenario of SAM for each dataset ($p$-value$=0.07$).
  • Figure 2: ROC AUC Curves of the experiment
  • Figure 3: PR AUC Curves of the experiment