Dependency-based Anomaly Detection: a General Framework and Comprehensive Evaluation
Sha Lu, Lin Liu, Kui Yu, Thuc Duy Le, Jixue Liu, Jiuyong Li
TL;DR
DepAD is a general, dependency-based anomaly-detection framework that reframes unsupervised anomaly detection as supervised feature selection and prediction. It works in three phases: relevant-variable selection, per-variable prediction, and robust anomaly scoring. Each phase can be filled with an off-the-shelf technique, so users can build domain-tailored detectors and interpret each detected anomaly through its deviations from the expected dependencies. Across 32 real-world datasets, two instantiations, FBED-CART-PS and FBED-CART-Sum, outperform nine state-of-the-art baselines in most settings and perform strongly on high-dimensional data. The paper also gives practical guidance on technique selection and suggests ensemble extensions to further improve robustness and coverage of dependency-based anomalies.
Abstract
Anomaly detection is crucial for understanding unusual behaviors in data, as anomalies offer valuable insights. This paper introduces Dependency-based Anomaly Detection (DepAD), a general framework that utilizes variable dependencies to uncover meaningful anomalies with better interpretability. DepAD reframes unsupervised anomaly detection as supervised feature selection and prediction tasks, which allows users to tailor anomaly detection algorithms to their specific problems and data. We extensively evaluate representative off-the-shelf techniques for the DepAD framework. Two DepAD algorithms emerge as all-rounders and superior performers in handling a wide range of datasets compared to nine state-of-the-art anomaly detection methods. Additionally, we demonstrate that DepAD algorithms provide new and insightful interpretations for detected anomalies.
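The three-phase idea (select relevant variables, predict each variable from them, score by deviation) can be illustrated with a minimal sketch. This is not the authors' implementation. As assumptions made only for illustration, absolute Pearson correlation stands in for the relevant-variable selection step, a leave-one-out k-nearest-neighbour average stands in for the per-variable predictor, and the score is a plain sum of standardized deviations. The function names (`select_relevant`, `predict_knn`, `depad_scores`), the parameters `k` and `n_neighbors`, and the toy data are all hypothetical.

```python
import math
import random
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (sx * sy)


def select_relevant(cols, target, k):
    """Phase 1 stand-in: the k other variables most correlated with the target."""
    others = [j for j in range(len(cols)) if j != target]
    others.sort(key=lambda j: abs(pearson(cols[j], cols[target])), reverse=True)
    return others[:k]


def predict_knn(rows, target, relevant, n_neighbors):
    """Phase 2 stand-in: leave-one-out k-NN estimate of the target variable."""
    preds = []
    for i, row in enumerate(rows):
        # Distance to every other row, measured only on the relevant variables.
        dists = sorted(
            (math.dist([row[j] for j in relevant], [other[j] for j in relevant]), m)
            for m, other in enumerate(rows)
            if m != i
        )
        nearest = [m for _, m in dists[:n_neighbors]]
        preds.append(mean(rows[m][target] for m in nearest))
    return preds


def depad_scores(rows, k=1, n_neighbors=5):
    """Phase 3 stand-in: sum over variables of standardized |observed - predicted|."""
    cols = [list(c) for c in zip(*rows)]
    scores = [0.0] * len(rows)
    for target in range(len(cols)):
        relevant = select_relevant(cols, target, k)
        preds = predict_knn(rows, target, relevant, n_neighbors)
        devs = [abs(o - p) for o, p in zip(cols[target], preds)]
        mu, sigma = mean(devs), pstdev(devs)
        if sigma == 0:
            continue  # this variable carries no deviation signal
        for i, d in enumerate(devs):
            scores[i] += (d - mu) / sigma
    return scores


# Toy data (assumed): x1 depends on x0, x2 is independent noise.
random.seed(0)
rows = []
for _ in range(60):
    x0 = random.uniform(0, 10)
    rows.append([x0, 2 * x0 + random.gauss(0, 0.1), random.gauss(0, 1)])
# Each value of this row is within the normal range, but x1 != 2 * x0.
rows.append([2.0, 16.0, 0.0])

scores = depad_scores(rows)
top = max(range(len(rows)), key=lambda i: scores[i])
print("highest-scoring row:", top, rows[top])
```

The injected row is unremarkable on every variable taken alone; it stands out only because it breaks the dependency between x0 and x1, which is the kind of anomaly a dependency-based detector targets. The per-variable deviations also indicate which dependencies were violated, which is where the interpretability comes from.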
