AcME-AD: Accelerated Model Explanations for Anomaly Detection

Valentina Zaccaria, David Dandolo, Chiara Masiero, Gian Antonio Susto

TL;DR

AcME-AD introduces a model-agnostic, perturbation-based framework for explaining tabular anomaly detection. It derives four local sub-scores (Delta, Ratio, Change, Distance-to-change) from feature perturbations and aggregates them into a local importance score $I_j(\mathbf{x}) = w_D D_j + w_C C_j + w_Q Q_j + w_R R_j$. It enables rapid what-if visualizations and a global interpretability view, and it accounts for both the anomaly score $m(\mathbf{x})$ and the predicted class obtained via a threshold $t$. The weighting scheme is flexible, and the approach relies on pre-computed statistics, which makes it faster than KernelSHAP. The authors validate AcME-AD on synthetic data and on the real-world Glass and Satellite datasets, showing feature rankings competitive with KernelSHAP and LocalDIFFI, substantial runtime gains, and a usable proxy evaluation based on feature selection. The work provides open-source code and demonstrates practical impact for real-time root-cause analysis and decision making in anomaly detection contexts.
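The aggregation step in the formula above can be illustrated with a minimal sketch. It is not the authors' implementation: the sub-scores $D_j$, $C_j$, $Q_j$, $R_j$ are taken as given, since their definitions are not stated here, and the function names, the equal weights, the example values, and the convention that a score above $t$ means "anomaly" are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence

SUB_SCORES = ("D", "C", "Q", "R")  # Delta, Change, Distance-to-change, Ratio


def local_importance(sub_scores: Dict[str, Sequence[float]],
                     weights: Dict[str, float]) -> List[float]:
    """Weighted sum I_j = w_D*D_j + w_C*C_j + w_Q*Q_j + w_R*R_j per feature j."""
    n_features = len(sub_scores["D"])
    return [
        sum(weights[k] * sub_scores[k][j] for k in SUB_SCORES)
        for j in range(n_features)
    ]


def rank_features(importance: Sequence[float]) -> List[int]:
    """Feature indices sorted from most to least important."""
    return sorted(range(len(importance)), key=lambda j: importance[j], reverse=True)


def predicted_class(score: float, t: float) -> int:
    """Assumed convention: 1 (anomaly) when the score m(x) exceeds t, else 0."""
    return 1 if score > t else 0


# Hypothetical sub-scores for three features of one sample (not from the paper).
example_scores = {
    "D": [0.8, 0.1, 0.3],
    "C": [1.0, 0.0, 0.0],
    "Q": [0.5, 0.0, 0.0],
    "R": [0.6, 0.2, 0.4],
}
# Equal weights are an assumption; the source only says the weighting is flexible.
example_weights = {"D": 0.25, "C": 0.25, "Q": 0.25, "R": 0.25}

importance = local_importance(example_scores, example_weights)
ranking = rank_features(importance)
```

With these made-up inputs, feature 0 ranks first because it dominates every sub-score. Changing the weights shifts the emphasis between sub-scores without recomputing them.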

Abstract

Pursuing fast and robust interpretability in Anomaly Detection is crucial, especially due to its significance in practical applications. Traditional Anomaly Detection methods excel in outlier identification but are often black boxes, providing scant insights into their decision-making process. This lack of transparency compromises their reliability and hampers their adoption in scenarios where comprehending the reasons behind anomaly detection is vital. At the same time, getting explanations quickly is paramount in practical scenarios. To bridge this gap, we present AcME-AD, a novel approach rooted in Explainable Artificial Intelligence principles, designed to clarify Anomaly Detection models for tabular data. AcME-AD transcends the constraints of model-specific or resource-heavy explainability techniques by delivering a model-agnostic, efficient solution for interpretability. It offers local feature importance scores and a what-if analysis tool, shedding light on the factors contributing to each anomaly, thus aiding root cause analysis and decision-making. This paper elucidates AcME-AD's foundation and its benefits over existing methods, and validates its effectiveness with tests on both synthetic and real datasets. AcME-AD's implementation and experiment replication code is accessible in a public repository.

Paper Structure (17 sections, 6 equations, 10 figures, 2 tables)

Figures (10)

  • Figure 1: The what-if analysis tool for local interpretability of the prediction computed by the Isolation Forest described in the real-world datasets section, for Sample 2 of the Glass Dataset.
  • Figure 2: The Single Feature Exploration plot used to interpret the prediction computed by the Isolation Forest described in the real-world datasets section, for Sample 2 of the Glass Dataset.
  • Figure 3: AcME-AD global importance scores computed to interpret the Isolation Forest model trained on the Glass Dataset, as described in the real-world datasets section.
  • Figure 4: Synthetic datasets projected onto the first two dimensions.
  • Figure 5: Comparison of feature rankings computed by AcME-AD and KernelSHAP for an Isolation Forest trained on the synthetic dataset, as detailed in the synthetic datasets section. AcME-AD is in the left column, KernelSHAP in the right one. The first row shows the feature ranking resulting from outliers along the x-axis, the second row the ranking resulting from outliers along the y-axis, and the third row the ranking resulting from outliers along bisec.
  • ...and 5 more figures