xai_evals : A Framework for Evaluating Post-Hoc Local Explanation Methods

Pratinav Seth, Yashwardhan Rathore, Neeraj Kumar Singh, Chintan Chitroda, Vinay Kumar Sankarapu

TL;DR

xai_evals addresses the need for standardized evaluation of post-hoc local explanations for ML/DL models by providing a unified Python package that generates, benchmarks, and evaluates explanations across tabular and image data. It integrates popular explainers such as SHAP, LIME, Grad-CAM, Integrated Gradients, and DlBacktrace, and offers a multi-metric evaluation suite covering faithfulness, sensitivity, and robustness. The framework supports both traditional ML models and deep learning architectures, enabling cross-domain benchmarking and reproducibility. By making explanation quality measurable and comparable, this work supports transparency, trust, and regulatory compliance, with practical implications for high-stakes AI deployment.

Abstract

The growing complexity of machine learning and deep learning models has led to an increased reliance on opaque "black box" systems, making it difficult to understand the rationale behind predictions. This lack of transparency is particularly challenging in high-stakes applications where interpretability is as important as accuracy. Post-hoc explanation methods are commonly used to interpret these models, but they are seldom rigorously evaluated, raising concerns about their reliability. The Python package xai_evals addresses this by providing a comprehensive framework for generating, benchmarking, and evaluating explanation methods across both tabular and image data modalities. It integrates popular techniques such as SHAP, LIME, Grad-CAM, Integrated Gradients (IG), and Backtrace, while supporting evaluation metrics such as faithfulness, sensitivity, and robustness. xai_evals enhances the interpretability of machine learning models, fostering transparency and trust in AI systems. The library is open-sourced at https://pypi.org/project/xai-evals/.
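To make the idea of a faithfulness metric concrete, the following is a minimal sketch of one common style of check: remove each feature in turn and test whether the resulting drop in the model output correlates with the attribution assigned to that feature. It uses only NumPy and does not use the xai_evals API; the function name `faithfulness_correlation`, the zero baseline, and the toy linear model are assumptions made for illustration, not the package's implementation.

```python
import numpy as np


def faithfulness_correlation(predict, x, attributions, baseline=0.0):
    """Pearson correlation between attributions and single-feature output drops.

    Hypothetical helper for illustration only (not part of xai_evals).
    Each feature is replaced by `baseline` one at a time, and the change in
    the model output is compared with the attribution for that feature.
    """
    x = np.asarray(x, dtype=float)
    base_output = predict(x)
    drops = []
    for i in range(x.size):
        x_masked = x.copy()
        x_masked[i] = baseline  # "remove" feature i
        drops.append(base_output - predict(x_masked))
    return float(np.corrcoef(attributions, drops)[0, 1])


# Toy linear model: with a zero baseline, weight * input is an exact attribution.
weights = np.array([3.0, -1.0, 0.5, 0.0])


def predict(v):
    return float(weights @ v)


x = np.array([1.0, 2.0, 2.0, 5.0])
good_attributions = weights * x             # matches the true per-feature effect
bad_attributions = good_attributions[::-1]  # same values, wrong features

good_score = faithfulness_correlation(predict, x, good_attributions)
bad_score = faithfulness_correlation(predict, x, bad_attributions)
print(good_score, bad_score)
```

In this toy setting the exact attributions score close to 1, while the misassigned attributions score lower. Real evaluations typically perturb subsets of features and average over many inputs rather than relying on a single instance.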

Paper Structure

This paper contains 29 sections, 13 equations, 2 figures, 3 tables.

Figures (2)

  • Figure 1: xai_evals Library Overview.
  • Figure 2: Illustration of a Grad-CAM attribution map overlaid on an image sample. The attribution map highlights the regions of the image that contributed most to the model's prediction.