BSED: Baseline Shapley-Based Explainable Detector

Michihiro Kuroki, Toshihiko Yamasaki

TL;DR

BSED (Baseline Shapley-based Explainable Detector) is a model-agnostic XAI method for object detection that extends Baseline Shapley to detections so that its explanations satisfy the explainability axioms. It replaces the prohibitively expensive exact Shapley computation with a multilayer, Monte Carlo-based approximation, reducing the cost from $O(2^{|N_f|})$ to $O(N)$ inferences while generating attribution maps that reflect both positive and negative contributions. The score function combines localization (IoU) and class-score similarity with respect to the target detection. The approach is reported to outperform existing saliency methods on benchmark metrics (EPG, Deletion, Insertion) across multiple detectors on COCO/VOC, with improved robustness to parameter choices. The work also demonstrates practical applications such as correcting detections and provides extensive axiom-based evaluations, arguing that BSED is the first XAI method for object detection that is generalizable and quantitatively expresses contributions to predictions.
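To make the score function concrete, the following is a minimal sketch of one way to combine localization and class-score similarity. The summary above only states that the two are combined; multiplying IoU by the cosine similarity of the class-probability vectors and taking the maximum over detections is an assumption made here for illustration, and the names `iou`, `cosine`, and `detection_score` are hypothetical rather than taken from the paper.

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cosine(u, v):
    """Cosine similarity of two class-probability vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0


def detection_score(target, detections):
    """Assumed combination: best IoU * class similarity over all detections.

    Each detection is a dict with a "box" and a "probs" entry (hypothetical layout).
    Returns 0.0 when the detector finds nothing in the masked image.
    """
    return max(
        (iou(target["box"], d["box"]) * cosine(target["probs"], d["probs"])
         for d in detections),
        default=0.0,
    )


target = {"box": (0, 0, 2, 2), "probs": [0.9, 0.1]}
shifted = {"box": (1, 1, 3, 3), "probs": [0.9, 0.1]}
print(detection_score(target, [shifted]))
```

With this form, a detection scores highly only if it both overlaps the target box and agrees with it on class, which is the behavior the summary describes.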

Abstract

Explainable artificial intelligence (XAI) has witnessed significant advances in the field of object recognition, with saliency maps being used to highlight image features relevant to the predictions of learned models. Although these advances have made AI-based technology more interpretable to humans, several issues have come to light. Some approaches present explanations that are irrelevant to the predictions and cannot guarantee the validity of XAI (axioms). In this study, we propose the Baseline Shapley-based Explainable Detector (BSED), which extends the Shapley value to object detection, thereby enhancing the validity of interpretation. The Shapley value can attribute the prediction of a learned model to a baseline feature while satisfying the explainability axioms. The processing cost of BSED is within a reasonable range, whereas computing the original Shapley value is prohibitively expensive. Furthermore, BSED is a generalizable method that can be applied to various detectors in a model-agnostic manner and can interpret various detection targets without fine-grained parameter tuning. These strengths can make XAI practically applicable. We present quantitative and qualitative comparisons with existing methods to demonstrate the superior performance of our method in terms of explanation validity. Moreover, we present some applications, such as correcting detections based on explanations from our method.
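The cost argument can be illustrated on a toy function. The sketch below estimates Baseline Shapley attributions by plain permutation sampling, a generic Monte Carlo estimator; it is not the paper's multilayer approximation, and the function name `baseline_shapley_mc`, its parameters, and the toy score `f` are assumptions introduced here. Features absent from a coalition are set to their baseline value, and each sampled ordering costs one evaluation per feature instead of one per subset.

```python
import random


def baseline_shapley_mc(f, x, baseline, num_permutations=200, seed=0):
    """Estimate Baseline Shapley values of f at x by sampling feature orderings.

    Absent features take their baseline value. For each random ordering,
    features are switched from baseline to input one at a time, and the
    change in f is credited to the feature that was just switched.
    """
    rng = random.Random(seed)
    n = len(x)
    attributions = [0.0] * n
    for _ in range(num_permutations):
        order = list(range(n))
        rng.shuffle(order)
        z = list(baseline)          # start from the all-baseline input
        prev = f(z)
        for i in order:
            z[i] = x[i]             # reveal feature i
            cur = f(z)
            attributions[i] += cur - prev
            prev = cur
    return [a / num_permutations for a in attributions]


def f(z):
    """Toy score with an interaction between features 0 and 1 (hypothetical)."""
    return z[0] * z[1] + 2.0 * z[2]


x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
attr = baseline_shapley_mc(f, x, baseline)
print(attr, sum(attr), f(x) - f(baseline))
```

Because the marginal contributions telescope within each ordering, the attributions sum to `f(x) - f(baseline)` up to floating-point rounding. Attributions can also be negative when revealing a feature lowers the score.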

Paper Structure

This paper contains 30 sections, 41 equations, 16 figures, 3 tables.

Figures (16)

  • Figure 1: Comparison results with existing methods in interpreting the car detection of YOLOv5s.
  • Figure 2: Saliency maps generated from D-RISE explaining detection results. The parameter $p$ indicates the percentage of the non-masked area in the images for input samplings.
  • Figure 3: Overview of BSED. The explanation target ${\bm d_t}$ is on the input image $X$. The detector $\phi$ obtains perturbed detections ${\bm d_j}$ from the masked image $M^k \odot X$. The similarity between the explanation target and the perturbed detections is used to calculate the attribution of each pixel. An attribution map $A$ is generated as an explanation for obtaining the target detection.
  • Figure 4: Attribution maps on the explanation for obtaining the target detection of a cat. The explanation results from different object detectors are compared.
  • Figure 5: Scatter plot showing the relationship between $\Delta f$ and $a_p$. The kernel density function was used to indicate high densities in red and low densities in blue.
  • ...and 11 more figures