BSED: Baseline Shapley-Based Explainable Detector
Michihiro Kuroki, Toshihiko Yamasaki
TL;DR
BSED (Baseline Shapley-based Explainable Detector) is a model-agnostic XAI method for object detection that extends Baseline Shapley to detections so that its attributions satisfy the explainability axioms. It replaces the prohibitively expensive exact Shapley computation with a multilayer, Monte Carlo-based approximation, reducing the cost from $O(2^{|N_f|})$ to $O(N)$ inferences while generating attribution maps that reflect both positive and negative contributions. The score function combines localization (IoU) and class-score similarity to target a specific detection. The approach is reported to outperform existing saliency methods on benchmark metrics (EPG, Deletion, Insertion) across multiple detectors on COCO and VOC, with improved robustness to parameter choices. The work also demonstrates practical applications such as correcting detections and provides extensive axiom-based evaluations, arguing that BSED is the first XAI method for object detection that is both generalizable and able to express contributions to predictions quantitatively.
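To make the Monte Carlo idea concrete, the following is a minimal sketch of a generic permutation-sampling estimator for Baseline Shapley, together with a detection score built from IoU and class-score similarity. It is not the multilayer approximation used by BSED: the function names (`iou`, `detection_score`, `mc_baseline_shapley`), the use of cosine similarity, and the product form of the score are assumptions made for illustration only.

```python
import math
import random


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_score(target_box, target_probs, box, probs):
    """Hypothetical score: IoU times cosine similarity of class scores."""
    dot = sum(p * q for p, q in zip(target_probs, probs))
    norm = math.sqrt(sum(p * p for p in target_probs)) * math.sqrt(
        sum(q * q for q in probs)
    )
    cos = dot / norm if norm > 0 else 0.0
    return iou(target_box, box) * cos


def mc_baseline_shapley(value_fn, x, baseline, num_permutations=200, seed=0):
    """Estimate Baseline Shapley values by sampling feature orderings.

    value_fn maps a feature list to a scalar score. Features start at
    their baseline values and are switched to the input values one at a
    time; each switch's change in score is credited to that feature.
    Cost: num_permutations * (len(x) + 1) calls to value_fn.
    """
    rng = random.Random(seed)
    n = len(x)
    phi = [0.0] * n
    for _ in range(num_permutations):
        order = list(range(n))
        rng.shuffle(order)
        current = list(baseline)
        prev = value_fn(current)
        for i in order:
            current[i] = x[i]
            new = value_fn(current)
            phi[i] += new - prev  # marginal contribution of feature i
            prev = new
    return [p / num_permutations for p in phi]
```

In a detection setting, `value_fn` would wrap a detector call on a partially masked image and return the score against the target detection, so the resulting values can be positive or negative. Within each sampled ordering the marginal contributions telescope, so the estimates always sum to the score of the input minus the score of the baseline.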
Abstract
Explainable artificial intelligence (XAI) has witnessed significant advances in the field of object recognition, with saliency maps being used to highlight image features relevant to the predictions of learned models. Although these advances have made AI-based technology more interpretable to humans, several issues have come to light. Some approaches present explanations that are irrelevant to the predictions and cannot guarantee the validity of XAI, that is, they do not satisfy the explainability axioms. In this study, we propose the Baseline Shapley-based Explainable Detector (BSED), which extends the Shapley value to object detection, thereby enhancing the validity of interpretation. The Shapley value can attribute the prediction of a learned model to a baseline feature while satisfying the explainability axioms. The processing cost of BSED stays within a reasonable range, whereas the original Shapley value is prohibitively expensive to compute. Furthermore, BSED is a generalizable method that can be applied to various detectors in a model-agnostic manner and can interpret various detection targets without fine-grained parameter tuning. These strengths can make XAI practically applicable. We present quantitative and qualitative comparisons with existing methods to demonstrate the superior performance of our method in terms of explanation validity. Moreover, we present some applications, such as correcting detections based on explanations from our method.
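For reference, the standard definition of Baseline Shapley can be written as follows. The notation is assumed here rather than taken from the text: $N_f$ is read as the set of features, $x$ as the input, $x'$ as the baseline, and $f$ as the scalar score being explained.

$$
v(S) = f\left(x_S;\, x'_{N_f \setminus S}\right), \qquad
\phi_i = \sum_{S \subseteq N_f \setminus \{i\}} \frac{|S|!\,\left(|N_f| - |S| - 1\right)!}{|N_f|!}\,\bigl(v(S \cup \{i\}) - v(S)\bigr)
$$

Here $v(S)$ evaluates the model with the features in $S$ taken from the input and all others taken from the baseline. Evaluating every subset requires $2^{|N_f|}$ inferences, which is why exact computation is impractical for images. The attributions satisfy the efficiency axiom, $\sum_{i \in N_f} \phi_i = f(x) - f(x')$.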
