Guidelines For The Choice Of The Baseline in XAI Attribution Methods
Cristian Morasso, Giorgio Dolci, Ilaria Boscolo Galazzo, Sergey M. Plis, Gloria Menegaz
TL;DR
This paper tackles the fragility of attribution maps produced by baseline-guided XAI methods by analyzing how the choice of baseline affects explanations and by proposing a practical solution. It introduces Informed Baseline Search (IBS), a sampling algorithm guided by the decision boundary (DB) that identifies the orthogonal projection of a sample onto the inner DB and uses it as the optimal baseline (BL) for Integrated Gradients. In synthetic experiments and comparisons with SplineCAM and DeepView, IBS localizes the DB consistently and improves attribution faithfulness when the optimal BL is used, while also exposing the limitations of suboptimal baselines. The work provides guidelines for baseline selection, a GPU-friendly implementation, and a roadmap for extending the approach to other baseline attribution methods (BAMs) and to more realistic data, with implications for reliability and interpretability in biomedical AI applications.
Abstract
Given the broad adoption of artificial intelligence, it is essential to provide evidence that AI models are reliable, trustworthy, and fair. To this end, the emerging field of eXplainable AI develops techniques to probe such requirements, counterbalancing the hype that drives the pervasiveness of this technology. Among the many facets of this issue, this paper focuses on baseline attribution methods, which derive a feature attribution map at the network input by relying on a "neutral" stimulus usually called the "baseline". The choice of the baseline is crucial, as it determines the explanation of the network behavior. In this framework, this paper has a twofold goal: shedding light on the implications of the choice of the baseline, and providing a simple yet effective method for identifying the best baseline for the task. To achieve this, we propose a decision boundary sampling method: since the baseline, by definition, lies on the decision boundary, the boundary naturally becomes the search domain. Experiments are performed on synthetic examples and validated against state-of-the-art methods. Although limited to this experimental scope, the contribution is relevant because it offers clear guidelines and a simple proxy for baseline selection, reducing ambiguity and enhancing the reliability of, and trust in, deep models.
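
To make the role of the baseline concrete, the following is a minimal sketch on a toy linear classifier. It is not the authors' IBS algorithm: for a linear model the decision boundary is a hyperplane, so the orthogonal projection has a closed form and no sampling is needed. All names (`logit`, `project_onto_boundary`, `integrated_gradients`) and the example weights are assumptions made for illustration.

```python
# Toy illustration only: a linear model with a closed-form boundary projection.
# This is NOT the IBS algorithm from the paper; names and values are hypothetical.

def logit(x, w, b):
    """Linear logit; the decision boundary is the set where logit(x) == 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def project_onto_boundary(x, w, b):
    """Orthogonal projection of x onto the hyperplane w.x + b = 0."""
    scale = logit(x, w, b) / sum(wi * wi for wi in w)
    return [xi - scale * wi for xi, wi in zip(x, w)]

def integrated_gradients(x, baseline, grad_fn, steps=50):
    """Midpoint Riemann-sum approximation of Integrated Gradients."""
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        # Point on the straight path from the baseline to the input.
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        grad = grad_fn(point)
        avg_grad = [a + g / steps for a, g in zip(avg_grad, grad)]
    return [(xi - bi) * a for xi, bi, a in zip(x, baseline, avg_grad)]

w, b = [2.0, -1.0], 0.5
x = [1.0, 3.0]
grad_fn = lambda point: w  # gradient of a linear logit is constant

boundary_baseline = project_onto_boundary(x, w, b)
zero_baseline = [0.0, 0.0]

attr_boundary = integrated_gradients(x, boundary_baseline, grad_fn)
attr_zero = integrated_gradients(x, zero_baseline, grad_fn)

print("boundary baseline:", boundary_baseline)
print("attributions (boundary baseline):", attr_boundary)
print("attributions (zero baseline):", attr_zero)
```

With the boundary baseline the attributions sum to the logit of the input itself, because the baseline's logit is zero. With the all-zeros baseline they sum to the difference between the two logits, so the same input receives a different explanation. This dependence on the baseline is what the paper sets out to examine.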
