Sanity Checks Revisited: An Exploration to Repair the Model Parameter Randomisation Test

Anna Hedström; Leander Weber; Sebastian Lapuschkin; Marina MC Höhne

TL;DR

The paper revisits the Model Parameter Randomisation Test (MPRT) for evaluating explanation faithfulness in XAI and identifies confounds such as top-down randomisation preserving information and noise-sensitive similarity metrics. It introduces sMPRT and eMPRT as remedies: sMPRT denoises attributions via sampling/smoothing, and eMPRT replaces pairwise similarity with an entropy-based attribution complexity measure, quantified through the rate of change under full randomisation. The methods are formalized with definitions for $\Psi_{MPRT}$, $\Psi_{sMPRT}$, and $\Psi_{eMPRT}$ and use $H(\Phi(\cdot))$ to capture attribution complexity (with $n=100$ bins) to enable scalable, architecture-agnostic evaluation. Meta-evaluation on diverse datasets and architectures suggests improved reliability and efficiency, providing a more trustworthy framework for comparing XAI explanations in practice by focusing on sensitivity to parameter perturbations rather than noisy similarity metrics.
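To make the entropy-based complexity idea concrete, the following is a minimal sketch of a histogram entropy over attribution values with $n=100$ bins, and of a relative change in that entropy between an explanation of the original model and one of the fully randomised model. The function names, the use of natural logarithms, and the exact normalisation of the rate of change are assumptions for illustration; they are not taken from the paper's definition of $\Psi_{eMPRT}$.

```python
import numpy as np


def histogram_entropy(attribution, n_bins=100):
    """Shannon entropy (in nats) of the histogram of attribution values."""
    values = np.asarray(attribution, dtype=float).ravel()
    if values.size == 0:
        raise ValueError("attribution must contain at least one value")
    counts, _ = np.histogram(values, bins=n_bins)
    probs = counts / counts.sum()
    probs = probs[probs > 0]  # empty bins contribute nothing (0 * log 0 := 0)
    return float(-np.sum(probs * np.log(probs)))


def complexity_rate_of_change(attr_original, attr_randomised, n_bins=100):
    """Relative change in entropy after randomisation (assumed normalisation)."""
    h_orig = histogram_entropy(attr_original, n_bins)
    h_rand = histogram_entropy(attr_randomised, n_bins)
    if h_orig == 0:
        raise ValueError("original attribution has zero entropy; ratio undefined")
    return (h_rand - h_orig) / h_orig


# Toy data (hypothetical): a sparse, focused map versus a noise-like map.
rng = np.random.default_rng(0)
sparse_map = np.zeros((32, 32))
sparse_map[10:12, 10:12] = 1.0
noisy_map = rng.uniform(-1.0, 1.0, size=(32, 32))

rate = complexity_rate_of_change(sparse_map, noisy_map)
print(f"entropy before: {histogram_entropy(sparse_map):.3f}")
print(f"entropy after:  {histogram_entropy(noisy_map):.3f}")
print(f"rate of change: {rate:.3f}")
```

Under this sketch, an explanation method that is sensitive to the model parameters would be expected to show a clear rise in entropy once the model is fully randomised, so a larger positive rate of change is the desirable outcome. Only single attribution maps are needed as input, so no pairwise similarity between explanations is computed.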

Abstract

The Model Parameter Randomisation Test (MPRT) is widely acknowledged in the eXplainable Artificial Intelligence (XAI) community for its well-motivated evaluative principle: that the explanation function should be sensitive to changes in the parameters of the model function. However, recent works have identified several methodological caveats for the empirical interpretation of MPRT. To address these caveats, we introduce two adaptations to the original MPRT -- Smooth MPRT and Efficient MPRT, where the former minimises the impact that noise has on the evaluation results through sampling and the latter circumvents the need for biased similarity measurements by re-interpreting the test through the explanation's rise in complexity, after full parameter randomisation. Our experimental results demonstrate that these proposed variants lead to improved metric reliability, thus enabling a more trustworthy application of XAI methods.
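The sampling step behind Smooth MPRT can be illustrated with a short sketch: noise is added to the model input several times, and the resulting attributions are averaged to approximate a noise-free signal. The code below assumes Gaussian noise and a generic `explain_fn` callable; both, along with the parameter names `n_samples` and `noise_std`, are hypothetical choices for illustration rather than the paper's implementation.

```python
import numpy as np


def smoothed_attribution(explain_fn, x, n_samples=50, noise_std=0.1, seed=0):
    """Average attributions over noisy copies of the same input."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    # Accumulator with the same shape as a single attribution map.
    total = np.zeros_like(explain_fn(x), dtype=float)
    for _ in range(n_samples):
        noisy_x = x + rng.normal(0.0, noise_std, size=x.shape)
        total += explain_fn(noisy_x)
    return total / n_samples


# Toy explanation function (hypothetical): the gradient of
# f(x) = sum(w * x**2), which is 2 * w * x.
weights = np.linspace(0.1, 1.0, 8)


def toy_gradient(x):
    return 2.0 * weights * x


x = np.arange(8, dtype=float)
clean = toy_gradient(x)
smooth = smoothed_attribution(toy_gradient, x, n_samples=2000, noise_std=0.1)
print("max deviation from clean attribution:", np.max(np.abs(smooth - clean)))
```

In a full sMPRT run, the same averaging would be applied to the explanations of both the original and the randomised model before they are compared, so that the similarity measurement is less affected by noise in individual attribution maps.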

Paper Structure (6 sections, 8 equations, 3 figures)

Open Questions / Experiment TODOs
Introduction
From MPRT to sMPRT and eMPRT
Method
Experiments
Appendix

Figures (3)

Figure 1: Schematic visualization of the issues with the MPRT (Adebayo et al., 2018) observed by Binder et al. (red) and our proposed solutions (blue). (I) Top-down layer-wise randomisation of the model does not completely destroy information; in fact, several properties of the forward signal outputted by the unrandomised part $\phi$ are preserved in the output of the whole model $f$. (a) To solve this issue, we employ bottom-up randomisation instead, as suggested by Binder et al. (II) Metrics measuring similarity (or distance) are sensitive to noise, which may affect the ranking results of MPRT to an unknown degree. (b) As an initial solution, we propose to add noise to the model input, and average attributions over several such noisy variations of the same input to extract the approximately noise-free signal, similar to what is proposed by Smilkov et al. (2017). (c) More efficiently, we propose to completely replace the pair-wise comparison via similarity or distance metrics by a complexity measurement taking singular attributions as input and thus circumvent issue (II).
Figure 2: Caption
Figure 3: A graphical representation of the benchmarking results (Table \ref{table-benchmarking-imagenet}), aggregated over 3 iterations with $K=5$ over three datasets {ImageNet, MNIST, fMNIST} and three models {VGG, ResNet18, LeNet} with $L=${Gradient, GradCAM, LRP$_{\epsilon}$, Guided-Backprop}. The first row showcases results for eMPRT and the second row for MPRT. Each column corresponds to a dataset and model combination from left to right: .......... Complexity, Faithfulness, Localisation, Randomisation and Robustness. The grey area indicates the area of an optimally performing estimator, i.e., $\mathbf{m}^{*} = \mathbb{1}^4$. The MC score (indicated in brackets) is averaged over MPT and IPT. Higher values are preferred.

Theorems & Definitions (3)

Definition 1: Model Parameter Randomisation Test
Definition 2: Smooth Model Parameter Randomisation Test
Definition 3: Efficient Model Parameter Randomisation Test
