Normalized AOPC: Fixing Misleading Faithfulness Metrics for Feature Attribution Explainability
Joakim Edin, Andreas Geert Motzfeldt, Casper L. Christensen, Tuukka Ruotsalo, Lars Maaløe, Maria Maistro
TL;DR
The paper argues that AOPC-based faithfulness scores are inherently model-dependent because the lower and upper AOPC limits vary across models and inputs, which makes cross-model comparisons unreliable. It introduces Normalized AOPC (NAOPC), with exact and beam-search-approximated variants, to align these limits and enable meaningful cross-model evaluation of feature attribution methods. Empirical results across five datasets and multiple architectures show that NAOPC can substantially alter the faithfulness ranking of models, challenging conclusions drawn from unnormalized AOPC. The authors provide two implementations (NAOPC_exact and NAOPC_beam), release a PyPI package, and discuss practical guidance on when normalization is necessary and how to manage computational costs.
Abstract
Deep neural network predictions are notoriously difficult to interpret. Feature attribution methods aim to explain these predictions by identifying the contribution of each input feature. Faithfulness, often evaluated using the area over the perturbation curve (AOPC), reflects how accurately feature attributions describe the internal mechanisms of deep neural networks. However, many studies rely on AOPC to compare faithfulness across different models, which we show can lead to false conclusions about models' faithfulness. Specifically, we find that AOPC is sensitive to variations in the model, resulting in unreliable cross-model comparisons. Moreover, AOPC scores are difficult to interpret in isolation without knowing the model-specific lower and upper limits. To address these issues, we propose a normalization approach, Normalized AOPC (NAOPC), enabling consistent cross-model evaluations and more meaningful interpretation of individual scores. Our experiments demonstrate that this normalization can radically change AOPC results, calling into question the conclusions of earlier studies and offering a more robust framework for assessing feature attribution faithfulness. Our code is available at https://github.com/JoakimEdin/naopc.
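
To make the idea of model-specific limits concrete, the following is a minimal sketch and not the released package. It assumes that AOPC is the mean drop in model output as features are masked one at a time in attribution order, and that normalization is a min-max rescaling between the lowest and highest AOPC reachable over all feature orderings (the exhaustive search is practical only for very short inputs). The function names, the zero baseline, and the two toy models are hypothetical and chosen purely for illustration.

```python
from itertools import permutations


def aopc(model, x, order, baseline=0.0):
    """Mean drop in model output as features are masked cumulatively in `order`."""
    full_output = model(x)
    masked = list(x)
    drops = []
    for idx in order:
        masked[idx] = baseline  # assumption: masking replaces a feature with a baseline value
        drops.append(full_output - model(masked))
    return sum(drops) / len(drops)


def aopc_limits_exact(model, x, baseline=0.0):
    """Lower and upper AOPC limits found by trying every feature ordering."""
    scores = [aopc(model, x, order, baseline) for order in permutations(range(len(x)))]
    return min(scores), max(scores)


def normalized_aopc(model, x, order, baseline=0.0):
    """Min-max rescaling of AOPC between the model- and input-specific limits."""
    lower, upper = aopc_limits_exact(model, x, baseline)
    if upper == lower:
        return 0.0  # assumption: every ordering scores the same, so return 0 by convention
    return (aopc(model, x, order, baseline) - lower) / (upper - lower)


# Hypothetical toy models over three features.
def linear_model(x):
    return 0.6 * x[0] + 0.3 * x[1] + 0.1 * x[2]


def interaction_model(x):
    return x[0] * x[1] + 0.1 * x[2]


x = [1.0, 1.0, 1.0]
order = [0, 1, 2]  # the same attribution ranking applied to both models

for name, model in [("linear", linear_model), ("interaction", interaction_model)]:
    lower, upper = aopc_limits_exact(model, x)
    print(name, "AOPC:", aopc(model, x, order), "limits:", (lower, upper),
          "NAOPC:", normalized_aopc(model, x, order))
```

In this toy setting the raw AOPC of the ranking is higher for the interaction model than for the linear model, yet the ranking reaches the upper limit only for the linear model, so the normalized scores order the two models the other way around. This mirrors, on a very small scale, the kind of ranking change described above.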
