Incorporating Attribution Importance for Improving Faithfulness Metrics

Zhixue Zhao; Nikolaos Aletras

TL;DR

This paper tackles the problem of faithfulness evaluation for feature attributions in NLP by showing that hard erasure can misrepresent token importance and degrade robustness to distribution shifts. It introduces Soft Normalized Sufficiency (Soft-NS) and Soft Normalized Comprehensiveness (Soft-NC), which perturb inputs at the embedding level using FA-guided dropout rather than full token deletion. The authors formalize these metrics, provide a comprehensive experimental study across SST, AG, Ev.Inf., and MultiRC using multiple FAs, and demonstrate that Soft-NS and Soft-NC yield higher diagnosticity and more faithful explanations than traditional NS/NC and hard perturbations. The work advances practical faithfulness evaluation and offers code for reproducibility, with implications for more reliable explanation methods in real-world NLP tasks.

Abstract

Feature attribution methods (FAs) are popular approaches for providing insights into the reasoning process behind a model's predictions. The more faithful an FA is, the more accurately it reflects which parts of the input are more important for the prediction. Widely used faithfulness metrics, such as sufficiency and comprehensiveness, use a hard erasure criterion, i.e., entirely removing or retaining the most important tokens as ranked by a given FA and observing the changes in predictive likelihood. However, this hard criterion ignores the importance of each individual token, treating them all equally when computing sufficiency and comprehensiveness. In this paper, we propose a simple yet effective soft erasure criterion. Instead of entirely removing or retaining tokens from the input, we randomly mask parts of the token vector representations in proportion to their FA importance. Extensive experiments across various natural language processing tasks and different FAs show that our soft-sufficiency and soft-comprehensiveness metrics consistently prefer more faithful explanations compared to hard sufficiency and comprehensiveness. Our code: https://github.com/casszhao/SoftFaith
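
The contrast between the two criteria described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the repository above for that); it is a pure-Python illustration under stated assumptions. The function names `hard_mask`, `normalize`, and `soft_perturb` are hypothetical, min-max scaling of the FA scores to [0, 1] is an assumed normalization, and masking is assumed to be an independent Bernoulli draw per embedding dimension.

```python
import random


def hard_mask(scores, k, keep_top=True):
    """Hard erasure: one 0/1 flag per token.

    keep_top=True  retains only the k highest-scoring tokens (sufficiency-style).
    keep_top=False removes the k highest-scoring tokens (comprehensiveness-style).
    Every selected token is treated identically, whatever its score.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = set(order[:k])
    return [int((i in top) == keep_top) for i in range(len(scores))]


def normalize(scores):
    """Min-max scale FA scores to [0, 1] (an assumed normalization)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Arbitrary fallback for constant scores: treat all tokens alike.
        return [0.5] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def soft_perturb(embeddings, scores, retain=True, rng=None):
    """Soft erasure: zero out individual embedding dimensions at random.

    retain=True  keeps each dimension of token i with probability a_i
                 (sufficiency-style: important tokens survive mostly intact).
    retain=False keeps each dimension with probability 1 - a_i
                 (comprehensiveness-style: important tokens are mostly erased).
    """
    rng = rng or random.Random()
    probs = normalize(scores)
    perturbed = []
    for vec, a in zip(embeddings, probs):
        keep_p = a if retain else 1.0 - a
        perturbed.append([x if rng.random() < keep_p else 0.0 for x in vec])
    return perturbed


# Toy input: three tokens with 4-dimensional embeddings and FA scores.
toy_embeddings = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], [9.0, 1.0, 2.0, 3.0]]
toy_scores = [0.1, 0.9, 0.5]

hard_keep = hard_mask(toy_scores, k=1, keep_top=True)
hard_drop = hard_mask(toy_scores, k=1, keep_top=False)
soft_keep = soft_perturb(toy_embeddings, toy_scores, retain=True, rng=random.Random(0))
soft_drop = soft_perturb(toy_embeddings, toy_scores, retain=False, rng=random.Random(0))
```

In this sketch the hard masks depend only on the ranking of the scores, whereas the soft perturbation uses their magnitudes: the lowest-scoring token is fully zeroed and the highest-scoring token fully kept when `retain=True`, and a mid-scoring token is partially masked. The perturbed embeddings would then be fed to the model in place of the original ones to measure the change in predictive likelihood.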

Paper Structure (35 sections, 7 equations, 3 figures, 6 tables)

Introduction
Related Work
Feature Attribution Methods
Measuring Faithfulness
Evaluating Faithfulness Metrics
Faithfulness Evaluation Metrics
Sufficiency and Comprehensiveness
Normalized Sufficiency (NS):
Normalized Comprehensiveness (NC):
Soft Normalized Sufficiency and Comprehensiveness
Soft Input Perturbation:
Soft Normalized Sufficiency (Soft-NS):
Soft Normalized Comprehensiveness (Soft-NC):
Experimental Setup
Tasks
...and 20 more sections

Figures (3)

Figure 1: Hard and soft erasure criteria for comprehensiveness and sufficiency for two toy feature attribution (FA) methods A and B.
Figure 2: The impact of rationale length on normalized comprehensiveness (NC) and sufficiency (NS). Each line represents a FA.
Figure 3: The impact of rationale length (shown in ratio) on Diagnosticity scores.
