On Pitfalls of $\textit{RemOve-And-Retrain}$: Data Processing Inequality Perspective

Junhwa Song, Keumgang Cha, Junghoon Seo

TL;DR

This paper challenges the reliability of ROAR and its variant ROAD as universal benchmarks for feature-importance explanations. By framing attribution informativeness through a causal model of data generation and applying a conditional data processing inequality (DPI), the authors show that model-agnostic post-processing of an attribution can improve ROAR scores without adding any information about the decision function. They argue theoretically, via the DPI, and show empirically on CIFAR-10, SVHN, and CUB-200 that blur-inducing post-processing and block-based masking can improve ROAR scores even when the post-processed attribution carries less information about the model. The work cautions researchers about biases arising from data structure and calls for broader, more robust perturbation-based evaluation frameworks beyond ROAR.

Abstract

Approaches for appraising feature importance approximations, alternatively referred to as attribution methods, have been established across an extensive array of contexts. The development of resilient techniques for performance benchmarking constitutes a critical concern in the sphere of explainable deep learning. This study scrutinizes the dependability of the RemOve-And-Retrain (ROAR) procedure, which is prevalently employed for gauging the performance of feature importance estimates. The insights gleaned from our theoretical foundation and empirical investigations reveal that attributions containing less information about the decision function may yield superior results in ROAR benchmarks, contradicting the original intent of ROAR. This occurrence is similarly observed in the recently introduced variant RemOve-And-Debias (ROAD), and we posit a persistent pattern of blurriness bias in ROAR attribution metrics. Our findings serve as a warning against indiscriminate use of ROAR metrics.

Paper Structure (29 sections, 2 theorems, 17 equations, 10 figures, 1 table, 1 algorithm)

Key Result

Theorem 3.1

Assume $\widetilde{A}=k(A,U)$ where $U \perp (X,Y,Z,A)$ and $k$ uses $(A,U)$ only. Then, for the causal graph in Figure 1, Moreover, for any (possibly randomized) measurable mapping $\psi$, and in particular,
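
For orientation, the textbook data processing inequality that the theorem's title refers to can be written as below. This is the standard form, not the statement of Theorem 3.1 itself; the symbols $S$, $T$, $V$, $W$ are placeholders chosen here and do not appear in the paper. The connection to the theorem's assumption is that $\widetilde{A}=k(A,U)$ with $U \perp (X,Y,Z,A)$ makes $\widetilde{A}$ conditionally independent of $(X,Y,Z)$ given $A$, which is the kind of Markov structure the inequality requires.

```latex
% Textbook data processing inequality (placeholder symbols S, T, V, W).
% If S -> T -> V is a Markov chain, i.e. S \perp V \mid T, then
\[
  I(S;V) \;\le\; I(S;T).
\]
% Conditional form: if S \perp V \mid (T, W), then
\[
  I(S;V \mid W) \;\le\; I(S;T \mid W),
\]
% which follows from expanding I(S;T,V \mid W) with the chain rule in two ways:
\[
  I(S;T \mid W) + \underbrace{I(S;V \mid T,W)}_{=0}
  \;=\; I(S;V \mid W) + \underbrace{I(S;T \mid V,W)}_{\ge 0}.
\]
```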

Figures (10)

  • Figure 1: A structural causal model for the generation of modified data variables.
  • Figure 2: Model/data-agnostic attribution post-processings. The term 'Plain' indicates no post-processing, i.e., $k(a)=a$. The leftmost image represents the original input image $x$. The feature importance measure used in this illustration is input-gradient.
  • Figure 3: Examples of PixelRandom and BlockRandom: (a) input image, (b) PixelRandom, and (c) BlockRandom.
  • Figure 4: The effect of Gaussian filtering and max-pooling on the ROAR metric. The labels 'P', 'M', and 'G' refer to 'Plain', 'Max-pooling', and 'Gaussian filter', respectively. The numbers at the top of each column indicate the attribution drop rate. For ease of interpretation, the results of max-pooling and Gaussian filtering are expressed as the difference from the plain method. A decrease in model accuracy is indicated by a magenta '$-$' sign, while an increase is represented by a cyan '$+$' sign. For optimal viewing, it is recommended to zoom in.
  • Figure 5: The effect of Gaussian filtering and max-pooling on the ROAD metric. The conventions used in this figure are the same as those in Figure 4, except that this figure relates to the ROAD benchmark instead of the ROAR benchmark.
  • ...and 5 more figures
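
The post-processings named in the captions above ('Plain', 'Max-pooling', 'Gaussian filter') and the removal step controlled by the attribution drop rate can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the kernel size, `sigma`, the fill value, and the function names `max_pool_same`, `gaussian_filter_same`, `removal_mask`, and `remove_features` are all assumptions made for the example, and a random array stands in for an input-gradient map.

```python
import numpy as np


def max_pool_same(a, size=3):
    """Stride-1 max-pooling that keeps the map size (edge padding)."""
    pad = size // 2
    padded = np.pad(a, pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (size, size))
    return windows.max(axis=(-2, -1))


def gaussian_filter_same(a, sigma=1.0, radius=2):
    """Separable Gaussian blur that keeps the map size (edge padding)."""
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()
    padded = np.pad(a, radius, mode="edge")
    # Blur along rows, then along columns; 'valid' trims the padding off again.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")


def removal_mask(a, drop_rate):
    """Boolean mask that is True for the top `drop_rate` fraction of values."""
    k = int(round(drop_rate * a.size))
    mask = np.zeros(a.size, dtype=bool)
    if k > 0:
        mask[np.argsort(a, axis=None)[-k:]] = True
    return mask.reshape(a.shape)


def remove_features(x, mask, fill):
    """Replace masked pixels of x with a constant fill value (assumed choice)."""
    return np.where(mask, fill, x)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.random((32, 32))                   # stand-in grayscale image
    a = np.abs(rng.standard_normal((32, 32)))  # stand-in attribution map
    post = {"P": lambda m: m, "M": max_pool_same, "G": gaussian_filter_same}
    for label, k in post.items():
        mask = removal_mask(k(a), drop_rate=0.3)
        x_removed = remove_features(x, mask, fill=x.mean())
        print(label, int(mask.sum()), x_removed.shape)
```

Each post-processing here depends only on the attribution map, never on the model or the image, which matches the model/data-agnostic setting of Figure 2. In a ROAR-style evaluation, the modified images would then be used to retrain the model at each drop rate; that retraining step is omitted from this sketch.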

Theorems & Definitions (5)

  • Theorem 3.1: Conditional data processing for agnostic post-processing
  • proof
  • Theorem 3.2: ROAR can be improved while destroying model/explainer information
  • proof
  • Conjecture 3.3