Illuminating Salient Contributions in Neuron Activation with Attribution Equilibrium

Woo-Jeoung Nam, Seong-Whan Lee

TL;DR

This paper introduces Attribution Equilibrium, a novel method to decompose output predictions into fine-grained attributions, balancing positive and negative relevance for clearer visualization of the evidence behind a network decision.

Abstract

With the remarkable success of deep neural networks, there is a growing interest in research aimed at providing clear interpretations of their decision-making processes. In this paper, we introduce Attribution Equilibrium, a novel method to decompose output predictions into fine-grained attributions, balancing positive and negative relevance for clearer visualization of the evidence behind a network decision. We carefully analyze conventional approaches to decision explanation and present a different perspective on the conservation of evidence. We define the evidence as a gap between positive and negative influences among gradient-derived initial contribution maps. Then, we incorporate antagonistic elements and a user-defined criterion for the degree of positive attribution during propagation. Additionally, we consider the role of inactivated neurons in the propagation rule, thereby enhancing the discernment of less relevant elements such as the background. We conduct various assessments in a verified experimental environment with PASCAL VOC 2007, MS COCO 2014, and ImageNet datasets. The results demonstrate that our method outperforms existing attribution methods both qualitatively and quantitatively in identifying the key input features that influence model decisions.
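As a rough illustration of the phrase "evidence as a gap between positive and negative influences", the sketch below splits a gradient-derived contribution map into its positive and negative parts and reports the difference. It is a toy under stated assumptions, not the paper's propagation rule: the model is a hypothetical bias-free linear score, "gradient-derived" is read as gradient times input, and the names `contribution_map` and `evidence_gap` are invented for this example.

```python
def contribution_map(weights, inputs):
    """Gradient x input for a linear score y = sum(w_i * x_i).

    For this toy model the gradient dy/dx_i is simply w_i (an assumption
    made for illustration; the paper's initial maps may be built differently).
    """
    return [w * x for w, x in zip(weights, inputs)]


def evidence_gap(contributions):
    """Split contributions by sign and return (positive, negative, gap)."""
    positive = sum(c for c in contributions if c > 0)   # supporting influence
    negative = sum(-c for c in contributions if c < 0)  # opposing influence, as a magnitude
    return positive, negative, positive - negative


# Hypothetical weights and inputs, chosen only to keep the arithmetic exact.
weights = [0.5, -1.0, 2.0, 0.25]
inputs = [2.0, 1.5, 1.0, -4.0]

contribs = contribution_map(weights, inputs)
pos, neg, gap = evidence_gap(contribs)
print(contribs, pos, neg, gap)
```

In this linear toy the gap equals the model score itself, which is one simple reading of "conservation of evidence"; the paper's method additionally handles antagonistic elements, a user-defined criterion, and inactivated neurons, none of which are modeled here.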

Paper Structure (20 sections, 14 equations, 11 figures, 3 tables, 1 algorithm)

Figures (11)

  • Figure 1: Our method aims to provide sophisticated visualizations of the significant input features related to the network decision, effectively addressing challenges related to detail deficiency, class-discriminability, and clarity of object representation.
  • Figure 2: Images in the first row show the input, the initial attribution $R$, and the neurons to which relevance is assigned, respectively. The second row shows the issue that arises when relevance is propagated according to the actual contributed value. The third row shows our view of influence on propagation and the criterion of evidence that overwhelms the negative influence. A detailed explanation is given in Section 3.2.
  • Figure 3: Difference between channel attributions of intermediate layers with and without considering activation properties. The first row shows global input features of intermediate layers. Each row below visualizes the channel-wise sum of attributions for each variant.
  • Figure 4: Comparison of conventional and proposed attribution methods applied to VGG-16 on the PASCAL VOC dataset. Class names on the left represent the predicted labels of the input image. The upper and lower groups show attributions for predictions on single- and multi-label images, respectively. Red and blue represent positive and negative values, respectively.
  • Figure 5: Applications of our method to various models (AlexNet, VGG-16, ResNet-50, Inception-V3, and DenseNet-121) on the ImageNet validation dataset.
  • ...and 6 more figures