Investigating the influence of noise and distractors on the interpretation of neural networks

Pieter-Jan Kindermans, Kristof Schütt, Klaus-Robert Müller, Sven Dähne

TL;DR

The paper investigates how noise and distractors affect neural-network explanations and argues that existing gradient-based methods may be unreliable in noisy settings. It formalizes a generative model $\boldsymbol{x} = \boldsymbol{a}_{t}s_{t} + A_{n}\boldsymbol{s}_{n}^{T} + \boldsymbol{\epsilon}$ and shows that a linear weight vector $\boldsymbol{w}$ must satisfy $\boldsymbol{w}^{T}\boldsymbol{a}_{t}=1$ and $\boldsymbol{w}^{T}A_{n}=0$ to recover $s_{t}$, which highlights the importance of ignoring task-irrelevant directions. Through the lens of deep Taylor decomposition, it analyzes root-point choices and existing rules (e.g., the $z$-rule, $w^2$-rule, and $a$-rule) and proposes two new robust rules ($w^+$ and $a^+$) that align explanations with task-related variation. Empirical results on MNIST with an MLP show how different rules allocate relevance under noise, underscoring the need for principled rule selection and broader benchmarks for explanation methods.
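The two linear conditions can be checked numerically. The following is a minimal NumPy sketch, not the authors' code: it assumes the noise-free case ($\boldsymbol{\epsilon} = 0$), random Gaussian directions, and that the stacked matrix $[\boldsymbol{a}_{t}, A_{n}]$ has full column rank. The helper name `linear_extractor`, the dimensions `d`, `k`, `n`, and the use of a pseudo-inverse to construct $\boldsymbol{w}$ are illustrative assumptions.

```python
import numpy as np


def linear_extractor(a_t, A_n):
    """Return w with w^T a_t = 1 and w^T A_n = 0.

    Hypothetical helper: assumes [a_t, A_n] has full column rank, so the
    first row of its pseudo-inverse satisfies both conditions.
    """
    M = np.column_stack([a_t, A_n])   # shape (d, 1 + k)
    return np.linalg.pinv(M)[0]       # shape (d,)


rng = np.random.default_rng(0)
d, k, n = 10, 3, 500                  # input dim, distractor count, samples (assumed)

a_t = rng.normal(size=d)              # task-related direction
A_n = rng.normal(size=(d, k))         # distractor directions
s_t = rng.normal(size=n)              # task signal to recover
S_n = rng.normal(size=(k, n))         # distractor signals

# Noise-free instance of the generative model: x = a_t s_t + A_n s_n
X = np.outer(a_t, s_t) + A_n @ S_n    # shape (d, n)

w = linear_extractor(a_t, A_n)
s_hat = w @ X                         # w^T x for every sample

# Using the signal direction itself as the weight vector does not cancel
# the distractors, so it generally fails to recover s_t.
s_naive = (a_t / (a_t @ a_t)) @ X
```

In this setup `s_hat` matches `s_t` up to floating-point error, while `s_naive` is contaminated by the distractor signals. The weight vector `w` is generally not parallel to `a_t`, which is why reading the weights directly as an explanation can be misleading.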

Abstract

Understanding neural networks is becoming increasingly important. Over the last few years, different types of visualisation and explanation methods have been proposed. However, none of them explicitly considered the behaviour in the presence of noise and distracting elements. In this work, we will show how noise and distracting dimensions can influence the result of an explanation model. This gives new theoretical insights to aid the selection of the most appropriate explanation model within the deep-Taylor decomposition framework.

Paper Structure

This paper contains 8 sections, 6 equations, 2 figures.

Figures (2)

  • Figure 1: Comparison of the behaviour of various explanation methods under the influence of noise. See text for details.
  • Figure 2: Comparison of heatmaps on different MNIST digits with the noise level set to 0.2. See text for details.