Visualizing Deep Neural Network Decisions: Prediction Difference Analysis

Luisa M Zintgraf, Taco S Cohen, Tameem Adel, Max Welling

TL;DR

The paper tackles the challenge of interpreting deep neural network decisions for image classification, including medical imaging. It introduces Prediction Difference Analysis with three key enhancements: conditional sampling, multivariate patch analysis, and deep visualization of hidden layers. These improvements yield image-specific, evidence-for/evidence-against explanations that are more faithful than prior saliency methods or weight-based approaches, demonstrated on ImageNet models and HIV-related MRI data. While computationally intensive, the authors argue that precomputation and GPU acceleration can enable practical, interactive 3D visualizations with potential clinical impact.

Abstract

This article presents the prediction difference analysis method for visualizing the response of a deep neural network to a specific input. When classifying images, the method highlights areas in a given input image that provide evidence for or against a certain class. It overcomes several shortcomings of previous methods and provides great additional insight into the decision-making process of classifiers. Making neural network decisions interpretable through visualization is important both to improve models and to accelerate the adoption of black-box classifiers in application areas such as medicine. We illustrate the method in experiments on natural images (ImageNet data), as well as medical images (MRI brain scans).

Paper Structure

This paper contains 12 sections, 5 equations, 13 figures, 1 algorithm.

Figures (13)

  • Figure 1: Example of our visualization method, explaining why the DCNN (GoogLeNet) predicts "cockatoo". Shown is the evidence for (red) and against (blue) the prediction. We see that the facial features of the cockatoo are most supportive of the decision, and parts of the body seem to constitute evidence against it. In fact, the classifier most likely considers them evidence for the second-highest scoring class, white wolf.
  • Figure 2: Simple illustration of the sampling procedure in Algorithm 1. Given the input image $x$, we select every possible patch $x_w$ (in a sliding-window fashion) of size $k\times k$ and place a larger patch $\hat{x}_w$ of size $l\times l$ around it. We can then conditionally sample $x_w$ by conditioning on the surrounding patch $\hat{x}_w$.
  • Figure 3: Visualization of the effects of marginal versus conditional sampling using the GoogLeNet classifier. The classifier makes correct predictions (ostrich and saxophone), and we show the evidence for (red) and against (blue) this decision at the output layer. We can see that conditional sampling gives more targeted explanations than marginal sampling. Also, marginal sampling assigns too much importance to pixels that are easily predictable conditioned on their neighboring pixels.
  • Figure 4: Visualization of how different window sizes influence the visualization result. We used the conditional sampling method and the AlexNet classifier with $l=k+4$ and varying $k$. We can see that even when removing single pixels ($k=1$), this has a noticeable effect on the classifier and more important pixels get a higher score. By increasing the window size we can get a more easily interpretable, smooth result until the image gets blurry for very large window sizes.
  • Figure 5: Visualization of feature maps from three different layers of the GoogLeNet (left to right: "conv1/7x7_s2", "inception_3a/output", "inception_5b/output"), using conditional sampling and patch sizes $k=10$ and $l=14$ (see Algorithm 1). For each feature map in the convolutional layer, we first evaluate the relevance for every single unit, and then average the results over all the units in one feature map to get a sense of what the unit is doing as a whole. Red pixels activate a unit, blue pixels decrease the activation.
  • ...and 8 more figures
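
The sliding-window procedure described in the Figure 2 caption can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: `prediction_difference`, `predict_proba`, and `toy_predict` are hypothetical names, the change in prediction is scored here as a difference of log-odds, and a Gaussian fitted to the pixels surrounding each window is used as a crude stand-in for a proper conditional sampling model.

```python
import numpy as np


def log_odds(p, eps=1e-6):
    """Log2-odds of a probability, clipped away from 0 and 1."""
    p = np.clip(p, eps, 1.0 - eps)
    return np.log2(p / (1.0 - p))


def prediction_difference(image, predict_proba, k=2, l=4, num_samples=10, seed=0):
    """Per-pixel relevance map for a 2-D image (illustrative sketch).

    predict_proba: hypothetical callable, image -> probability of one class.
    k: side length of the window x_w that is resampled.
    l: side length of the surrounding patch used to choose replacements (l >= k).
    """
    rng = np.random.default_rng(seed)
    height, width = image.shape
    pad = (l - k) // 2
    base = log_odds(predict_proba(image))
    relevance = np.zeros((height, width))
    counts = np.zeros((height, width))

    for top in range(height - k + 1):
        for left in range(width - k + 1):
            # Surrounding l x l patch, clipped at the image border.
            r0, r1 = max(top - pad, 0), min(top + k + pad, height)
            c0, c1 = max(left - pad, 0), min(left + k + pad, width)
            context = np.ones((r1 - r0, c1 - c0), dtype=bool)
            context[top - r0:top - r0 + k, left - c0:left - c0 + k] = False
            neighbours = image[r0:r1, c0:c1][context]
            # Assumption: replacement values come from a Gaussian fitted to the
            # neighbours; fall back to whole-image statistics if there are none.
            source = neighbours if neighbours.size else image
            mean, std = source.mean(), source.std()

            probs = []
            for _ in range(num_samples):
                perturbed = image.copy()
                perturbed[top:top + k, left:left + k] = rng.normal(mean, std, size=(k, k))
                probs.append(predict_proba(perturbed))

            # Positive values: the window supported the class; negative: it opposed it.
            diff = base - log_odds(np.mean(probs))
            relevance[top:top + k, left:left + k] += diff
            counts[top:top + k, left:left + k] += 1

    # Each pixel receives the average score of all windows that covered it.
    return relevance / counts


def toy_predict(img):
    """Toy stand-in for a classifier: responds to brightness in a 3 x 3 region."""
    return 1.0 / (1.0 + np.exp(-4.0 * (img[2:5, 2:5].mean() - 0.5)))


image = np.zeros((8, 8))
image[2:5, 2:5] = 1.0
relevance = prediction_difference(image, toy_predict, k=2, l=4)
```

In this toy setup the resulting `relevance` map is positive over the bright region that the toy classifier depends on and zero for pixels whose windows never touch it. A real use would replace `toy_predict` with a trained network's class probability and, for color images, extend the window across channels.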