Integrated feature analysis for deep learning interpretation and class activation maps

Yanli Li; Tahereh Hassanzadeh; Denis P. Shamonin; Monique Reijnierse; Annette H. M. van der Helm-van Mil; Berend C. Stoel

TL;DR

The paper proposes an integrated feature analysis method, consisting of feature distribution analysis and feature decomposition, that looks more closely at the intermediate features extracted by DL models and provides the distribution information needed to form a common intensity scale, which current CAM algorithms lack.

Abstract

Understanding the decisions of deep learning (DL) models is essential for the acceptance of DL in risk-sensitive applications. Although methods such as class activation maps (CAMs) give a glimpse into the black box, they miss crucial information, which limits their interpretability: they merely indicate the object locations the model considered. To provide more insight into the models and the influence of datasets, we propose an integrated feature analysis method, consisting of feature distribution analysis and feature decomposition, to look more closely at the intermediate features extracted by DL models. This integrated feature analysis can provide information on overfitting, confounders, outliers in datasets, model redundancies and the principal features extracted by the models. It also provides distribution information to form a common intensity scale. All of this is missing in current CAM algorithms. The integrated feature analysis was applied to eight different datasets for general validation: photographs of handwritten digits, two datasets of natural images and five medical datasets, including skin photography, ultrasound, CT, X-rays and MRIs. The method was evaluated by calculating the consistency between the CAMs' average class activation levels and the logits of the model. Across the eight datasets, the correlation coefficients obtained with our method were all very close to 100%. Based on the feature decomposition, 5%-25% of the features could generate equally informative saliency maps and yield the same model performance as using all features. This demonstrates the reliability of the integrated feature analysis. As the proposed methods rely on very few assumptions, this is a step towards better model interpretation and a useful extension to existing CAM algorithms. Code: https://github.com/YanliLi27/IFA
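
The consistency evaluation described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy data and the choice of Pearson correlation (via numpy.corrcoef) are all hypothetical, since the abstract only mentions "correlation coefficients". The sketch contrasts CAMs kept on a shared scale with CAMs normalized per image.

```python
import numpy as np

def mean_class_activation(cams):
    """Average activation per CAM; cams has shape (n_images, H, W)."""
    cams = np.asarray(cams, dtype=float)
    return cams.reshape(cams.shape[0], -1).mean(axis=1)

def consistency(cams, logits):
    """Pearson correlation between per-image mean activation and logits.

    Assumption: the paper's "consistency" is approximated here by
    Pearson's r; the abstract does not name the coefficient.
    """
    act = mean_class_activation(cams)
    logits = np.asarray(logits, dtype=float)
    return float(np.corrcoef(act, logits)[0, 1])

def per_image_scaling(cams):
    """Discard negatives, then divide each CAM by its own maximum."""
    cams = np.maximum(np.asarray(cams, dtype=float), 0.0)
    peak = cams.max(axis=(1, 2), keepdims=True)
    # Leave all-zero maps at zero instead of dividing by zero.
    return np.divide(cams, peak, out=np.zeros_like(cams), where=peak > 0)

# Toy data (hypothetical): activation maps whose level tracks the logit.
rng = np.random.default_rng(0)
logits = rng.normal(size=50)
raw_cams = logits[:, None, None] + 0.1 * rng.random((50, 7, 7))

print("shared scale   :", consistency(raw_cams, logits))
print("per-image scale:", consistency(per_image_scaling(raw_cams), logits))
```

On such toy data, the per-image normalization removes most of the level information, so its correlation with the logits is lower than that of the maps kept on a shared scale.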

Paper Structure (20 sections, 5 equations, 11 figures, 4 tables)

Introduction
Method
Distribution analysis
Feature decomposition
Evaluation
Object localization
Consistency with the model’s logits
Accuracy changes after feature masking
Materials
Datasets and models
CAMs involved
Experiments and results
Visual checks of the S-CAMs and FS-S-CAMs
Quantitative evaluation: the common intensity scaling
Quantitative evaluation: the selected principal features
...and 5 more sections

Figures (11)

Figure 1: Summary of the workflow, terms and formula involved in generating CAMs. The flowchart at the top shows the general DL workflow: (1) passing the input to the feature extraction module to get the features, (2) feeding the features to the classifier to get the logits, (3) applying an activation function such as softmax to get the confidence, and (4) deriving the output class from the confidence and comparing it with the true class. The formula shows how CAM algorithms obtain the features, the source of the weights for the CAM calculation, the size used for spatial scaling, and the two-step intensity scaling that discards the negative values and normalizes each map based on the input itself. The table at the bottom summarizes the weight calculations and the chosen features of the existing CAM methods. Red boxes highlight where information is lost or limited in current algorithms. Underlined terms are used throughout the following text, including input, features, logits, confidence, output class, true class, the selected class for CAM generation, class activation (the class activation level of a CAM) and weighted features. In existing class activation mapping algorithms, the selected class is set by default to the output class.
Figure 2: Consequences of omitting scale information. (a) CAMs fail to match the true class. In the CAMs for the selected class ‘dog’, an image of a ‘cat’ receives higher class activations than a correctly classified image of a ‘dog’. In addition, unknown inputs still receive considerable class activations in the CAMs of a model that can distinguish cats and dogs with 99% accuracy; (b) CAMs fail to match the confidence of the models trained for an MRI-based classification task and for ImageNet.
Figure 3: Process of integrated feature analysis that collects information about the distribution of features across the dataset and computes the importance matrix for feature decomposition. The x-axis in the importance matrix represents different classes, while the y-axis represents different features at the selected level of a model. The box presents some applications of the importance matrix, including detecting overfitting, determining principal features, finding potential confounders, locating special cases, and evaluating model redundancies by analyzing the class activations in the importance matrix. Further details are provided in the Discussion section.
Figure 4: CAM generation with integrated feature analysis: (a) distribution information used to improve the intensity scaling. After the intensity scaling, the cat in the image receives a lower contribution than the background, which is intuitive for a cat-dog classification model; (b) importance matrix used to obtain CAMs for specific purposes. The bottom images present CAMs generated to exclude irrelevant features and to select principal features.
Figure 5: Examples from the image datasets used. From the top left to the bottom right: a natural image from the 1000-class dataset, a cat image from the two-class Cats&Dogs dataset, a digit, an MRI, a CT, an X-ray, an ultrasound image and a skin photograph.
...and 6 more figures
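
The captions of Figures 1, 3 and 4 describe three ideas that a short sketch may make more concrete: the conventional two-step intensity scaling (discard negatives, normalize by the map itself), a scaling against a dataset-wide constant, and the selection of principal features from an importance matrix. The code below is a hypothetical illustration only. The function names, the constant scale_max and the "top fraction by importance" rule are assumptions that do not come from the paper, which is not quoted here in enough detail to reproduce its formulas.

```python
import numpy as np

def weighted_map(features, weights):
    """Weighted sum over channels: features (C, H, W), weights (C,) -> (H, W)."""
    return np.tensordot(weights, features, axes=1)

def per_image_cam(features, weights):
    """Two-step scaling as in the Figure 1 caption:
    drop negative values, then normalize by the map's own maximum."""
    cam = np.maximum(weighted_map(features, weights), 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam

def common_scale_cam(features, weights, scale_max):
    """Assumed variant: divide by one dataset-wide constant (scale_max)
    so that maps from different images stay comparable."""
    return weighted_map(features, weights) / scale_max

def top_features(importance, class_idx, fraction):
    """Indices of the most important features for one class.

    importance: (n_features, n_classes), features along the rows as in
    the Figure 3 caption. The selection rule is an assumption.
    """
    k = max(1, int(round(fraction * importance.shape[0])))
    return np.argsort(importance[:, class_idx])[::-1][:k]

# Toy example (hypothetical values): a strongly and a weakly activated image.
weights = np.array([1.0, 3.0])
strong = np.ones((2, 4, 4))
weak = 0.1 * strong

print(per_image_cam(strong, weights).max(), per_image_cam(weak, weights).max())
print(common_scale_cam(strong, weights, 4.0).max(),
      common_scale_cam(weak, weights, 4.0).max())
```

With per-image normalization, both toy images peak at the same value, so the difference in activation strength is lost. With the shared constant, the weak image keeps a proportionally lower peak.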
