A Taxonomy and Library for Visualizing Learned Features in Convolutional Neural Networks
Felix Grün, Christian Rupprecht, Nassir Navab, Federico Tombari
TL;DR
To address the interpretability gap in CNNs, the paper proposes a unifying taxonomy that partitions feature-visualization methods into three categories (Input Modification, Deconvolutional, and Input Reconstruction) and clarifies the goal and algorithm of each. It also introduces the FeatureVis library built on MatConvNet, which provides open-source implementations of methods from all three classes to facilitate experimentation and cross-method comparison. Through qualitative comparisons and architectural analyses, the work demonstrates how visualizations reveal learned intermediate representations and help explain differences in network performance. By standardizing terminology and providing tooling, the paper offers a practical foundation for interpretability research and future method development.
Abstract
Over the last decade, Convolutional Neural Networks (CNNs) have seen a tremendous surge in performance. However, understanding what a network has learned remains a challenging task. To remedy this unsatisfactory situation, a number of groups have recently proposed different methods to visualize the learned models. In this work we suggest a general taxonomy to classify and compare these methods, subdividing the literature into three main categories and providing researchers with a terminology on which to base their work. Furthermore, we introduce the FeatureVis library for MatConvNet: an extendable, easy-to-use, open-source library for visualizing CNNs. It contains implementations from each of the three main classes of visualization methods and serves as a useful tool for better understanding the features learned by intermediate layers, as well as for analyzing why a network might fail on certain examples.
