Understanding the Role of Pathways in a Deep Neural Network
Lei Lyu, Chen Pang, Jihua Wang
TL;DR
This paper introduces diffusion pathways to interpret deep CNNs by tracing how individual input pixels propagate through the network to reach key feature maps. It presents an algorithm that uses diffusion kernels rotated by 180 degrees, together with masking strategies, to extract pixel-level pathways and assemble portion-hot representations for a VGG-16 classifier trained on CIFAR, MNIST, and M2NIST. The findings show that the few largest pathways of each pixel tend to cross the feature maps that are important for classification, that pathway patterns are category-consistent in early layers and distinguishable across categories, and that diffusion pathways help explain adversarial attacks, occlusion, and object movement. Collectively, the pathway framework provides a fine-grained, part-based view of internal representations, offering practical insights for explainability and robustness in DNNs and potentially shedding light on BNN-like information processing.
Abstract
Deep neural networks have demonstrated superior performance in artificial intelligence applications, but the opacity of their inner workings is a major drawback in practice. The prevailing unit-based interpretation is a statistical observation of stimulus-response data, which does not reveal the detailed internal process by which a neural network operates. In this work, we analyze a convolutional neural network (CNN) trained on a classification task and present an algorithm that extracts the diffusion pathways of individual pixels in order to identify the locations of pixels in an input image that are associated with object classes. The pathways allow us to test the causal components that are important for classification, and the pathway-based representations are clearly distinguishable between categories. We find that the few largest pathways of an individual pixel tend to cross the feature maps in each layer that are important for classification. Moreover, the large pathways of images from the same category are more consistent in their trends than those of images from different categories. We also apply the pathways to understanding adversarial attacks, object completion, and movement perception. Further, the total number of pathways on feature maps in all layers can clearly discriminate between the original, deformed, and target samples.
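
To make the idea of a single pixel diffusing through a convolution layer more concrete, the following is a minimal sketch and not the authors' algorithm. It assumes one stride-1, zero-padded convolution layer with no bias, nonlinearity, pooling, or masking, and the names `cross_correlate_same` and `pixel_diffusion` are hypothetical. It illustrates a standard property that is consistent with the rotated diffusion kernels mentioned in the TL;DR: when a CNN layer applies a kernel by cross-correlation, the footprint that one input pixel leaves on the output feature map is that kernel rotated by 180 degrees, centered on the pixel.

```python
import numpy as np

def cross_correlate_same(image, kernel):
    """Zero-padded, stride-1 2D cross-correlation (what a CNN 'convolution' computes)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))  # zero padding keeps the output size
    out = np.zeros(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def pixel_diffusion(shape, pixel, value, kernel):
    """Output feature map produced by a single input pixel (all other pixels set to zero)."""
    image = np.zeros(shape)
    image[pixel] = value
    return cross_correlate_same(image, kernel)

# Hypothetical 3x3 kernel with distinct entries so the rotation is visible.
kernel = np.arange(1.0, 10.0).reshape(3, 3)

# Diffuse the center pixel of a 5x5 input through the layer.
footprint = pixel_diffusion((5, 5), (2, 2), 1.0, kernel)

# The 3x3 neighborhood around the pixel holds the kernel rotated by 180 degrees.
patch = footprint[1:4, 1:4]
print(patch)
print(np.rot90(kernel, 2))
```

Because this simplified layer is linear, the feature map of a whole image is the sum of the per-pixel footprints, which is what makes it possible to attribute parts of a feature map to individual pixels in this toy setting.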
