Exploring specialization and sensitivity of convolutional neural networks in the context of simultaneous image augmentations
Pavel Kharyuk, Sergey Matveev, Ivan Oseledets
TL;DR
This work introduces a framework for analyzing how the internal activations of a CNN respond when several input augmentations are applied simultaneously. Sobol indices and Shapley values decompose the activation variance into contributions from individual augmentations and their interactions. The study combines full-scale sensitivity analysis with guided masking and single-channel segment studies on AlexNet, VGG11, and ResNet18, using ILSVRC and Places365. It reports depth-dependent specialization and consistently large effects from grayscale, erasing, and hue distortions. Activation maps, correlation patterns, and linear discriminant analyses support the sensitivity findings, and targeted masking is used to probe prediction biases. The framework could potentially extend to studies of biological neural networks, and it may inform work on the robustness, fault tolerance, and interpretability of deep CNNs.
Abstract
Drawing parallels with the way biological networks are studied, we adapt the treatment-control paradigm to explainable artificial intelligence research and enrich it with multi-parametric input alterations. We propose a framework for investigating how input data augmentations affect a network's internal inference. Changes in network operation show up as changes in activations, which we measure by variance. This variance can be decomposed into components attributed to each augmentation using Sobol indices and Shapley values. These quantities make it possible to visualize sensitivity to different variables and to use them for guided masking of activations. In addition, we introduce a single-class sensitivity analysis in which candidates are filtered by how well they match the prediction bias produced by targeted damage to the activations. Given the observed parallels, we expect that the framework can potentially be transferred to the study of biological neural networks in complex environments.
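
To make the variance decomposition concrete, the following is a minimal sketch, not the authors' implementation. It assumes each augmentation is an independent on/off switch with probability 1/2, that a scalar activation summary is available for every on/off combination (a full factorial design), and that the total variance is nonzero. The factor names, the toy response `y`, and the helper functions `closed_variances` and `sensitivity_indices` are hypothetical and chosen only for illustration. For a subset S of factors, the code computes Var(E[y | x_S]) and derives first-order Sobol indices, total Sobol indices, and Shapley values normalized by the total variance.

```python
import itertools
import math

import numpy as np

# Hypothetical augmentation names, used only for illustration.
FACTORS = ["grayscale", "erasing", "hue"]


def closed_variances(y):
    """Return {subset S: Var(E[y | x_S])} for a full on/off factorial design.

    `y` has shape (2,) * d: one scalar activation summary per combination of
    d independent augmentations, each switched on or off with probability 1/2.
    """
    d = y.ndim
    out = {}
    for r in range(d + 1):
        for subset in itertools.combinations(range(d), r):
            rest = tuple(i for i in range(d) if i not in subset)
            # Average out the factors not in the subset, then take the variance.
            cond_mean = y.mean(axis=rest) if rest else y
            out[frozenset(subset)] = float(np.var(cond_mean))
    return out


def sensitivity_indices(y):
    """First-order Sobol, total Sobol, and normalized Shapley values per factor."""
    d = y.ndim
    v = closed_variances(y)
    everything = frozenset(range(d))
    total_var = v[everything]  # assumed nonzero
    first, total, shapley = [], [], []
    for i in range(d):
        others = everything - {i}
        first.append(v[frozenset({i})] / total_var)
        total.append(1.0 - v[others] / total_var)
        # Shapley value with the explained variance as the value function.
        phi = 0.0
        for r in range(d):
            weight = math.factorial(r) * math.factorial(d - r - 1) / math.factorial(d)
            for subset in itertools.combinations(sorted(others), r):
                s = frozenset(subset)
                phi += weight * (v[s | {i}] - v[s])
        shapley.append(phi / total_var)
    return first, total, shapley


# Toy stand-in for a channel's mean activation (not data from the paper):
# hue has no effect, while grayscale and erasing interact.
g, e, h = np.indices((2, 2, 2)).astype(float)
y = 2.0 * g + 1.0 * e + 1.0 * g * e

first, total, shapley = sensitivity_indices(y)
for name, s1, st, sh in zip(FACTORS, first, total, shapley):
    print(f"{name:10s} first-order={s1:.3f} total={st:.3f} shapley={sh:.3f}")
```

In this toy setting the normalized Shapley values sum to one, and each lies between the corresponding first-order and total Sobol index, because the interaction variance is split evenly among the interacting factors. With continuous augmentation parameters or many factors, exhaustive enumeration is no longer practical and a sampling-based estimator would be needed instead.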
