Discovering Influential Neuron Path in Vision Transformers

Yifan Wang, Yifei Liu, Yingdong Shi, Changming Li, Anqi Pang, Sibei Yang, Jingyi Yu, Kan Ren

TL;DR

Vision Transformers are powerful but opaque; the paper introduces a Neuron Path framework to interpret internal information flow by identifying influential neuron paths across FFN layers. The core methodology combines a Joint Attribution Score (JAS) with a layer-progressive locating algorithm to select a path that maximizes joint influence on model output. Empirical results on ViT-B-16, ViT-B-32, ViT-L-32 and MAE-B-16 show the method outperforms baselines, reveals intra-class clustering and semantic similarities in neuron usage, and enables pruning by preserving a sparse set of critical neurons with minimal loss. This work advances explainability and suggests practical pruning strategies for Vision Transformers, while noting its FFN-centric scope and inviting extensions to full Transformer blocks and other vision tasks.

Abstract

Vision Transformer models exhibit immense power yet remain opaque to human understanding, posing challenges and risks for practical applications. While prior research has attempted to demystify these models through input attribution and neuron role analysis, there has been a notable gap in considering layer-level information and the holistic path of information flow across layers. In this paper, we investigate the significance of influential neuron paths within vision Transformers, where a neuron path is a path of neurons from the model input to output that impacts the model inference most significantly. We first propose a joint influence measure to assess the contribution of a set of neurons to the model outcome. We then provide a layer-progressive neuron locating approach that efficiently selects the most influential neuron at each layer, in order to discover the crucial neuron path from input to output within the target model. Our experiments demonstrate the superiority of our method over existing baseline solutions in finding the most influential neuron path along which the information flows. Additionally, the neuron paths illustrate that vision Transformers exhibit specific inner working mechanisms for processing visual information within the same image category. We further analyze the key effects of these neurons on the image classification task, showing that the discovered neuron paths already preserve the model's capability on downstream tasks, which may also shed some light on real-world applications such as model pruning. The project website, including implementation code, is available at https://foundation-model-research.github.io/NeuronPath/.

Paper Structure

This paper contains 32 sections, 23 equations, 14 figures, 3 tables, 3 algorithms.

Figures (14)

  • Figure 1: Illustration of the main concept of our work, focusing on the feed-forward network (FFN) component within a standard ViT (Dosovitskiy et al., 2021) encoder. The left part depicts a typical ViT encoder consisting of $L$ Transformer layers in total. The right part illustrates the neuron path discovered by our method, which comprises neurons within the FFN modules across the model layers. Each FFN in the encoder is denoted as FFN$^l$, $l \in [1,L]$.
  • Figure 2: The distribution of knowledge neurons in two different pretrained vision Transformer models. Comparing ViT-B-16 (Dosovitskiy et al., 2021) and MAE-B-16 (He et al., 2022), their neuron attributions show completely opposite distributions across layers.
  • Figure 3: The relative deviation in the model's predicted probability of the ground-truth label when the values of the neurons selected by different methods are either removed (zeroed out) or enhanced (doubled).
  • Figure 4: The frequency with which each neuron at each layer occurs in the discovered neuron paths.
  • Figure 5: Examples of category similarity analysis. Using ViT-B-16 as the target model, we randomly select three categories, calculate their similarity with the other categories using the neuron utilization matrices, and sample the top 5% and bottom 5% most similar items. The visualization shows that categories with high (low) neuron path similarity also tend to have high (low) semantic similarity.
  • ...and 9 more figures
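
The intervention described in the Figure 3 caption (zeroing out or doubling selected neurons, then measuring the relative change in the ground-truth probability) can be sketched on a toy model. This is a minimal illustration only: the two-neuron ReLU network, its weights, the helper names `toy_forward` and `relative_deviation`, and the formula (p_modified - p_original) / p_original are all assumptions made for this example and are not taken from the paper or its code.

```python
import math
from typing import Dict, List, Optional, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_forward(
    x: Sequence[float],
    w_in: Sequence[Sequence[float]],
    w_out: Sequence[Sequence[float]],
    scale: Optional[Dict[int, float]] = None,
) -> List[float]:
    """Toy stand-in for one FFN plus a classifier head (hypothetical).

    hidden = relu(w_in @ x); selected hidden neurons are multiplied by
    scale[i] (0.0 removes a neuron, 2.0 doubles it); the class
    probabilities are softmax(w_out @ hidden).
    """
    scale = scale or {}
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w_in]
    hidden = [h * scale.get(i, 1.0) for i, h in enumerate(hidden)]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in w_out]
    return softmax(logits)


def relative_deviation(p_original: float, p_modified: float) -> float:
    """Assumed form of the metric: signed change relative to the baseline."""
    return (p_modified - p_original) / p_original


# Hypothetical toy data: hidden neuron 0 feeds the logit of class 0 only.
X = [1.0, 2.0]
W_IN = [[1.0, 0.5], [-1.0, 1.0]]
W_OUT = [[1.0, 0.0], [0.0, 1.0]]
LABEL = 0

p_base = toy_forward(X, W_IN, W_OUT)[LABEL]
p_removed = toy_forward(X, W_IN, W_OUT, scale={0: 0.0})[LABEL]
p_enhanced = toy_forward(X, W_IN, W_OUT, scale={0: 2.0})[LABEL]

dev_removed = relative_deviation(p_base, p_removed)    # negative in this toy setup
dev_enhanced = relative_deviation(p_base, p_enhanced)  # positive in this toy setup
print(dev_removed, dev_enhanced)
```

In this toy setup, removing the neuron that supports the ground-truth class lowers its probability and doubling it raises it, which is the kind of signed effect the caption describes. In a real ViT the same rescaling would be applied to the chosen FFN activations at each layer.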

Theorems & Definitions (2)

  • Definition 1: Joint Attribution Score
  • Definition 2: Neuron Path
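
The abstract describes a layer-progressive search that picks the most influential neuron at each layer, scored by a joint influence measure over the neurons chosen so far. The sketch below shows only that greedy control flow. The paper's Joint Attribution Score is not reproduced in this section, so `toy_joint_score` and `TOY_WEIGHTS` are hypothetical placeholders, and `locate_path_layer_progressive` is an illustrative name rather than the authors' implementation.

```python
from typing import Callable, List, Tuple

Path = Tuple[int, ...]  # one neuron index per layer, in layer order


def locate_path_layer_progressive(
    num_layers: int,
    neurons_per_layer: int,
    joint_score: Callable[[Path], float],
) -> Path:
    """Greedily extend a neuron path one layer at a time.

    At each layer, every candidate neuron is appended to the path chosen
    so far, the extended path is scored jointly, and the best candidate
    is kept before moving on to the next layer.
    """
    path: Path = ()
    for _ in range(num_layers):
        best = max(
            range(neurons_per_layer),
            key=lambda j: joint_score(path + (j,)),
        )
        path = path + (best,)
    return path


# Hypothetical per-layer, per-neuron weights used only by the toy score.
TOY_WEIGHTS: List[List[float]] = [
    [0.1, 0.9, 0.3],
    [0.5, 0.2, 0.4],
    [0.7, 0.1, 0.8],
]

score_calls = 0


def toy_joint_score(path: Path) -> float:
    """Placeholder joint score: product of toy weights along the path."""
    global score_calls
    score_calls += 1
    score = 1.0
    for layer, neuron in enumerate(path):
        score *= TOY_WEIGHTS[layer][neuron]
    return score


found = locate_path_layer_progressive(3, 3, toy_joint_score)
print(found, score_calls)
```

The point of the layer-by-layer design is cost: the loop evaluates `num_layers * neurons_per_layer` candidate paths, whereas an exhaustive search over all paths would need `neurons_per_layer ** num_layers` evaluations. The trade-off is that a greedy search is not guaranteed to find the globally best path for an arbitrary score.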