Brain Decodes Deep Nets
Huzheng Yang, James Gee, Jianbo Shi
TL;DR
FactorTopy introduces a topology-constrained, factorized brain-to-network mapping that visualizes how deep networks organize computation relative to the brain by predicting fMRI responses from image features. The method maps 4D network features across space, layer, scale, and channel to brain voxels, revealing how training objectives and data shape hierarchical alignment. Key findings show CLIP achieves the strongest brain-hierarchy alignment, which improves with scale, while many other models lose alignment as capacity grows; fine-tuning on small datasets tends to rewire networks, with CLIP showing greater robustness. The work provides a brain-informed lens for diagnosing model behavior and offers a visualization toolkit for interpreting deep networks through their brain-inspired organization.
Abstract
We developed a tool for visualizing and analyzing large pre-trained vision models by mapping them onto the brain, thus exposing their hidden inner workings. Our innovation arises from a surprising use of brain encoding: predicting brain fMRI measurements in response to images. We report two findings. First, an explicit mapping between the brain and deep-network features across the dimensions of space, layers, scales, and channels is crucial. This mapping method, FactorTopy, is plug-and-play for any deep network; with it, one can paint a picture of the network onto the brain (literally!). Second, our visualization shows how different training methods matter: they lead to remarkable differences in hierarchical organization and in scaling behavior as data or network capacity grows. It also provides insight into fine-tuning: how pre-trained models change when adapting to small datasets. We found that networks with a brain-like hierarchical organization suffer less from catastrophic forgetting after fine-tuning.
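To make the idea of a factorized mapping across space, layers, scales, and channels concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function name `factorized_readout`, the tensor layout, and the use of softmax selection weights are all hypothetical, and the topology constraint mentioned in the TL;DR is not modeled. Each voxel holds separate selection weights over layers, scales, and spatial positions, plus a linear readout over channels.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def factorized_readout(features, layer_logits, scale_logits,
                       space_logits, channel_weights):
    """Predict one response per voxel from 4D network features.

    Assumed (hypothetical) shapes:
      features        (L, S, P, C)  layers x scales x positions x channels,
                                    already brought to a common P and C
      layer_logits    (V, L)        per-voxel preference over layers
      scale_logits    (V, S)        per-voxel preference over scales
      space_logits    (V, P)        per-voxel preference over positions
      channel_weights (V, C)        per-voxel linear readout over channels
    Returns an array of shape (V,).
    """
    w_layer = softmax(layer_logits, axis=1)
    w_scale = softmax(scale_logits, axis=1)
    w_space = softmax(space_logits, axis=1)
    # Weighted average over layer, scale, and position for every voxel,
    # leaving one C-dimensional feature vector per voxel.
    pooled = np.einsum("vl,vs,vp,lspc->vc",
                       w_layer, w_scale, w_space, features)
    # Per-voxel dot product with the channel readout weights.
    return np.einsum("vc,vc->v", pooled, channel_weights)


# Toy example with random data (sizes are arbitrary).
rng = np.random.default_rng(0)
L, S, P, C, V = 4, 2, 9, 8, 5
features = rng.normal(size=(L, S, P, C))
prediction = factorized_readout(
    features,
    rng.normal(size=(V, L)),
    rng.normal(size=(V, S)),
    rng.normal(size=(V, P)),
    rng.normal(size=(V, C)),
)
print(prediction.shape)
```

Because the selection weights are separate per dimension, each can be inspected on its own in this sketch: for example, taking the argmax of the softmaxed `layer_logits` for each voxel would give a per-voxel "preferred layer" that could be displayed on a brain surface.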
