On the universality of neural encodings in CNNs
Florentin Guth, Brice Ménard
TL;DR
This work asks whether CNNs trained on different image datasets converge to a universal neural encoding, shifting the focus from representations to learned weights. It introduces a space–channel factorization and a covariance-based alignment to compare weight encodings across networks. The comparison reveals a canonical set of universal spatial eigenvectors and, for natural images, a broadly shared channel-eigenvector structure across layers. The framework combines Procrustes alignment, eigenvalue shrinkage, and a Bures–Wasserstein-based similarity to quantify encodings, and it demonstrates universality over diverse datasets and tasks; training on true labels versus random labels yields two distinct encoding regimes. The findings give a principled basis for understanding transfer learning and the universality associated with foundation models. They suggest that part of the success of deep learning stems from universal encodings that can be preset or shared across architectures, so that these components need not be learned anew.
Abstract
We explore the universality of neural encodings in convolutional neural networks trained on image classification tasks. We develop a procedure to directly compare the learned weights rather than their representations. It is based on a factorization of spatial and channel dimensions and measures the similarity of aligned weight covariances. We show that, for a range of layers of VGG-type networks, the learned eigenvectors appear to be universal across different natural image datasets. Our results suggest the existence of a universal neural encoding for natural images. They explain, at a more fundamental level, the success of transfer learning. Our work shows that, instead of aiming at maximizing the performance of neural networks, one can alternatively attempt to maximize the universality of the learned encoding, in order to build a principled foundation model.
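To make the idea of comparing weight covariances concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes convolutional weights stored as an array of shape (c_out, c_in, k, k), treats each input-output channel pair as one sample of a spatial filter, and compares the resulting spatial covariances with a trace-normalized Bures-Wasserstein similarity. The function names, the uncentered covariance, and the normalization are all assumptions made for illustration. The channel side, which requires alignment before comparison, and the eigenvalue shrinkage step are omitted.

```python
import numpy as np


def spatial_covariance(weights):
    """Uncentered covariance of conv filters over spatial positions.

    weights: array of shape (c_out, c_in, k, k). Each (c_out, c_in) pair is
    treated as one sample of a k*k-dimensional spatial filter (an assumption
    of this sketch).
    """
    c_out, c_in, kh, kw = weights.shape
    flat = weights.reshape(c_out * c_in, kh * kw)
    return flat.T @ flat / flat.shape[0]


def psd_sqrt(mat):
    """Symmetric square root of a positive semi-definite matrix."""
    sym = 0.5 * (mat + mat.T)            # remove rounding asymmetry
    vals, vecs = np.linalg.eigh(sym)
    vals = np.clip(vals, 0.0, None)      # clip tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T


def bures_wasserstein_sq(cov_a, cov_b):
    """Squared Bures-Wasserstein distance between two covariances."""
    root_a = psd_sqrt(cov_a)
    cross = psd_sqrt(root_a @ cov_b @ root_a)
    dist_sq = np.trace(cov_a) + np.trace(cov_b) - 2.0 * np.trace(cross)
    return max(float(dist_sq), 0.0)


def bw_similarity(cov_a, cov_b):
    """Trace-normalized similarity in [0, 1]; 1 means identical covariances.

    This normalization is a choice made for this sketch.
    """
    denom = float(np.trace(cov_a) + np.trace(cov_b))
    return 1.0 - bures_wasserstein_sq(cov_a, cov_b) / denom


# Synthetic stand-ins for two trained 3x3 conv layers (hypothetical data).
rng = np.random.default_rng(0)
shape = (64, 32, 3, 3)
weights_a = rng.standard_normal(shape)
# Second "network": same noise model, but with position-dependent scaling.
weights_b = rng.standard_normal(shape) * np.linspace(0.1, 2.0, 9).reshape(3, 3)

cov_a = spatial_covariance(weights_a)
cov_b = spatial_covariance(weights_b)

sim_aa = bw_similarity(cov_a, cov_a)
sim_ab = bw_similarity(cov_a, cov_b)
print(f"self-similarity:  {sim_aa:.4f}")
print(f"cross-similarity: {sim_ab:.4f}")

# Eigenvectors of the spatial covariance, sorted by decreasing eigenvalue.
eigvals, eigvecs = np.linalg.eigh(cov_a)
order = np.argsort(eigvals)[::-1]
spatial_eigvecs = eigvecs[:, order]
```

Spatial covariances can be compared directly here because all 3x3 filters share the same pixel basis. Channel covariances from independently trained networks live in different bases, which is why an alignment step is needed before any such similarity is meaningful.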
