The Singular Values of Convolutional Layers
Hanie Sedghi, Vineet Gupta, Philip M. Long
TL;DR
This work gives an exact, efficient characterization of the singular values of 2D multi-channel convolutional layers, using the fact that convolution with wraparound is a block-circulant operator that the Fourier transform block-diagonalizes. For $n \times n$ inputs with $m$ input and $m$ output channels, the full spectrum can be computed in $O(n^2 m^2 (m + \log n))$ time: a 2D FFT of the kernel followed by an SVD of the $m \times m$ matrix at each of the $n^2$ frequencies. The same decomposition yields a projection onto an operator-norm ball, which can be used as a regularizer. On CIFAR-10, this regularizer lowers the test error of a deep residual network with batch normalization from 6.2\% to 5.3\%, so it complements batch normalization rather than replacing it. The approach outperforms prior heuristic reshaping methods in both accuracy and computational efficiency, making spectrum-aware regularization practical for deep networks.
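The FFT-then-SVD recipe described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a definitive implementation: the function name `conv_singular_values` and the kernel layout `(k, k, m_in, m_out)` are choices made here, and the convolution is assumed to wrap around at the borders (circular padding).

```python
import numpy as np


def conv_singular_values(kernel, input_shape):
    """Singular values of a circular 2D multi-channel convolution.

    kernel:      array of shape (k, k, m_in, m_out) (layout assumed here)
    input_shape: spatial size (n, n) of the layer input
    Returns an array of shape (n, n, min(m_in, m_out)).
    """
    # 2D FFT over the two spatial axes; `s` zero-pads the kernel to n x n.
    transforms = np.fft.fft2(kernel, s=input_shape, axes=(0, 1))
    # Batched SVD: one m_in x m_out matrix per spatial frequency.
    return np.linalg.svd(transforms, compute_uv=False)


# Hypothetical example: a random 3x3 kernel with 16 channels on 32x32 inputs.
rng = np.random.default_rng(0)
example_kernel = rng.standard_normal((3, 3, 16, 16))
example_sv = conv_singular_values(example_kernel, (32, 32))
print("operator norm:", example_sv.max())
```

The largest entry of the returned array is the operator norm of the layer; the full array is the complete spectrum, grouped by spatial frequency.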
Abstract
We characterize the singular values of the linear transformation associated with a standard 2D multi-channel convolutional layer, enabling their efficient computation. This characterization also leads to an algorithm for projecting a convolutional layer onto an operator-norm ball. We show that this is an effective regularizer; for example, it improves the test error of a deep residual network using batch normalization on CIFAR-10 from 6.2\% to 5.3\%.
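The projection onto an operator-norm ball mentioned in the abstract can be sketched by clipping the per-frequency singular values and transforming back. The sketch below is illustrative only: the name `clip_operator_norm`, the kernel layout `(k, k, m_in, m_out)`, and the circular-padding assumption are choices made here. Truncating back to the original $k \times k$ support is a second, separate projection, so the result is not guaranteed to satisfy the bound exactly when $k < n$.

```python
import numpy as np


def clip_operator_norm(kernel, input_shape, clip_to):
    """Clip the singular values of a circular 2D convolution at `clip_to`.

    kernel:      array of shape (k, k, m_in, m_out) (layout assumed here)
    input_shape: spatial size (n, n) of the layer input
    Returns a real kernel with the same shape as `kernel`.
    """
    # Per-frequency matrices of the zero-padded kernel.
    transforms = np.fft.fft2(kernel, s=input_shape, axes=(0, 1))
    # Reduced SVD at every frequency: U (n,n,m_in,r), D (n,n,r), Vh (n,n,r,m_out).
    U, D, Vh = np.linalg.svd(transforms, full_matrices=False)
    # Cap each singular value at the target norm.
    D_clipped = np.minimum(D, clip_to)
    # Rebuild each matrix as U @ diag(D_clipped) @ Vh.
    clipped = (U * D_clipped[..., None, :]) @ Vh
    # Back to the spatial domain; the imaginary part is numerical noise
    # when the input kernel is real.
    full_kernel = np.fft.ifft2(clipped, axes=(0, 1)).real
    # Keep only the original k x k support (a second projection).
    k_h, k_w = kernel.shape[:2]
    return full_kernel[:k_h, :k_w]
```

In training, a function like this would typically be applied to each convolutional kernel every so many optimizer steps; how often, and what value of `clip_to` to use, are tuning decisions not fixed by the sketch.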
