Efficient Learning of Convolution Weights as Gaussian Mixture Model Posteriors
Lifan Liang
TL;DR
This paper reframes a convolution layer as computing the unnormalized log posterior of a Gaussian mixture over image patches, which enables an EM-based, gradient-free training procedure that needs no labeled data and is guaranteed to converge. By extending the model from a single patch to a mixture of patches and using batch EM with image-wide pooling, the approach yields diverse, informative features while remaining computationally efficient through convolution- and group-based operations. The authors evaluate the method on MNIST and STL-10, reporting competitive unsupervised features and clear gains when unlabeled data is leveraged, particularly on the more complex dataset. The work offers a practical, theoretically grounded alternative to backpropagation for learning convolution weights in an unsupervised setting, and the learned layers could in future be stacked to build deeper representations.
Abstract
In this paper, we show that the feature map of a convolution layer is equivalent to the unnormalized log posterior of a special kind of Gaussian mixture for image modeling. We then extend the model so that it produces diverse features and propose a corresponding EM algorithm to learn it. Learning convolution weights with this approach is efficient, is guaranteed to converge, and requires no supervised information. Code is available at: https://github.com/LifanLiang/CALM.
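The equivalence stated above can be illustrated with a small numerical check. The sketch below is not the paper's exact model. It assumes the simplest case: each component is a Gaussian over a patch with identity covariance, the component means double as the filters, and the bias absorbs the log prior and the squared norm of the mean. All variable names are hypothetical. Under these assumptions, the biased cross-correlation output differs from the exact log joint only by terms that do not depend on the component, so both give the same posterior after a softmax.

```python
import numpy as np

rng = np.random.default_rng(0)

K, P = 4, 3                                # assumed: 4 components (filters), 3x3 patches
image = rng.normal(size=(8, 8))            # toy single-channel image
means = rng.normal(size=(K, P, P))         # component means, reused as filters
priors = np.full(K, 1.0 / K)               # uniform mixing weights

# All PxP patches of the image: shape (H-P+1, W-P+1, P, P)
patches = np.lib.stride_tricks.sliding_window_view(image, (P, P))

# Convolution-style feature map (cross-correlation plus a per-filter bias)
bias = np.log(priors) - 0.5 * np.sum(means**2, axis=(1, 2))
feature_map = np.einsum("hwij,kij->hwk", patches, means) + bias

# Exact log joint log p(patch, k) under identity-covariance Gaussians
diff = patches[:, :, None] - means         # broadcasts to (h, w, K, P, P)
log_joint = (np.log(priors)
             - 0.5 * np.sum(diff**2, axis=(-2, -1))
             - 0.5 * P * P * np.log(2 * np.pi))

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

posterior_from_conv = softmax(feature_map)   # normalize the feature map
posterior_exact = softmax(log_joint)         # true component posterior
max_abs_diff = np.max(np.abs(posterior_from_conv - posterior_exact))
print("max difference between posteriors:", max_abs_diff)
```

The two posteriors agree up to floating-point error because the terms dropped from the feature map (the squared norm of the patch and the Gaussian normalizing constant) are the same for every component at a given location.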
