Efficient Learning of Convolution Weights as Gaussian Mixture Model Posteriors

Lifan Liang

TL;DR

This paper reframes a convolution layer as computing the unnormalized log posterior of a Gaussian mixture over image patches, enabling an EM-based, gradient-free training procedure that converges without labeled data. By extending from a single patch to a mixture of patches and employing batch EM with image-wide pooling, the approach yields diverse, informative features while remaining computationally efficient through convolution- and group-based operations. The authors demonstrate the method on MNIST and STL-10, achieving competitive unsupervised features and clear gains when unlabeled data is leveraged, particularly on more complex datasets. The work provides a practical, theoretically grounded alternative to backpropagation for learning convolution weights in an unsupervised setting, with potential for stacking to build deeper representations in the future.

Abstract

In this paper, we show that the feature map of a convolution layer is equivalent to the unnormalized log posterior of a special kind of Gaussian mixture for image modeling. We then extend the model to produce diverse features and propose a corresponding EM algorithm to learn it. Learning convolution weights with this approach is efficient, guaranteed to converge, and requires no supervised information. Code is available at: https://github.com/LifanLiang/CALM.
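The equivalence stated in the abstract can be checked numerically in a simplified setting. The sketch below is not taken from the CALM repository. It assumes a single-patch model in which one patch is drawn from a Gaussian with mean `mu` and identity covariance, every other pixel is standard normal, and the prior over patch locations is uniform. The function names are hypothetical. Under these assumptions, a plain cross-correlation with filter `mu` plus the bias `-0.5 * ||mu||^2` differs from the exact log-likelihood only by a term that does not depend on the patch location.

```python
import numpy as np

def cross_correlate_valid(image, kernel):
    """Plain 'valid' cross-correlation, the operation a convolution layer computes."""
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    return np.einsum("ijab,ab->ij", windows, kernel)

def log_likelihood_by_definition(image, mu):
    """log p(image | patch at (i, j)) for every location, computed directly.

    Assumed model: the patch is drawn from N(mu, I), all other pixels from N(0, 1).
    """
    h, w = mu.shape
    rows, cols = image.shape[0] - h + 1, image.shape[1] - w + 1
    out = np.empty((rows, cols))
    for i in range(rows):
        for j in range(cols):
            mean = np.zeros_like(image)
            mean[i:i + h, j:j + w] = mu
            diff = image - mean
            out[i, j] = -0.5 * np.sum(diff ** 2) - 0.5 * image.size * np.log(2 * np.pi)
    return out

def softmax(x):
    """Numerically stable softmax over all entries of x."""
    z = np.exp(x - x.max())
    return z / z.sum()

rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8))
mu = rng.normal(size=(3, 3))

# Convolution output plus a bias: the unnormalized log posterior over locations.
feature_map = cross_correlate_valid(image, mu) - 0.5 * np.sum(mu ** 2)
log_lik = log_likelihood_by_definition(image, mu)

# The difference is the same at every location, so normalizing either
# quantity yields the same posterior over patch locations.
offset = log_lik - feature_map
posterior_from_conv = softmax(feature_map)
posterior_exact = softmax(log_lik)
```

Here `offset` equals the log density of the whole image under the all-background model, which is why it drops out once the posterior is normalized.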

Paper Structure (14 sections, 10 equations, 6 figures, 1 table, 1 algorithm)

Introduction
Related Work
Model Formulation
The generative model with a single patch
The generative model with multiple patches
Learning Algorithm
Expectation maximization (EM)
Convergence of the algorithm
Batch EM
Image-wide pooling
Result
Application to MNIST
Application to STL-10
Conclusion

Figures (6)

Figure 1: Graphical illustration of the single-patch model and the multiple-patch model. Pixels in the gray area are sampled from a standard normal distribution. Patches in different colors are sampled from different multivariate normal (MVN) distributions.
Figure 2: Simplified illustration of the EM algorithm developed in this paper. E-step is a conventional convolution; M-step uses the feature map as convolution filter.
Figure 3: Visualization of the 64 features extracted from the MNIST images.
Figure 4: Unnormalized marginal likelihood over training epochs on the MNIST dataset. The EM algorithm converged after the second epoch.
Figure 5: Visualization of the first 64 features extracted from the STL-10 images (labeled training data and unlabeled data).
...and 1 more figures
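
The EM loop summarized in the caption of Figure 2 can be sketched as follows. This is a minimal illustration, not the authors' batch EM with image-wide pooling. It assumes a single patch per image, identity covariance, standard-normal background pixels, and a uniform prior over components and locations. The names `e_step` and `m_step` are hypothetical. The E-step is a cross-correlation followed by a softmax over components and locations; the M-step is a responsibility-weighted average of patches, whose numerator is the cross-correlation of each image with its responsibility map used as the filter.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def e_step(images, mus):
    """Responsibilities over (component, location) and the summed log normalizer.

    images: (N, H, W); mus: (K, h, w). Assumes identity covariance and a uniform prior.
    """
    K, h, w = mus.shape
    windows = sliding_window_view(images, (h, w), axis=(1, 2))  # (N, H', W', h, w)
    # Cross-correlation with every filter, plus the bias -0.5 * ||mu_k||^2.
    scores = np.einsum("nijab,kab->nkij", windows, mus)
    scores -= 0.5 * np.sum(mus ** 2, axis=(1, 2))[None, :, None, None]
    # Stable log-sum-exp per image over all components and locations.
    flat = scores.reshape(len(images), -1)
    peak = flat.max(axis=1, keepdims=True)
    log_norm = peak[:, 0] + np.log(np.exp(flat - peak).sum(axis=1))
    resp = np.exp(scores - log_norm[:, None, None, None])
    return resp, log_norm.sum()

def m_step(images, resp, patch_shape):
    """New filters: responsibility-weighted mean of all patches."""
    windows = sliding_window_view(images, patch_shape, axis=(1, 2))
    weighted_sum = np.einsum("nkij,nijab->kab", resp, windows)
    total_weight = resp.sum(axis=(0, 2, 3))
    return weighted_sum / total_weight[:, None, None]

rng = np.random.default_rng(0)
images = rng.normal(size=(20, 8, 8))   # toy data, not MNIST or STL-10
mus = 0.1 * rng.normal(size=(4, 3, 3))

history = []  # unnormalized marginal log-likelihood before each update
for _ in range(10):
    resp, objective = e_step(images, mus)
    history.append(objective)
    mus = m_step(images, resp, mus.shape[1:])
```

Because each M-step maximizes the expected complete-data log-likelihood exactly under these assumptions, the values in `history` should be non-decreasing, which is the standard EM convergence argument.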
