A Gaussian Process perspective on Convolutional Neural Networks
Anastasia Borovykh
TL;DR
The paper reframes convolutional neural networks within a Gaussian-process framework to uncover when CNN outputs behave like GP priors, despite the non-iid, compositionally dependent sums characteristic of convolutional layers. It leverages a Bentkus Lyapunov-type bound to justify GP-like behavior in the first layer and derives a recursive, convolutional kernel that evolves with depth, linking CNNs to additive/convolutional GP kernels. Numerical experiments using MMD show that, for moderate filter sizes and common activations, CNN priors quickly resemble GP priors, and CNN posteriors under input conditioning align with GP posteriors for time-series data. This work provides a principled Bayesian lens for CNNs, enabling analytic uncertainty via GP machinery and clarifying how the convolutional structure governs GP convergence and kernel formation.
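The MMD comparison described in the summary can be illustrated with a minimal sketch. This is not the paper's experimental setup: the toy one-filter network (`toy_cnn_output`), the RBF kernel and its bandwidth, the scalar output, and the sample sizes are all assumptions made for illustration. The sketch draws outputs of a randomly initialised 1D convolutional network at a fixed input, draws Gaussian samples with matched mean and variance, and computes the biased estimate of the squared maximum mean discrepancy between the two sets.

```python
import math
import random
import statistics


def rbf(x, y, bandwidth):
    """RBF kernel between two equal-length vectors (lists of floats)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_biased(X, Y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples X and Y."""
    def mean_k(A, B):
        total = sum(rbf(a, b, bandwidth) for a in A for b in B)
        return total / (len(A) * len(B))
    return mean_k(X, X) + mean_k(Y, Y) - 2.0 * mean_k(X, Y)


def toy_cnn_output(x, filter_size, rng):
    """One prior draw of a toy random 1D conv net (hypothetical architecture).

    A single Gaussian filter slides over x, tanh is applied, and the hidden
    positions are combined by Gaussian read-out weights into one scalar.
    """
    w = [rng.gauss(0.0, 1.0 / math.sqrt(filter_size)) for _ in range(filter_size)]
    n_pos = len(x) - filter_size + 1
    hidden = [
        math.tanh(sum(w[j] * x[t + j] for j in range(filter_size)))
        for t in range(n_pos)
    ]
    v = [rng.gauss(0.0, 1.0 / math.sqrt(n_pos)) for _ in range(n_pos)]
    return sum(vi * hi for vi, hi in zip(v, hidden))


if __name__ == "__main__":
    rng = random.Random(0)
    x = [math.sin(0.3 * t) for t in range(32)]  # fixed input signal (assumed)
    n_samples = 200

    # Samples from the toy network prior, one fresh set of weights per draw.
    cnn = [[toy_cnn_output(x, filter_size=5, rng=rng)] for _ in range(n_samples)]

    # Gaussian reference samples with matched mean and standard deviation.
    flat = [s[0] for s in cnn]
    mu, sigma = statistics.fmean(flat), statistics.pstdev(flat)
    gauss = [[rng.gauss(mu, sigma)] for _ in range(n_samples)]

    print("squared MMD (biased):", mmd2_biased(cnn, gauss))
```

A value near zero indicates that, under this kernel and at this sample size, the two sets are hard to tell apart. Repeating the computation for different values of `filter_size` gives a rough sense of how the comparison changes with the filter width.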
Abstract
In this paper we cast the well-known convolutional neural network in a Gaussian process perspective. In this way we hope to gain additional insight into the performance of convolutional networks, and in particular to understand under what circumstances they tend to perform well and what assumptions are implicitly made in the network. While the convergence of fully-connected networks to Gaussian processes has been studied extensively, little is known about the situations in which the output of a convolutional network approaches a multivariate normal distribution.
