Learning with Restricted Boltzmann Machines: Asymptotics of AMP and GD in High Dimensions
Yizhou Xu, Florent Krzakala, Lenka Zdeborová
TL;DR
This paper analyzes Restricted Boltzmann Machines (RBMs) in the high-dimensional regime where the number of samples $n$ and the input dimension $d$ both tend to infinity with $n/d=\alpha=\Theta(1)$, while the number of hidden units $k$ remains fixed. It derives an exact reduction of the RBM likelihood to an effective unsupervised multi-index objective with non-separable regularization, which enables rigorous analyses of training via AMP state evolution (SE) and dynamical mean-field theory (DMFT). By mapping data from the spiked covariance model to a teacher RBM, the authors prove that RBMs achieve the BBP weak recovery threshold and provide sharp high-dimensional asymptotics for both AMP and gradient-descent training. The results connect unsupervised RBM learning to high-dimensional inference techniques and give precise predictions for optimization and training dynamics, which may guide extensions to more complex generative architectures.
Abstract
The Restricted Boltzmann Machine (RBM) is one of the simplest generative neural networks capable of learning input distributions. Despite its simplicity, its performance in learning from training data is well understood only in cases that essentially reduce to a singular value decomposition of the data. Here, we consider the limit of a large input dimension and a constant number of hidden units. In this limit, we simplify the standard RBM training objective into a form that is equivalent to a multi-index model with non-separable regularization. This opens a path to analyzing RBM training with methods established for multi-index models, such as Approximate Message Passing (AMP) with its state evolution, and the analysis of Gradient Descent (GD) via dynamical mean-field theory. We then give rigorous asymptotics of the training dynamics of the RBM on data generated by the spiked covariance model, a prototype of structure suitable for unsupervised learning. We show in particular that the RBM reaches the optimal computational weak recovery threshold, which coincides with the BBP transition, in the spiked covariance model.
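To make the BBP transition mentioned above concrete, the following is a minimal numerical sketch, not the paper's method: it uses plain PCA rather than an RBM, AMP, or GD. It assumes a rank-one spiked covariance model $x_i = \sqrt{\beta}\,u_i w^\star + z_i$ with $\|w^\star\|=1$, $u_i\sim\mathcal{N}(0,1)$ and $z_i\sim\mathcal{N}(0,I_d)$; this parametrization, the signal strength `beta`, and the function names are illustrative assumptions and may differ from the paper's conventions. Under this parametrization, the classical result for the top eigenvector of the sample covariance is that its squared overlap with $w^\star$ is asymptotically zero for $\beta \le 1/\sqrt{\alpha}$ and equals $(1-1/(\alpha\beta^2))/(1+1/(\alpha\beta))$ above that threshold.

```python
import numpy as np


def spiked_covariance_overlap(n, d, beta, seed=0):
    """Squared overlap between the top principal direction and the planted spike.

    Assumed model (illustrative, not taken from the paper):
        x_i = sqrt(beta) * u_i * w_star + z_i,  u_i ~ N(0, 1),  z_i ~ N(0, I_d).
    """
    rng = np.random.default_rng(seed)
    w_star = rng.standard_normal(d)
    w_star /= np.linalg.norm(w_star)  # unit-norm planted direction
    u = rng.standard_normal(n)        # one latent factor per sample
    X = np.sqrt(beta) * np.outer(u, w_star) + rng.standard_normal((n, d))
    # The top right singular vector of X is the top eigenvector of X^T X / n.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return float(np.dot(vt[0], w_star) ** 2)


def bbp_overlap_prediction(alpha, beta):
    """Asymptotic squared overlap for PCA in the model above, with alpha = n / d."""
    gamma = 1.0 / alpha               # aspect ratio d / n
    if beta <= np.sqrt(gamma):        # below the BBP threshold: no weak recovery
        return 0.0
    return (1.0 - gamma / beta**2) / (1.0 + gamma / beta)


if __name__ == "__main__":
    n, d = 2000, 1000                 # alpha = 2, threshold beta_c = 1/sqrt(2)
    for beta in (0.3, 3.0):
        empirical = spiked_covariance_overlap(n, d, beta)
        predicted = bbp_overlap_prediction(n / d, beta)
        print(f"beta={beta}: empirical={empirical:.3f}, predicted={predicted:.3f}")
```

At finite size the empirical overlap fluctuates around the prediction, so the comparison is only approximate; increasing `n` and `d` at fixed ratio should tighten the agreement.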
