Exploring the Energy Landscape of RBMs: Reciprocal Space Insights into Bosons, Hierarchical Learning and Symmetry Breaking
J. Quetzalcóatl Toledo-Marin, Anindita Maiti, Geoffrey C. Fox, Roger G. Melko
TL;DR
The paper introduces a reciprocal-space formulation of Restricted Boltzmann Machines (RBMs) that exposes connections between RBMs, diffusion processes, and systems of coupled bosons. It shows that at initialization an RBM sits at a saddle point of its energy landscape, where the local curvature is set by the singular values of the weight matrix; their distribution follows the Marčenko–Pastur law and is rotationally symmetric. Training induces hierarchical learning, which breaks this symmetry in a Landau-like fashion. In the infinite-size limit the reciprocal variables become Gaussian, and for some modes the diffusion process does not converge to the Boltzmann distribution. Experiments on MNIST with replicas of RBMs of different hidden-layer sizes illustrate how the singular-value structure and the symmetry breaking relate to feature hierarchies and learning dynamics. The result is a unifying perspective across generative-model frameworks that suggests new avenues for quantum-inspired and diffusion-based learning.
Abstract
Deep generative models have become ubiquitous due to their ability to learn and sample from complex distributions. Despite the proliferation of frameworks, the relationships among these models remain largely unexplored, a gap that hinders the development of a unified theory of AI learning. We address two central challenges: clarifying the connections between different deep generative models and deepening our understanding of their learning mechanisms. We focus on Restricted Boltzmann Machines (RBMs), known for their universal approximation capabilities for discrete distributions. By introducing a reciprocal-space formulation, we reveal a connection between RBMs, diffusion processes, and coupled bosons. We show that at initialization the RBM operates at a saddle point, where the local curvature is determined by the singular values, whose distribution follows the Marčenko–Pastur law and exhibits rotational symmetry. During training, this rotational symmetry is broken by hierarchical learning, in which different degrees of freedom progressively capture features at multiple levels of abstraction. The result is a symmetry breaking in the energy landscape, reminiscent of Landau theory, that is characterized by the singular values and the eigenvector matrix of the weight matrix. We derive the corresponding free energy in a mean-field approximation. We show that in the infinite-size limit of the RBM the reciprocal variables are Gaussian distributed, and our findings indicate that in this regime there are modes for which the diffusion process does not converge to the Boltzmann distribution. To illustrate our results, we trained replicas of RBMs with different hidden-layer sizes on the MNIST dataset. Our findings bridge the gap between disparate generative frameworks and shed light on the processes underpinning learning in generative models.
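The claim about the singular values at initialization can be checked numerically with a short sketch. The code below is illustrative only and is not taken from the paper: it assumes the weights are drawn i.i.d. from a zero-mean Gaussian, and the hidden-layer size (200) and standard deviation (0.01) are hypothetical choices. Only the visible-layer size of 784 follows from MNIST's 28 x 28 images. It compares the squared singular values of a random weight matrix with the support edges predicted by the Marčenko–Pastur law.

```python
import numpy as np

def mp_support(n_visible, n_hidden, sigma):
    """Marchenko-Pastur support edges for the eigenvalues of W^T W / n_visible,
    where W is (n_visible x n_hidden) with i.i.d. entries of variance sigma**2.
    Assumes n_hidden <= n_visible."""
    q = n_hidden / n_visible
    lower = sigma**2 * (1.0 - np.sqrt(q)) ** 2
    upper = sigma**2 * (1.0 + np.sqrt(q)) ** 2
    return lower, upper

# Hypothetical initialization: 784 visible units (MNIST pixels),
# 200 hidden units and sigma = 0.01 are assumptions for illustration.
n_visible, n_hidden, sigma = 784, 200, 0.01
rng = np.random.default_rng(0)
W = rng.normal(0.0, sigma, size=(n_visible, n_hidden))

# Singular values of W; their squares divided by n_visible are the
# eigenvalues of the sample covariance-like matrix W^T W / n_visible.
singular_values = np.linalg.svd(W, compute_uv=False)
eigs = singular_values**2 / n_visible

lower, upper = mp_support(n_visible, n_hidden, sigma)
print(f"empirical range: [{eigs.min():.3e}, {eigs.max():.3e}]")
print(f"MP support:      [{lower:.3e}, {upper:.3e}]")
```

At finite size the empirical extremes only approximate the predicted edges; the agreement tightens as both layer sizes grow at a fixed ratio.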
