On the role of non-linear latent features in bipartite generative neural networks
Tony Bonnaire, Giovanni Catania, Aurélien Decelle, Beatriz Seoane
TL;DR
This work analyzes how the prior on the hidden units of Restricted Boltzmann Machines (RBMs) shapes their phase diagrams and memory retrieval capabilities. Using replica theory and finite-size Monte Carlo simulations, it shows that binary hidden units severely limit recall in the high-load regime, while enriching the hidden prior with ternary or ReLU-like activations and adding local biases restores retrieval and can suppress spin-glass phases at finite temperature. The results show that hidden-unit design, and not only the visible units or the learning dynamics, critically governs the expressive power and memory capabilities of bipartite generative networks, with implications for generation quality and data-driven modeling. The paper also clarifies how architectural choices modulate higher-order interactions in RBMs and how these models relate to classical associative memories such as the Hopfield network.
Abstract
We investigate the phase diagram and memory retrieval capabilities of bipartite energy-based neural networks, namely Restricted Boltzmann Machines (RBMs), as a function of the prior distribution imposed on their hidden units, including binary, multi-state, and ReLU-like activations. Drawing connections to the Hopfield model and employing analytical tools from the statistical physics of disordered systems, we explore how architectural choices and activation functions shape the thermodynamic properties of these models. Our analysis reveals that standard RBMs with binary hidden nodes and extensive connectivity suffer from reduced critical capacity, which limits their effectiveness as associative memories. To address this, we examine several modifications, such as introducing local biases and adopting richer hidden-unit priors. These adjustments restore ordered retrieval phases and markedly improve recall performance, even at finite temperatures. Our theoretical findings, supported by finite-size Monte Carlo simulations, highlight the importance of hidden-unit design in enhancing the expressive power of RBMs.
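To make the role of the hidden prior concrete, the following minimal sketch sums out the hidden units of a small RBM and returns the resulting energy over the visible units alone. It is an illustration under stated assumptions, not the paper's implementation: the function names `hidden_log_partition` and `effective_energy`, the coupling scale sqrt(beta / N), and the uniform weights on the binary and ternary states are all choices made here. Under these conventions a standard Gaussian hidden prior reproduces the Hopfield energy exactly, while binary hidden units replace the quadratic term with a log cosh.

```python
import numpy as np

def hidden_log_partition(x, prior="binary"):
    """Log of E_{h ~ prior}[exp(h * x)] for a single hidden unit with input x."""
    x = np.asarray(x, dtype=float)
    if prior == "binary":
        # h in {-1, +1} with equal weights (assumed): log cosh(x)
        return np.logaddexp(x, -x) - np.log(2.0)
    if prior == "ternary":
        # h in {-1, 0, +1} with equal weights (assumed): log((1 + 2 cosh(x)) / 3)
        return np.logaddexp(0.0, np.logaddexp(x, -x)) - np.log(3.0)
    if prior == "gaussian":
        # h ~ N(0, 1): the Gaussian integral gives x^2 / 2
        return 0.5 * x**2
    raise ValueError(f"unknown prior: {prior}")

def effective_energy(v, W, beta=1.0, prior="binary"):
    """Energy of the visible configuration v after marginalizing the hidden layer.

    W has shape (n_hidden, n_visible); row mu plays the role of pattern mu.
    The sqrt(beta / n_visible) coupling scale is an assumed convention.
    """
    n_visible = W.shape[1]
    x = np.sqrt(beta / n_visible) * (W @ v)
    return -np.sum(hidden_log_partition(x, prior)) / beta

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.choice([-1.0, 1.0], size=(3, 16))   # 3 random binary patterns
    v = W[0].copy()                             # visible layer aligned with pattern 0
    for prior in ("gaussian", "binary", "ternary"):
        print(prior, effective_energy(v, W, beta=2.0, prior=prior))
```

Only `hidden_log_partition` changes between priors, which is the sense in which the hidden prior determines the effective interactions among visible units. A ReLU-like prior could be added as another branch once its exact form is fixed.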
