High-dimensional manifold of solutions in neural networks: insights from statistical physics
Enrico M. Malatesta
TL;DR
This work surveys a statistical-physics view of high-dimensional neural network solution spaces, focusing on the perceptron with binary or spherical weights and the SAT/UNSAT transition at fixed α = P/N. It leverages the replica method to derive Gardner volumes and RS saddle-point equations, elucidating how the landscape shrinks and restructures as constraints accumulate. It further connects geometry to learning dynamics by analyzing local entropy (Franz-Parisi) and delineating algorithmic hardness via the Overlap Gap Property, while showing how high-entropy regions can enhance robustness and generalization. Finally, it examines linear mode connectivity between solutions, identifying regimes where straight paths of zero training error exist or fail and highlighting a kernel region that supports connectivity in the overparameterized regime. Together, these results illuminate the global shape of neural-network solution manifolds and their implications for optimization and generalization.
Abstract
In these pedagogic notes I review the statistical mechanics approach to neural networks, focusing on the paradigmatic example of the perceptron architecture with binary and continuous weights in the classification setting. I review Gardner's approach, based on the replica method, and the derivation of the SAT/UNSAT transition in the storage setting. Then I discuss some recent works that unveiled how the zero-training-error configurations are geometrically arranged, and how this arrangement changes as the size of the training set increases. I also illustrate how different regions of solution space can be explored analytically and how the landscape in the vicinity of a solution can be characterized. I give evidence that, in binary weight models, algorithmic hardness is a consequence of the disappearance of a clustered region of solutions that extends to very large distances. Finally, I demonstrate how the study of linear mode connectivity between solutions can give insights into the average shape of the solution manifold.
