Generative diffusion for perceptron problems: statistical physics analysis and efficient algorithms

Elizaveta Demyanenko, Davide Straziota, Carlo Baldassi, Carlo Lucibello

TL;DR

This work develops a replica-informed diffusion framework (Algorithmic Stochastic Localization, ASL) to assess and improve sampling from high-dimensional, unnormalized densities, focusing on random perceptron problems. By applying a double replica trick, the authors derive a time-dependent free entropy $\phi_t$ and identify algorithmic thresholds for successful sampling with AMP-driven diffusion. They show that the spherical perceptron can be efficiently sampled in most of the Replica Symmetric (RS) region, whereas uniform sampling of the binary perceptron remains intractable for diffusion. Introducing a diverging potential $U(s) = -\log(s)$ makes the corresponding tilted distribution efficiently samplable and, coupled with $\tau$-annealed MCMC, provides a robust practical sampler. The results show how the geometry of the solution space (RS vs. RSB) governs sampler performance and suggest concrete algorithmic strategies for hard constraint-satisfaction problems beyond perceptrons.

Abstract

We consider random instances of non-convex perceptron problems in the high-dimensional limit of a large number of examples $M$ and weights $N$, with finite load $\alpha = M/N$. We develop a formalism based on replica theory to predict the fundamental limits of efficiently sampling the solution space using generative diffusion algorithms, conjectured to be saturated when the score function is provided by Approximate Message Passing. For the spherical perceptron with negative margin $\kappa$, we find that the uniform distribution over solutions can be efficiently sampled in most of the Replica Symmetric region of the $\alpha$-$\kappa$ plane. In contrast, for binary weights, sampling from the uniform distribution remains intractable. A theoretical analysis of this obstruction leads us to identify a potential $U(s) = -\log(s)$, under which the corresponding tilted distribution becomes efficiently samplable via diffusion. Moreover, we show numerically that an annealing procedure over the shape of this potential yields a fast and robust Markov Chain Monte Carlo algorithm for sampling the solution space of the binary perceptron.
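To make the setup concrete, the following is a minimal sketch of a random perceptron instance, not the authors' code. It uses the stability definition $s^\mu = \langle \boldsymbol{w}, \boldsymbol{x}^\mu\rangle/\sqrt{N}$ given in the Figure 4 caption. Everything else is an assumption made for illustration: the i.i.d. Gaussian patterns, the constraint $s^\mu \ge \kappa$ as the definition of a solution, the choice to apply $U$ to the shifted stability $s^\mu - \kappa$, and all function names.

```python
import numpy as np


def random_instance(N, alpha, rng):
    """Draw M = round(alpha * N) patterns (assumption: i.i.d. standard Gaussian entries)."""
    M = int(round(alpha * N))
    return rng.standard_normal((M, N))


def stabilities(w, X):
    """Stabilities s^mu = <w, x^mu> / sqrt(N), one per pattern."""
    return X @ w / np.sqrt(X.shape[1])


def is_solution(w, X, kappa):
    """Assumed constraint: every stability is at least the margin kappa."""
    return bool(np.all(stabilities(w, X) >= kappa))


def log_potential_energy(w, X, kappa):
    """Sum over patterns of U(s^mu - kappa) with U(s) = -log(s).

    Assumption: U acts on the shifted stability and the energy is
    +inf whenever a constraint is violated or exactly saturated.
    """
    s = stabilities(w, X) - kappa
    if np.any(s <= 0):
        return float("inf")
    return float(-np.sum(np.log(s)))


rng = np.random.default_rng(0)
N, alpha, kappa = 200, 0.3, 0.0
X = random_instance(N, alpha, rng)

# Binary weights: entries in {-1, +1}.
w_binary = rng.choice([-1.0, 1.0], size=N)

# Spherical weights: a Gaussian vector rescaled to squared norm N.
w_sphere = rng.standard_normal(N)
w_sphere *= np.sqrt(N) / np.linalg.norm(w_sphere)

for name, w in [("binary", w_binary), ("spherical", w_sphere)]:
    print(name, is_solution(w, X, kappa), log_potential_energy(w, X, kappa))
```

A randomly drawn weight vector will almost never satisfy all constraints at this load, so the energy is typically infinite here. The potential diverges as any stability approaches the margin, which is the property the abstract refers to when it calls the tilted distribution samplable.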

Paper Structure

This paper contains 54 sections, 80 equations, 15 figures, 1 algorithm.

Figures (15)

  • Figure 1: Asymptotic analysis of ASL sampling for the Spherical Perceptron with uniform distribution. Left: Free entropy function $\phi_t(q)$ for different values of $t$ and for $\alpha=278, \kappa=-2.5$. Initially, $\phi_t(q)$ has a single maximum, but as $t$ increases, a second maximum appears, eventually becoming the global one. Center: Phase diagram of ASL in the $t$-vs-$\alpha$ plane for $\kappa=-2.5$. Green region: $\phi_t(q)$ has a single optimizer, meaning that AMP succeeds at denoising. Yellow region: $\phi_t(q)$ has two optimizers, but the global maximum corresponds to the smaller overlap $q$, so AMP still succeeds. Red region: $\phi_t(q)$ has two optimizers, and the global maximum corresponds to the larger overlap $q$. In this case, AMP fails at the denoising task. For ASL to succeed at sampling, a vertical line at the corresponding $\alpha$ should lie entirely in the green region. Right: Phase diagram delineating the samplable and non-samplable regions for ASL in the $\alpha$-vs-$\kappa$ plane. Transition lines predicted from replica theory are taken from baldassi2023typical. The green region can be sampled by ASL. The inset zooms in to show that ASL fails to reach the d1RSB line.
  • Figure 2: ASL sampling for the Binary Perceptron with $U(s)=-\log(s)$ potential. Left: Phase diagram of ASL in the $t$-vs-$\alpha$ plane for $\kappa=0$ and $T=0.5$. The color scheme is the same as for the central panel of Figure 1. Right: Empirical distribution of the stabilities $s^\mu$ for a configuration obtained by ASL in the case of the binary perceptron with the $\log$-potential, $N=5000$, $\kappa=0$, $T=0.5$, and $\alpha=0.3$. The black line is the asymptotic theoretical prediction. The excellent agreement shows that ASL produces fair samples from the target distribution.
  • Figure 3: Results for the Binary Perceptron problem, showing the probability of finding a solution as a function of the constraint density $\alpha$ for different system sizes $N$, after 100 sweeps of MCMC. Simulated Annealing on the temperature $T$ (left) is compared to our proposed, and much more effective, $\tau$-annealing scheme (right).
  • Figure 4: Distribution of the stabilities $s^\mu$ of a spherical perceptron with negative margin, where $s^{\mu}=\frac{\langle\boldsymbol{w}, \boldsymbol{x}^\mu\rangle}{\sqrt{N}}$. The number of variables is $N=1001$ and the margin is $\kappa=-2.1$, with $\alpha=5$ (Left), $\alpha=20$ (Center) and $\alpha=80$ (Right). Across these parameter regimes, the empirical distribution of the stabilities of the ASL samples (blue) coincides with the distribution computed through the replica method (black solid line).
  • Figure 5: Left: Free entropy $\phi_t(q)$ for the binary perceptron problem at several times. For all $t$, the curves have two local maxima: one at low $q$ that grows with $t$, and a persistent one at $q=1$. Eventually, the persistent one becomes global while the lower-$q$ one is still present, which leads to the failure of ASL. Right: Phase diagram of SL sampling for the binary perceptron. At all $\alpha$ the situation is like the one in the left panel: there is a region of $t$ where the lower-$q$ maximum is not the global maximum (red regions) and sampling fails, even though the lower-$q$ maximum eventually disappears (green regions).
  • ...and 10 more figures