A phase transition in sampling from Restricted Boltzmann Machines

Youngwoo Kwon, Qian Qin, Guanyang Wang, Yuchen Wei

TL;DR

This study proves a phase transition phenomenon in the mixing time of the Gibbs sampler for a one-parameter Restricted Boltzmann Machine and develops a new isoperimetric inequality for the sampler's stationary distribution by showing that the distribution is nearly log-concave.

Abstract

Restricted Boltzmann Machines are a class of undirected graphical models that play a key role in deep learning and unsupervised learning. In this study, we prove a phase transition phenomenon in the mixing time of the Gibbs sampler for a one-parameter Restricted Boltzmann Machine. Specifically, the mixing time varies logarithmically, polynomially, and exponentially with the number of vertices depending on whether the parameter $c$ is above, equal to, or below a critical value $c_\star\approx-5.87$. A key insight from our analysis is the link between the Gibbs sampler and a dynamical system, which we utilize to quantify the former based on the behavior of the latter. To study the critical case $c= c_\star$, we develop a new isoperimetric inequality for the sampler's stationary distribution by showing that the distribution is nearly log-concave.
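
To make the sampler in the abstract concrete, here is a minimal sketch of a block Gibbs chain that tracks only the fraction of active visible units. The abstract does not state the model's parameterization, so everything structural here is an assumption: $n$ visible and $n$ hidden binary units, every visible-hidden weight equal to $c/n$, and no bias terms. The function name `gibbs_fraction_chain` and its arguments are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Logistic function; adequate for the moderate values of c used here."""
    return 1.0 / (1.0 + math.exp(-z))


def gibbs_fraction_chain(c, n, steps, x0=0.5, seed=0):
    """Return the path of the fraction of active visible units.

    ASSUMPTION (not taken from the source): n visible and n hidden
    binary units, all weights equal to c / n, no biases. Under that
    assumption each layer is conditionally i.i.d. Bernoulli given the
    fraction of active units in the other layer.
    """
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(steps):
        # Hidden update: each unit is on with probability sigmoid(c * x).
        p_hidden = sigmoid(c * x)
        y = sum(rng.random() < p_hidden for _ in range(n)) / n
        # Visible update: each unit is on with probability sigmoid(c * y).
        p_visible = sigmoid(c * y)
        x = sum(rng.random() < p_visible for _ in range(n)) / n
        path.append(x)
    return path


if __name__ == "__main__":
    for c in (-3.0, -8.0):
        path = gibbs_fraction_chain(c, n=1000, steps=20)
        print(c, [round(x, 3) for x in path[-4:]])
```

Tracking a single fraction instead of all $2n$ units is a design shortcut that only works under the assumed symmetry. With unequal weights or biases, the full state would have to be stored.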

Paper Structure

This paper contains 30 sections, 29 theorems, 156 equations, 4 figures, 1 algorithm.

Key Result

Theorem 2

With all the settings described in the setup section (subsec:setup), let $x_{\star} \approx 1.278$ be the solution to the equation [...] and let [...]. Then each of the following holds:

Figures (4)

  • Figure 1: A restricted Boltzmann machine with six hidden nodes and six visible nodes
  • Figure 2: In the first row, $m_c^t(0.5)$ is plotted against $t$ for three values of $c$. The second row contains autocorrelation function plots for the Markov chain $(X_t)_t$ at three values of $c$.
  • Figure 3: The first row depicts the function $x \mapsto m_c^2(x)$ at three values of $c$, alongside the $45^{\circ}$ diagonal line that passes through the origin. The second row portrays the probability mass function $x \mapsto \pi_{c,n}(\{x\})$ at three values of $c$ with $n = 1000$.
  • Figure 4: Plot of the second derivative of $\log \omega_{25}(x)$ for $x \in [0.1,0.5]$.
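
The captions for Figures 2 and 3 refer to iterates $m_c^t$ of a map $m_c$, but this excerpt does not define it. As a hedged illustration only, the sketch below assumes the logistic form $m_c(x) = 1/(1+e^{-cx})$, which is not confirmed by the source. It iterates the map from $0.5$ and computes the slope at its fixed point. The helper names `m`, `iterate`, `fixed_point`, and `slope_at_fixed_point` are hypothetical. Under this assumed map, the slope at the fixed point crosses $-1$ near $c \approx -5.87$, which agrees with the critical value quoted in the abstract.

```python
import math


def m(c, x):
    # ASSUMED map (hypothetical, not from the source): logistic in c * x.
    return 1.0 / (1.0 + math.exp(-c * x))


def iterate(c, x0=0.5, steps=200):
    """Return [x0, m(x0), m(m(x0)), ...] with `steps` applications of m."""
    xs = [x0]
    for _ in range(steps):
        xs.append(m(c, xs[-1]))
    return xs


def fixed_point(c, tol=1e-12):
    """Solve x = m(c, x) on [0, 1] by bisection, for c < 0.

    For c < 0 the map is decreasing, so x - m(c, x) is increasing,
    negative at 0 and positive at 1, and has exactly one root.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid - m(c, mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def slope_at_fixed_point(c):
    # Derivative of the logistic map is c * m * (1 - m); at the fixed point m = x.
    x = fixed_point(c)
    return c * x * (1.0 - x)


if __name__ == "__main__":
    for c in (-3.0, -5.87, -8.0):
        xs = iterate(c)
        print(c, slope_at_fixed_point(c), xs[-2], xs[-1])
```

When the slope has magnitude below one, the iterates settle at the fixed point. When it exceeds one in magnitude, they alternate between two values, so the two-step map $m_c^2$ has more than one attracting point.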

Theorems & Definitions (53)

  • Remark 1
  • Theorem 2
  • Proposition 3
  • Lemma 4
  • Lemma 5
  • Lemma 6
  • Proposition 7
  • Proposition 8
  • proof
  • Lemma 9
  • ...and 43 more