Expressive equivalence of classical and quantum restricted Boltzmann machines

Maria Demidik, Cenk Tüysüz, Nico Piatkowski, Michele Grossi, Karl Jansen

TL;DR

This work introduces semi-quantum restricted Boltzmann machines (sqRBMs) as an intermediate model between classical RBMs and quantum RBMs, designed for efficient gradient computation on classical data. The sqRBM Hamiltonian is commuting on the visible subspace and non-commuting on the hidden units, which yields closed-form output probabilities and gradients and mitigates the gradient-cost issues of generic QRBMs. The authors prove expressive equivalence between sqRBMs and RBMs, $\mathrm{sqRBM}_{n,m} \equiv \mathrm{RBM}_{n,|\mathcal{W}_{\rm h}|\cdot m}$, where $|\mathcal{W}_{\rm h}|$ counts the Pauli operators acting on the hidden units. For $\mathcal{W}_{\rm h} = \{X, Y, Z\}$, an RBM therefore requires $3$ times as many hidden units as an sqRBM to represent the same distribution, while both models have the same total parameter count. Numerical experiments with up to 100 units corroborate the theory, showing competitive learning with reduced quantum-resource requirements and suggesting near-term practicality for quantum-assisted generative modeling. Overall, sqRBMs offer a concrete route to leverage quantum hardware for probabilistic modeling while curbing resource demands.

Abstract

Quantum computers offer the potential for efficiently sampling from complex probability distributions, attracting increasing interest in generative modeling within quantum machine learning. This surge in interest has driven the development of numerous generative quantum models, yet their trainability and scalability remain significant challenges. A notable example is a quantum restricted Boltzmann machine (QRBM), which is based on the Gibbs state of a parameterized non-commuting Hamiltonian. While QRBMs are expressive, their non-commuting Hamiltonians make gradient evaluation computationally demanding, even on fault-tolerant quantum computers. In this work, we propose a semi-quantum restricted Boltzmann machine (sqRBM), a model designed for classical data that mitigates the challenges associated with previous QRBM proposals. The sqRBM Hamiltonian is commuting in the visible subspace while remaining non-commuting in the hidden subspace. This structure allows us to derive closed-form expressions for both output probabilities and gradients. Leveraging these analytical results, we demonstrate that sqRBMs share a close relationship with classical restricted Boltzmann machines (RBM). Our theoretical analysis predicts that, to learn a given probability distribution, an RBM requires three times as many hidden units as an sqRBM, while both models have the same total number of parameters. We validate these findings through numerical simulations involving up to 100 units. Our results suggest that sqRBMs could enable practical quantum machine learning applications in the near future by significantly reducing quantum resource requirements.
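The parameter-count claim can be checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the sqRBM layout (one bias and $n$ weights per hidden unit per hidden-side Pauli operator, plus $n$ visible biases) is an assumption chosen to be consistent with the stated equivalence, not a formula quoted from the paper.

```python
def rbm_param_count(n: int, m: int) -> int:
    """Classical RBM: n visible biases + m hidden biases + n*m weights."""
    return n + m + n * m


def sqrbm_param_count(n: int, m: int, num_hidden_paulis: int = 3) -> int:
    """Assumed sqRBM layout (hypothetical): n visible biases, and for each
    of the num_hidden_paulis operators on each hidden unit, one bias and
    n visible-hidden weights."""
    return n + num_hidden_paulis * m + num_hidden_paulis * n * m


if __name__ == "__main__":
    n, m = 8, 4
    # sqRBM with {X, Y, Z} on the hidden units vs. an RBM with 3*m hidden units.
    print(sqrbm_param_count(n, m, 3), rbm_param_count(n, 3 * m))
```

Under this assumed layout, the two counts agree for any $n$ and $m$, and the same holds for a two-operator hidden set compared against an RBM with $2m$ hidden units.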

Paper Structure

This paper contains 21 sections, 8 theorems, 62 equations, 6 figures, 1 table.

Key Result

Proposition 1

A $\mathrm{QRBM}_{n,m}$ can be trained to minimize the negative log-likelihood with respect to the target probability distribution $q$ using the following gradient rule: where $\theta_i \in \boldsymbol{\theta}$ is any real-valued parameter of the model, when the Hamiltonian terms are grouped such that $H = \sum_i \theta_i H_i$ and $\boldsymbol{\theta} \in \{\boldsymbol{a}, \boldsymbol{b}, \boldsy

Figures (6)

  • Figure 1: Summary of main results. This work introduces semi-quantum restricted Boltzmann machines (sqRBM) as an intermediate model, satisfying the relation $\mathrm{QRBM} \supseteq \mathrm{sqRBM} \supseteq \mathrm{RBM}$. sqRBMs generalize RBMs by rendering the hidden units quantum through the use of non-commuting Hamiltonians. In Theorem 2, we show that $\mathrm{sqRBM}_{n,m} \equiv \mathrm{RBM}_{n,3m}$, where $n$ and $m$ denote the number of visible and hidden units, respectively, with both models having the same number of parameters. In pedestrian terms, RBMs require three times as many hidden units as sqRBMs to learn the same target distribution.
  • Figure 2: Connectivity graph of restricted Boltzmann machines (RBM). An RBM model has connections only between visible and hidden units. Lateral connections (e.g. visible to visible) are not permitted.
  • Figure 3: Training results. We train three models ($\mathrm{RBM}$, $\mathrm{sqRBM} \{X, Z\}$ and $\mathrm{sqRBM} \{X, Y, Z\}$) on four datasets with three different input sizes ($n \in \{8,10,12\}$) and various numbers of hidden units in the range $m \in [1,90]$. We report the total variation distance (TVD) measured after training each model 100 times with different initial parameters. The solid lines show the average, while the shaded regions indicate the standard deviation. Each column reports results for a different dataset, ordered by increasing difficulty from left to right. The target probability distribution for the $\mathcal{O}(n^2)$ dataset is varied for each run, using the same 100 seeds for all models. The same target probability distribution is used for the other datasets in all runs.
  • Figure 4: Minimum number of hidden units required to learn target probability distributions on average. We report the minimum number of hidden units $m$ required to achieve $\mathrm{TVD} < 0.2$ on average, over four datasets for various input sizes ($n \in \{6,8,10,12\}$) using three models ($\mathrm{RBM}$, $\mathrm{sqRBM} \{X, Z\}$ and $\mathrm{sqRBM} \{X, Y, Z\}$) as in Figure 3. In the bottom panel we provide the ratio of $m_\mathrm{RBM}$ to $m$ of the other models. The difficulty of each dataset results in a different scaling behavior. Recall that, according to Theorem 2, $\mathrm{RBM}$ has the same number of parameters and the same expressivity as $\mathrm{sqRBM} \{X, Z\}$ when $m_\mathrm{RBM} = 2m$, and as $\mathrm{sqRBM} \{X, Y, Z\}$ when $m_\mathrm{RBM} = 3m$.
  • Figure 5: Variance of gradients. We report the variance of the gradient with respect to the first parameter of each parameter type ($a_1$, $b_1$, $w_{1,1}$). Each row shows results for a different parameter type, while each column shows results for a different model. We report values for $n \in \{4,6,8,10,12\}$ and $m \in \{1,2,3,4,5,6\}$. On all panels the variances behave similarly and vary very little with system size, as can be seen from the overlapping lines. For models with $X$, $Y$ or $Z$ terms, we plot the results for each term with a different marker; these also overlap.
  • ...and 1 more figure
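The figure captions above use the total variation distance (TVD) as the training metric. As a minimal sketch, assuming the standard definition $\mathrm{TVD}(p, q) = \tfrac{1}{2}\sum_x |p(x) - q(x)|$ over the same outcome ordering (the summary does not state the normalization the authors use), it can be computed as follows; the function name is hypothetical.

```python
import numpy as np


def total_variation_distance(p, q) -> float:
    """Half the L1 distance between two discrete distributions.

    Assumes p and q are probability vectors indexed by the same outcomes.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if p.shape != q.shape:
        raise ValueError("p and q must have the same shape")
    return float(0.5 * np.abs(p - q).sum())


if __name__ == "__main__":
    target = np.array([0.5, 0.25, 0.25, 0.0])
    model = np.array([0.25, 0.25, 0.25, 0.25])
    print(total_variation_distance(target, model))
```

With this convention the value lies in $[0, 1]$: it is 0 for identical distributions and 1 for distributions with disjoint support, so the $\mathrm{TVD} < 0.2$ threshold in Figure 4 is a fraction of the maximum possible distance.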

Theorems & Definitions (20)

  • Definition 1: Restricted Boltzmann machine (RBM)
  • Definition 2: Quantum restricted Boltzmann machine (QRBM)
  • Proposition 1: Gradients of QRBM
  • Definition 3: Semi-quantum RBM
  • Proposition 2: Output probabilities of sqRBM
  • Proposition 3: Gradients of sqRBM
  • Definition 4: Semi-quantum BM
  • Proposition 4: Gradients of sqBM
  • Theorem 1: Equivalence of sqRBM hidden units to RBM multiple hidden units.
  • Theorem 2: Equivalence of sqRBM to RBM.
  • ...and 10 more
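For orientation on Definition 1, the sketch below computes the exact visible-unit distribution of a small classical RBM by enumeration. It rests on assumptions not confirmed by this summary: units take values $\pm 1$, the energy is $E(v, h) = -a \cdot v - b \cdot h - v^\top W h$, and the hidden units are summed out analytically as $\prod_j 2\cosh(b_j + (v^\top W)_j)$. The function name `rbm_visible_probs` is hypothetical, and the paper's own conventions may differ.

```python
import itertools

import numpy as np


def rbm_visible_probs(a, b, W):
    """Exact p(v) for a small RBM with +/-1 units (assumed convention).

    a: visible biases, shape (n,); b: hidden biases, shape (m,);
    W: visible-hidden weights, shape (n, m).
    Returns (configs, probs) where configs has shape (2**n, n).
    """
    a, b, W = (np.asarray(x, dtype=float) for x in (a, b, W))
    n = a.shape[0]
    # Enumerate all 2**n visible configurations.
    configs = np.array(list(itertools.product([-1, 1], repeat=n)), dtype=float)
    # Effective field on each hidden unit for each visible configuration.
    field = b + configs @ W
    # Hidden units summed out in closed form: prod_j 2*cosh(field_j).
    log_unnorm = configs @ a + np.log(2.0 * np.cosh(field)).sum(axis=1)
    log_unnorm -= log_unnorm.max()  # numerical stability before exponentiating
    probs = np.exp(log_unnorm)
    return configs, probs / probs.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 3, 2
    configs, probs = rbm_visible_probs(
        rng.normal(size=n), rng.normal(size=m), rng.normal(size=(n, m))
    )
    for v, p in zip(configs, probs):
        print(v.astype(int), round(float(p), 4))
```

Enumeration scales as $2^n$, so this is only practical for small $n$, such as the input sizes quoted in the figure captions.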