Quantum Boltzmann Machine

Mohammad H. Amin, Evgeny Andriyash, Jason Rolfe, Bohdan Kulchytskyy, Roger Melko

TL;DR

This work proposes a Quantum Boltzmann Machine (QBM) that models data with the quantum Boltzmann distribution of a transverse-field Ising Hamiltonian. The core training difficulty is that $H$ and $\partial_\theta H$ do not commute, so the log-likelihood gradient cannot be estimated directly by sampling. To enable efficient training, the authors bound the log-likelihood using the Golden-Thompson inequality and optimize the bound $\tilde{\mathcal L}$ by gradient-based methods, an approach called bound-based QBM (bQBM). They also describe a restricted version (RQBM) that allows exact positive-phase calculations. In small-scale experiments on fully visible, restricted, and generative supervised settings, QBM and bQBM outperform classical Boltzmann machines on synthetic multi-modal data. The supervised experiments also show that sampling with the input qubits clamped gives a distribution that differs substantially from the true conditional distribution. Finally, the authors discuss training QBMs on quantum annealers, noting both the potential for approximate quantum Boltzmann sampling and the practical limitations imposed by annealing dynamics and freeze-out behavior.
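For readers unfamiliar with the bound, the Golden-Thompson inequality states that $\mathrm{Tr}[e^{A} e^{B}] \ge \mathrm{Tr}[e^{A+B}]$ for Hermitian $A$ and $B$. The sketch below shows how it can turn the marginal probability of a visible state $v$ into a quantity built from a single clamped Hamiltonian. The notation is an assumption introduced here and does not appear in the summary above: $\Lambda_v$ is the projector onto visible state $v$, $H_v = H - \ln \Lambda_v$ is the clamped Hamiltonian, and $\ln \Lambda_v$ is understood as a limit that assigns an infinite energy penalty to visible states other than $v$.

```latex
\begin{align}
P_v &= \frac{\mathrm{Tr}\!\left[\Lambda_v e^{-H}\right]}{\mathrm{Tr}\!\left[e^{-H}\right]}
     = \frac{\mathrm{Tr}\!\left[e^{-H} e^{\ln \Lambda_v}\right]}{\mathrm{Tr}\!\left[e^{-H}\right]}
     \;\ge\; \frac{\mathrm{Tr}\!\left[e^{-H + \ln \Lambda_v}\right]}{\mathrm{Tr}\!\left[e^{-H}\right]}
     = \frac{\mathrm{Tr}\!\left[e^{-H_v}\right]}{\mathrm{Tr}\!\left[e^{-H}\right]}, \\
\mathcal{L} &= -\sum_v P_v^{\text{data}} \ln P_v
     \;\le\; \tilde{\mathcal{L}}
     = -\sum_v P_v^{\text{data}} \ln \frac{\mathrm{Tr}\!\left[e^{-H_v}\right]}{\mathrm{Tr}\!\left[e^{-H}\right]} .
\end{align}
```

Because $\tilde{\mathcal L}$ involves only Boltzmann averages under $H_v$ and $H$, its gradient can in principle be estimated by sampling, which is what makes minimizing the bound tractable.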

Abstract

Inspired by the success of Boltzmann Machines based on classical Boltzmann distribution, we propose a new machine learning approach based on quantum Boltzmann distribution of a transverse-field Ising Hamiltonian. Due to the non-commutative nature of quantum mechanics, the training process of the Quantum Boltzmann Machine (QBM) can become nontrivial. We circumvent the problem by introducing bounds on the quantum probabilities. This allows us to train the QBM efficiently by sampling. We show examples of QBM training with and without the bound, using exact diagonalization, and compare the results with classical Boltzmann training. We also discuss the possibility of using quantum annealing processors like D-Wave for QBM training and application.
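As an illustration of what training "using exact diagonalization" involves, the following minimal sketch builds a small fully visible transverse-field Ising Hamiltonian and computes the probabilities of the computational-basis states. The Hamiltonian form $H = -\sum_a \Gamma \sigma^x_a - \sum_a b_a \sigma^z_a - \sum_{a<c} w_{ac} \sigma^z_a \sigma^z_c$, the single shared transverse field `gamma`, and all function names are assumptions made for this example, not the authors' code.

```python
import numpy as np

I2 = np.eye(2)
SX = np.array([[0.0, 1.0], [1.0, 0.0]])   # Pauli x
SZ = np.array([[1.0, 0.0], [0.0, -1.0]])  # Pauli z


def site_op(op, a, n):
    """Embed a single-qubit operator at site a of an n-qubit register."""
    out = np.array([[1.0]])
    for k in range(n):
        out = np.kron(out, op if k == a else I2)
    return out


def tfim_hamiltonian(gamma, b, w):
    """Dense transverse-field Ising Hamiltonian (assumed form, see lead-in)."""
    n = len(b)
    H = np.zeros((2**n, 2**n))
    for a in range(n):
        H -= gamma * site_op(SX, a, n) + b[a] * site_op(SZ, a, n)
        for c in range(a + 1, n):
            H -= w[a, c] * site_op(SZ, a, n) @ site_op(SZ, c, n)
    return H


def qbm_probs(H):
    """P_v = <v| e^{-H} |v> / Tr[e^{-H}] for a fully visible model."""
    evals, evecs = np.linalg.eigh(H)
    weights = np.exp(-(evals - evals.min()))  # shift for numerical stability
    diag = (evecs**2) @ weights               # diagonal of e^{-H} in the z basis
    return diag / weights.sum()


rng = np.random.default_rng(0)
n = 3
b = rng.normal(size=n)
w = np.triu(rng.normal(size=(n, n)), 1)      # couplings for a < c only

p_classical = qbm_probs(tfim_hamiltonian(0.0, b, w))  # gamma = 0: classical BM
p_quantum = qbm_probs(tfim_hamiltonian(2.0, b, w))    # gamma > 0: quantum model
```

With `gamma = 0` the Hamiltonian is diagonal and the result reduces to the classical Boltzmann distribution. The dense matrices grow as $2^n \times 2^n$, which is why this approach is limited to small models.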

Paper Structure

This paper contains 13 sections, 57 equations, 4 figures.

Figures (4)

  • Figure 1: (a) An example of a quantum Boltzmann machine with visible (blue) and hidden (red) qubits. (b) A restricted quantum Boltzmann machine with no lateral connections between the hidden variables. (c) Discriminative learning with QBM. The (green) squares represent the classical inputs ${\bf x}$, which are not necessarily binary. The inputs apply energy biases to the hidden and output qubits according to the coupling coefficients represented by solid lines.
  • Figure 2: Training of a fully visible, fully connected model with $N=10$ qubits on artificial data from the Bernoulli mixture model (\ref{eq:mixture_model}). Training uses the second-order optimization routine BFGS. (a) KL divergence (\ref{eq:KL_def}) of the BM, QBM, and bQBM models during training. Both QBM and bQBM reach lower KL values than BM. (b) Classical and quantum average energies (\ref{eq:cl_q_energies}) during training.
  • Figure 3: Training of restricted models with 8 visible and 2 hidden units on artificial data from the Bernoulli mixture model (\ref{eq:mixture_model}) using a second-order optimization routine. (a) KL divergence (\ref{eq:KL_def}) of the different models during training. Again QBM and bQBM outperform BM, but when the positive phase was calculated classically in bQBM, the performance (bQBM-CE curve in the figure) deteriorated and became worse than that of BM (see Section \ref{subsec:RBM} for details). (b) Classical and quantum average energies (\ref{eq:cl_q_energies}) during training.
  • Figure 4: Supervised learning using a fully visible, fully connected model with $N=11$ qubits divided into 8 inputs and 3 outputs. The training data are artificial data from the Bernoulli mixture model (\ref{eq:mixture_model}) for the inputs and 3-bit binary labels (0 to 7) for the outputs. Training uses the second-order optimization routine BFGS. (a) KL divergence of the joint distribution (\ref{eq:L_gen}) for the BM and QBM models during training. Once again QBM learns the distribution better than BM. (b) KL divergence of the conditional distribution during the same training, for the BM and QBM models using (\ref{eq:L_discr}), and for the clamped QBM (QBM-clamped) using (\ref{eq:conditional_prob_clamped}). The conditional distribution is also learned better by QBM than by BM, but the clamped QBM distribution is very different from the conditional one and gives a KL divergence much higher than that of the classical BM.
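
The figure captions refer to a Bernoulli mixture model as the data source and to the KL divergence as the training metric. The sketch below shows one way these two quantities could be computed for a small number of bits. The mixture form (equal-weight modes, with probability `p` that each bit matches the mode center), the values `M = 8` and `p = 0.9`, and the function names are assumptions chosen for illustration, since the referenced equations are not reproduced in this summary.

```python
import itertools
import numpy as np


def bernoulli_mixture(centers, p):
    """P(v) = (1/M) * sum_k p^(N - d_k(v)) * (1 - p)^(d_k(v)),
    where d_k(v) is the Hamming distance from v to mode center k."""
    centers = np.asarray(centers)
    M, N = centers.shape
    # All 2^N bit strings, one per row.
    states = np.array(list(itertools.product([0, 1], repeat=N)))
    # Hamming distance of every state to every center, shape (2^N, M).
    d = (states[:, None, :] != centers[None, :, :]).sum(axis=2)
    return (p ** (N - d) * (1 - p) ** d).mean(axis=1)


def kl_divergence(p_data, p_model):
    """KL(p_data || p_model) = sum_v p_data(v) * ln(p_data(v) / p_model(v))."""
    mask = p_data > 0
    return float(np.sum(p_data[mask] * np.log(p_data[mask] / p_model[mask])))


rng = np.random.default_rng(0)
centers = rng.integers(0, 2, size=(8, 10))   # M = 8 random modes over N = 10 bits
p_data = bernoulli_mixture(centers, p=0.9)

# Baseline: how far the data distribution is from a uniform model.
p_uniform = np.full_like(p_data, 1.0 / p_data.size)
kl_uniform = kl_divergence(p_data, p_uniform)
```

A trained model's distribution over the same $2^N$ states can be passed as `p_model` to track the KL divergence during training, as plotted in panels (a) of Figures 2 to 4.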