Generative training of quantum Boltzmann machines with hidden units
Nathan Wiebe, Leonard Wossnig
TL;DR
This work tackles the challenge of fully quantum generative training for quantum Boltzmann machines (QBMs) that include hidden units by formulating the training objective as the quantum relative entropy and proposing two complementary ways to obtain its gradient. The first method uses a variational upper bound for a restricted class of QBMs whose Hamiltonian terms on the hidden units mutually commute, which yields tractable gradient expressions. The second method is a general gradient estimation framework that combines Fourier-like approximations of the logarithm with high-order divided differences to approximate the exact gradient, supported by quantum subroutines such as Gibbs-state preparation, amplitude estimation, and the Hadamard test. Together, these approaches establish conditions under which QBMs can be trained on quantum devices and provide complexity analyses, while also outlining open problems in optimal Gibbs-state preparation and potential lower bounds. The result is a route to quantum generative modeling with hidden units and nontrivial quantum correlations.
Abstract
In this article, we provide a method for fully quantum generative training of quantum Boltzmann machines with both visible and hidden units while using quantum relative entropy as an objective. This is significant because prior methods were not able to do so due to mathematical challenges posed by the gradient evaluation. We present two novel methods for solving this problem. The first addresses it, for a class of restricted quantum Boltzmann machines with mutually commuting Hamiltonians on the hidden units, by using a variational upper bound on the quantum relative entropy. The second uses high-order divided difference methods and linear combinations of unitaries to approximate the exact gradient of the relative entropy for a generic quantum Boltzmann machine. Both methods are efficient under the assumption that Gibbs state preparation is efficient and that the Hamiltonians are given by sparse row-computable matrices.
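
To make the objective concrete, the following is a minimal classical sketch (not the quantum algorithm from the article) that evaluates the quantum relative entropy between a data state and the visible-unit marginal of a Gibbs state for a toy system. It assumes the objective has the form S(rho || Tr_h[e^{-H} / Tr e^{-H}]) with the visible units ordered before the hidden units in the tensor product; the function names, the random Hamiltonian, and the example data state are illustrative assumptions only.

```python
import numpy as np

def gibbs_state(H):
    """Return e^{-H} / Tr(e^{-H}) for a Hermitian matrix H."""
    w, V = np.linalg.eigh(H)
    p = np.exp(-(w - w.min()))  # shift eigenvalues for numerical stability
    p /= p.sum()
    return (V * p) @ V.conj().T

def trace_out_hidden(sigma, dim_v, dim_h):
    """Partial trace over the hidden subsystem (assumed ordering: visible x hidden)."""
    s = sigma.reshape(dim_v, dim_h, dim_v, dim_h)
    return np.einsum("ihjh->ij", s)

def hermitian_log(A):
    """Matrix logarithm of a positive definite Hermitian matrix."""
    w, V = np.linalg.eigh(A)
    return (V * np.log(w)) @ V.conj().T

def relative_entropy(rho, sigma_v):
    """S(rho || sigma_v) = Tr[rho log rho] - Tr[rho log sigma_v]."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]  # convention: 0 log 0 = 0
    tr_rho_log_rho = np.sum(w * np.log(w))
    tr_rho_log_sigma = np.trace(rho @ hermitian_log(sigma_v)).real
    return float(tr_rho_log_rho - tr_rho_log_sigma)

# Toy example (hypothetical data): one visible qubit and one hidden qubit.
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (A + A.conj().T) / 2                   # random Hermitian Hamiltonian
sigma_v = trace_out_hidden(gibbs_state(H), dim_v=2, dim_h=2)
rho = np.diag([0.7, 0.3]).astype(complex)  # illustrative data state
loss = relative_entropy(rho, sigma_v)
print(loss)
```

This brute-force evaluation scales exponentially with the number of units and only illustrates what quantity is being minimized; the methods summarized above are concerned with estimating its gradient on a quantum device.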
