Structured quantum learning via em algorithm for Boltzmann machines

Takeshi Kimura, Kohtaro Kato, Masahito Hayashi

TL;DR

The paper addresses the main training bottleneck of quantum Boltzmann machines (QBMs), the barren plateau problem, by introducing a quantum em algorithm and applying it to semi-quantum restricted Boltzmann machines (sqRBMs), in which quantum effects are confined to the hidden layer. Training is framed as alternating e- and m-projections between an exponential family (the model manifold) and a mixture family (the data manifold) of quantum states, which gives a tractable e-step and a convex m-step and avoids gradient-based optimization of a non-convex objective. On four benchmark datasets, the em algorithm reaches a lower final KL divergence than gradient descent on three (A, B, and D), while gradient descent does better on dataset C; in the sqRBM setting, Gibbs-state sampling is polynomial-time and the updates are closed-form. The work offers a principled, architecture-aware route to quantum generative modeling that combines information geometry with quantum state learning, and it points toward faster convergence and extension to broader classes of QBMs.

Abstract

Quantum Boltzmann machines (QBMs) are generative models with potential advantages in quantum machine learning, yet their training is fundamentally limited by the barren plateau problem, where gradients vanish exponentially with system size. We introduce a quantum version of the em algorithm, an information-geometric generalization of the classical Expectation-Maximization method, which circumvents gradient-based optimization on non-convex functions. Implemented on a semi-quantum restricted Boltzmann machine (sqRBM) -- a hybrid architecture with quantum effects confined to the hidden layer -- our method achieves stable learning and outperforms gradient descent on multiple benchmark datasets. These results establish a structured and scalable alternative to gradient-based training in QML, offering a pathway to mitigate barren plateaus and enhance quantum generative modeling.
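
As a rough guide to what "alternating projections" means here, the two steps can be written as nested minimizations of the quantum relative entropy. The notation below (the symbols \rho, \sigma_\theta, F_i, \psi, \mathcal{D}, \mathcal{M}) is generic information-geometry notation assumed for illustration, not necessarily the notation used in the paper.

```latex
% Quantum relative entropy between states \rho and \sigma
D(\rho \,\|\, \sigma) = \operatorname{Tr}\,\rho\,(\log\rho - \log\sigma)

% Model manifold (exponential family) with assumed generators F_i
\mathcal{M} = \Big\{ \sigma_\theta = \exp\Big(\textstyle\sum_i \theta_i F_i - \psi(\theta)\Big) \Big\},
\qquad
\psi(\theta) = \log \operatorname{Tr} \exp\Big(\textstyle\sum_i \theta_i F_i\Big)

% e-step: project the current model state onto the data manifold \mathcal{D}
\rho^{(t)} = \operatorname*{arg\,min}_{\rho \in \mathcal{D}} D(\rho \,\|\, \sigma_{\theta^{(t)}})

% m-step: project the result back onto the model manifold \mathcal{M}
\theta^{(t+1)} = \operatorname*{arg\,min}_{\theta} D(\rho^{(t)} \,\|\, \sigma_\theta)

% For fixed \rho the m-step objective is convex in \theta, since \psi is convex:
D(\rho \,\|\, \sigma_\theta) = \operatorname{Tr}\,\rho\log\rho - \sum_i \theta_i \operatorname{Tr}(\rho F_i) + \psi(\theta)

% Each full iteration cannot increase the divergence:
D(\rho^{(t+1)} \,\|\, \sigma_{\theta^{(t+1)}})
\le D(\rho^{(t)} \,\|\, \sigma_{\theta^{(t+1)}})
\le D(\rho^{(t)} \,\|\, \sigma_{\theta^{(t)}})
```

The last chain of inequalities is the reason such a scheme can be described as stable: each step is itself a minimization, so the objective is non-increasing without any step-size tuning on a non-convex landscape.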

Paper Structure

This paper contains 13 sections, 60 equations, 3 figures, 1 algorithm.

Figures (3)

  • Figure 1: Conceptual overview of our learning framework. This figure illustrates the em algorithm, an information-geometric generalization of the EM algorithm, proposed as an alternative approach to training QBMs. The algorithm iteratively minimizes the divergence between an exponential family (model manifold) and a mixture family (data manifold) via alternating projections. For demonstration, we apply the method to sqRBMs, where quantum effects are confined to the hidden layer.
  • Figure 2: Performance comparison between the em algorithm and the GD method. We train each model for a fixed number of epochs, where one epoch corresponds to a complete pass through the training data. The plot shows the final KL divergence values achieved by each algorithm on four different datasets (A, B, C, and D). Each point represents the average over 100 independent training runs. The em algorithm outperforms GD on datasets A, B, and D, while GD yields better results on dataset C.
  • Figure 3: Performance comparison of the em algorithm for sqRBM and RBM. We train each model for a fixed number of epochs, where one epoch corresponds to a complete pass through the training data. The plot shows the final KL divergence values achieved by each model on four different datasets (A, B, C, and D). Each point represents the average over 100 independent training runs. The sqRBM trained with the em algorithm outperforms the RBM in all but one case.
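
The alternating-projection loop described in the Figure 1 caption, and the KL-divergence metric reported in Figures 2 and 3, can be illustrated with a purely classical toy. The sketch below is not the paper's sqRBM method: it assumes a tiny classical RBM with binary units, enumerates all states exactly instead of sampling, and replaces the m-step with a few gradient steps on its convex objective. All names (`all_states`, `joint`, `kl`, `em_train`) and hyperparameters are hypothetical.

```python
import itertools
import numpy as np


def all_states(n):
    """Enumerate all 2**n binary vectors as rows of a float array."""
    return np.array(list(itertools.product([0, 1], repeat=n)), dtype=float)


def joint(a, b, W, V, H):
    """Exact RBM joint P(v, h) as a table of shape (2**n_v, 2**n_h)."""
    logits = (V @ a)[:, None] + (H @ b)[None, :] + V @ W @ H.T
    P = np.exp(logits - logits.max())  # subtract max for numerical stability
    return P / P.sum()


def kl(q, p):
    """KL divergence D(q || p) between two discrete distributions."""
    mask = q > 0
    return float(np.sum(q[mask] * np.log(q[mask] / p[mask])))


def em_train(q_data, n_h, iters=30, m_steps=10, lr=0.1, seed=0):
    """Alternate e- and m-steps; return parameters and the KL history."""
    rng = np.random.default_rng(seed)
    n_v = int(round(np.log2(q_data.size)))
    V, H = all_states(n_v), all_states(n_h)
    a, b = np.zeros(n_v), np.zeros(n_h)
    W = 0.1 * rng.standard_normal((n_v, n_h))
    history = []
    for _ in range(iters):
        P = joint(a, b, W, V, H)
        history.append(kl(q_data, P.sum(axis=1)))
        # e-step: closest joint distribution whose visible marginal is q_data,
        # i.e. Q(v, h) = q_data(v) * P(h | v).
        Q = q_data[:, None] * P / P.sum(axis=1, keepdims=True)
        # m-step (approximate): gradient steps on the convex objective
        # D(Q || P_theta); the gradient is a difference of expected statistics.
        for _ in range(m_steps):
            diff = Q - joint(a, b, W, V, H)
            a += lr * (V.T @ diff.sum(axis=1))
            b += lr * (H.T @ diff.sum(axis=0))
            W += lr * (V.T @ diff @ H)
    history.append(kl(q_data, joint(a, b, W, V, H).sum(axis=1)))
    return (a, b, W), history


# Toy target: a random distribution over 3 visible bits (an assumption).
q_data = np.random.default_rng(1).dirichlet(np.ones(8))
_, history = em_train(q_data, n_h=2)
print(f"KL before: {history[0]:.4f}, after: {history[-1]:.4f}")
```

Because the e-step is an exact minimization and each m-step gradient update uses a small step on a convex objective, the recorded KL divergence should not increase from one iteration to the next. That is the classical counterpart of the stability property the paper attributes to the em approach.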