Density Operator Expectation Maximization

Adit Vishnu, Abhay Shastry, Dhruva Kashyap, Chiranjib Bhattacharyya

TL;DR

This work develops Density Operator Latent Variable Models (DO-LVMs) and a principled EM-like training framework called DO-EM, leveraging the Petz recovery map to perform the E-step in the density-operator setting. It introduces Classical-Quantum LVMs (CQ-LVMs) that combine classical-visible data with quantum latent states, and proves that DO-EM yields guaranteed log-likelihood ascent under suitable conditions via a quantum ELBO (QELBO) and information-projection arguments. The framework enables tractable training of quantum Boltzmann machine variants (e.g., QiDBM and QGRBM) using Contrastive Divergence to approximate the M-step, showing substantial generative performance gains over classical counterparts on MNIST-scale datasets. Empirical results demonstrate that DO-EM accelerates training by orders of magnitude relative to gradient-based methods while achieving competitive or superior FID scores, supporting the practical viability of density-operator models for real-world data with classical hardware. The work also lays groundwork for future quantum-hardware implementations by linking EM steps to recoverability maps and quantum information projections.

Abstract

Machine learning with density operators, the mathematical foundation of quantum mechanics, is gaining prominence with rapid advances in quantum computing. Generative models based on density operators cannot yet handle tasks that are routinely handled by probabilistic models. The progress of latent variable models, a broad and influential class of probabilistic unsupervised models, was driven by the Expectation-Maximization framework. Deriving such a framework for density operators is challenging due to the non-commutativity of operators. To tackle this challenge, an inequality arising from the monotonicity of relative entropy is demonstrated to serve as an evidence lower bound for density operators. A minorant-maximization perspective on this bound leads to Density Operator Expectation Maximization (DO-EM), a general framework for training latent variable models defined through density operators. Through an information-geometric argument, the Expectation step in DO-EM is shown to be the Petz recovery map. The DO-EM algorithm is applied to Quantum Restricted Boltzmann Machines, adapting Contrastive Divergence to approximate the Maximization step gradient. Quantum interleaved Deep Boltzmann Machines and Quantum Gaussian-Bernoulli Restricted Boltzmann Machines, new models introduced in this work, outperform their probabilistic counterparts on generative tasks when trained with similar computational resources and identical hyperparameters.
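To make the role of the Petz recovery map more concrete, the following is a minimal NumPy sketch for one special case: the channel is assumed to be the partial trace over a latent subsystem, so the recovery map lifts a state on the visible subsystem back to a joint visible-latent state. The function names (`psd_power`, `partial_trace_latent`, `petz_recovery`, `random_density`), the tensor ordering (visible first, latent second), and the numerical tolerance are illustrative assumptions, not taken from the paper, and this is not the authors' implementation of the DO-EM E-step.

```python
import numpy as np


def psd_power(mat, power, tol=1e-12):
    """Power of a Hermitian PSD matrix, taken only on its support."""
    vals, vecs = np.linalg.eigh(mat)
    out = np.zeros_like(vals)
    mask = vals > tol  # ignore (numerically) zero eigenvalues
    out[mask] = vals[mask] ** power
    # Scale eigenvector columns by the transformed eigenvalues.
    return (vecs * out) @ vecs.conj().T


def partial_trace_latent(rho, dim_v, dim_l):
    """Trace out the latent factor of a state on H_V (x) H_L."""
    return np.einsum("ikjk->ij", rho.reshape(dim_v, dim_l, dim_v, dim_l))


def petz_recovery(sigma_v, rho, dim_v, dim_l):
    """Petz map for the partial-trace channel with reference state rho.

    Assumes the channel N = Tr_L, whose adjoint is X -> X (x) I_L.
    """
    rho_v = partial_trace_latent(rho, dim_v, dim_l)
    inv_sqrt = psd_power(rho_v, -0.5)
    sqrt_rho = psd_power(rho, 0.5)
    lifted = np.kron(inv_sqrt @ sigma_v @ inv_sqrt, np.eye(dim_l))
    return sqrt_rho @ lifted @ sqrt_rho


def random_density(dim, rng):
    """Random full-rank density matrix (hypothetical test helper)."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim_v, dim_l = 2, 2
    rho = random_density(dim_v * dim_l, rng)  # joint reference state
    sigma_v = random_density(dim_v, rng)      # state on the visible system

    rho_v = partial_trace_latent(rho, dim_v, dim_l)
    # The recovery map sends the marginal of rho back to rho itself.
    print(np.allclose(petz_recovery(rho_v, rho, dim_v, dim_l), rho))
    # For a full-rank marginal, the output keeps unit trace.
    print(np.trace(petz_recovery(sigma_v, rho, dim_v, dim_l)).real)
```

In the commuting (diagonal) case this construction reduces to multiplying a visible distribution by the model's conditional distribution over latents, which is the familiar classical E-step; that correspondence is the intuition behind using it here.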

Paper Structure

This paper contains 49 sections, 19 theorems, 105 equations, 4 figures, 2 tables, 2 algorithms.

Key Result

Corollary 10

The Petz recovery map $\mathcal{R}_{\mathcal{N},\rho}:\mathcal{T}(\mathcal{H}_B)\to\mathcal{T}(\mathcal{H}_A)$ with respect to a CPTP map $\mathcal{N}:\mathcal{T}(\mathcal{H}_A)\to\mathcal{T}(\mathcal{H}_B)$ and a positive semi-definite operator $\rho$ in $\mathcal{T}(\mathcal{H}_A)$ is a CPTP map i
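For orientation, the standard textbook form of the Petz recovery map and the monotonicity inequality mentioned in the abstract can be written as follows. This is a sketch in commonly used notation and may differ from the paper's Definition 9 in conventions: $\mathcal{N}^{\dagger}$ is assumed to denote the adjoint of $\mathcal{N}$ with respect to the Hilbert-Schmidt inner product, $\omega$ is an arbitrary state on $\mathcal{H}_A$ introduced here for illustration, $D(\cdot\,\|\,\cdot)$ is the quantum relative entropy, and inverses are taken on the support of $\mathcal{N}(\rho)$.

```latex
\begin{align}
  \mathcal{R}_{\mathcal{N},\rho}(\sigma)
    &= \rho^{1/2}\,
       \mathcal{N}^{\dagger}\!\left(
         \mathcal{N}(\rho)^{-1/2}\,\sigma\,\mathcal{N}(\rho)^{-1/2}
       \right)\rho^{1/2},
    \qquad \sigma \in \mathcal{T}(\mathcal{H}_B), \\
  \mathcal{R}_{\mathcal{N},\rho}\bigl(\mathcal{N}(\rho)\bigr)
    &= \rho, \\
  D(\omega \,\|\, \rho)
    &\ge D\bigl(\mathcal{N}(\omega) \,\|\, \mathcal{N}(\rho)\bigr).
\end{align}
```

The second line states that the map perfectly recovers the reference state $\rho$ from its image under $\mathcal{N}$; the third is the monotonicity (data-processing) inequality for relative entropy under a CPTP map.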

Figures (4)

  • Figure 1: (a) DO-EM vs. Projective log-likelihood gradient-based training (PGT) with exact computation on a mixture-of-Bernoulli data set. (b, c) QiDBM vs. DBM on MNIST with CD.
  • Figure 2: Generated samples during training: DBM (left) QiDBM (right).
  • Figure 3: Fashion MNIST
  • Figure 4: CelebA-32

Theorems & Definitions (34)

  • Definition 1: Information Projection
  • Definition 2: Density Operator
  • Definition 3: Faithful Density Operator
  • Definition 4: Classical-Quantum State
  • Definition 5: Projective Measurement
  • Definition 6: CPTP map
  • Definition 7: Partial Trace
  • Definition 8: Conditional Amplitude Operator
  • Definition 9: Petz Recovery Map
  • Corollary 10
  • ...and 24 more