Variational Inference for Quantum HyperNetworks

Luca Nepote, Alix Lhéritier, Nicolas Bondoux, Marios Kountouris, Maurizio Filippone

TL;DR

This work links Quantum HyperNetworks with Bayesian inference by deriving an explicit ELBO and a surrogate SELBO to train binary-weight networks via variational principles. By mapping BiNN weights to quantum circuit outcomes and using either full distribution access or implicit-sample distributions, the approach provides principled regularization that improves trainability and generalization over standard MLE. Empirical results on simple toy datasets show SELBO can yield higher accuracy and smoother optimization, suggesting practical benefits for quantum-inspired training of low-precision networks. The framework sets the stage for future hardware validation, scalability, and exploration of alternative divergences in quantum variational inference.

Abstract

Binary Neural Networks (BiNNs), which employ single-bit precision weights, have emerged as a promising solution to reduce memory usage and power consumption while maintaining competitive performance in large-scale systems. However, training BiNNs remains a significant challenge due to the limitations of conventional training algorithms. Quantum HyperNetworks offer a novel paradigm for enhancing the optimization of BiNNs by leveraging quantum computing. Specifically, a Variational Quantum Algorithm is employed to generate binary weights through quantum circuit measurements, while key quantum phenomena such as superposition and entanglement facilitate the exploration of a broader solution space. In this work, we establish a connection between this approach and Bayesian inference in two ways: we derive the Evidence Lower Bound (ELBO) for the case where direct access to the output distribution is available (i.e., in simulations), and we introduce a surrogate ELBO based on the Maximum Mean Discrepancy (MMD) metric for scenarios involving implicit distributions, as commonly encountered in practice. Our experimental results demonstrate that the proposed methods outperform standard Maximum Likelihood Estimation (MLE), improving trainability and generalization.
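The two objectives named in the abstract can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' implementation: the uniform prior over bitstrings, the Gaussian kernel, the `reg` weight, and all function names (`kl_to_uniform`, `elbo`, `mmd2`, `selbo`) are assumptions introduced here. The first objective uses the full circuit output distribution `q`; the second uses only sampled bitstrings.

```python
import numpy as np

def kl_to_uniform(q):
    """KL(q || uniform) for a distribution q over all bitstrings (assumed prior)."""
    q = np.asarray(q, dtype=float)
    nz = q > 0  # treat 0 * log 0 as 0
    # KL = sum_w q(w) log q(w) + log(number of bitstrings); the second term is constant.
    return float(np.sum(q[nz] * np.log(q[nz])) + np.log(q.size))

def elbo(q, log_lik):
    """Expected log-likelihood under q minus KL(q || uniform prior)."""
    return float(np.dot(q, log_lik)) - kl_to_uniform(q)

def mmd2(x, y, bandwidth=1.0):
    """Biased estimate of squared MMD between sample sets x and y (Gaussian kernel, assumed)."""
    def gram(a, b):
        d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
        return np.exp(-d2 / (2.0 * bandwidth ** 2))
    return float(gram(x, x).mean() + gram(y, y).mean() - 2.0 * gram(x, y).mean())

def selbo(weight_samples, log_lik_samples, prior_samples, reg=1.0):
    """Sample-based surrogate: mean log-likelihood minus reg * MMD^2 to prior samples."""
    return float(np.mean(log_lik_samples)) - reg * mmd2(weight_samples, prior_samples)
```

In this sketch, `q` and `log_lik` are arrays indexed by bitstring, while `weight_samples` and `prior_samples` are arrays of shape `(num_samples, num_weights)` with entries in {0, 1}. Under the uniform-prior assumption the KL term equals the negative entropy of `q` plus a constant, so the regularizer rewards circuits that spread probability over many weight configurations.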

Paper Structure

This paper contains 9 sections, 22 equations, 6 figures, 3 tables, 1 algorithm.

Figures (6)

  • Figure 1: Representation of the Regularized Quantum HyperNetworks algorithm.
  • Figure 2: Datasets used in the experiments: 2D Gaussian dataset (a), 2D Moon dataset (b), 2D Ring dataset (c).
  • Figure 3: Average (S)ELBO for $N_\text{layers}=1$, $N_{\text{qc}} = 100$, run for 100 different initializations. Gaussian dataset. The KL is represented up to the constant term.
  • Figure 4: Training curves for (S)ELBO and MLE, $N_\text{layers}=1$, $N_\text{qc} = 100$, 100 different initializations.
  • Figure 5: Evolution of the average gradient magnitude during training.
  • ...and 1 more figure