Application of Quantum Annealing to Training of Deep Neural Networks

Steven H. Adachi, Maxwell P. Henderson

TL;DR

This work investigated an alternative approach that estimates model expectations of Restricted Boltzmann Machines using samples from a D-Wave quantum annealing machine, and found that the quantum sampling-based training approach achieves comparable or better accuracy with significantly fewer iterations of generative training than conventional CD-based training.

Abstract

In Deep Learning, a well-known approach for training a Deep Neural Network starts by training a generative Deep Belief Network model, typically using Contrastive Divergence (CD), then fine-tuning the weights using backpropagation or other discriminative techniques. However, the generative training can be time-consuming due to the slow mixing of Gibbs sampling. We investigated an alternative approach that estimates model expectations of Restricted Boltzmann Machines using samples from a D-Wave quantum annealing machine. We tested this method on a coarse-grained version of the MNIST data set. In our tests we found that the quantum sampling-based training approach achieves comparable or better accuracy with significantly fewer iterations of generative training than conventional CD-based training. Further investigation is needed to determine whether similar improvements can be achieved for other data sets, and to what extent these improvements can be attributed to quantum effects.
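
To make "estimating model expectations from samples" concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a binary RBM and the standard log-likelihood gradient for the weights, the data expectation of v_i h_j minus the model expectation of v_i h_j. The function name `rbm_weight_gradient`, the array shapes, the learning rate, and the random placeholder samples are all illustrative assumptions. In the approach described above, `model_v` and `model_h` would come from the quantum annealer, and the hardware-specific steps needed to obtain such samples are not reproduced here.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def rbm_weight_gradient(v_data, model_v, model_h, W, c):
    """Estimate <v h>_data - <v h>_model for a binary RBM.

    v_data  : (n_data, n_visible) training vectors in {0, 1}
    model_v : (n_samples, n_visible) visible states sampled from the model
    model_h : (n_samples, n_hidden) hidden states sampled from the model
    W       : (n_visible, n_hidden) weights
    c       : (n_hidden,) hidden biases
    """
    # Positive phase: hidden units are conditionally independent given v,
    # so the data expectation uses the exact conditional probabilities.
    h_prob = sigmoid(v_data @ W + c)
    positive = v_data.T @ h_prob / v_data.shape[0]
    # Negative phase: average v_i * h_j over joint samples from the model.
    # CD approximates these samples with short Gibbs chains; any other
    # sampler (hypothetically, an annealer) can be substituted here.
    negative = model_v.T @ model_h / model_v.shape[0]
    return positive - negative


# Illustrative usage with placeholder samples (NOT real annealer output).
rng = np.random.default_rng(0)
W = np.zeros((4, 3))
c = np.zeros(3)
v_data = rng.integers(0, 2, size=(10, 4)).astype(float)
model_v = rng.integers(0, 2, size=(100, 4)).astype(float)
model_h = rng.integers(0, 2, size=(100, 3)).astype(float)

learning_rate = 0.1  # assumed value, for illustration only
W += learning_rate * rbm_weight_gradient(v_data, model_v, model_h, W, c)
```

Only the negative phase depends on the sampler, which is why replacing Gibbs sampling with another sample source leaves the rest of the update unchanged.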

Paper Structure

This paper contains 14 sections, 17 equations, 11 figures.

Figures (11)

  • Figure 1: Restricted Boltzmann Machine (RBM), basic building block of Deep Belief Networks
  • Figure 2: Overall training approach including generative and discriminative training
  • Figure 3: Two representations of D-Wave qubit connectivity: (a) Chip layout; (b) "Chimera" graph
  • Figure 4: Embedding of RBM visible and hidden nodes onto D-Wave chip layout
  • Figure 5: Gauge transformations are used to partially mitigate intrinsic control errors
  • ...and 6 more figures