Quantum embeddings for machine learning

Seth Lloyd, Maria Schuld, Aroosa Ijaz, Josh Izaac, Nathan Killoran

TL;DR

The paper reframes quantum machine learning as metric learning in Hilbert space: inputs are embedded into quantum states, and the embedding is trained to maximize class separation under a chosen quantum distance. Once the embedding is optimized, the optimal measurement is already known: the Helstrom measurement for the trace distance, or a fidelity (overlap) measurement for the Hilbert-Schmidt (HS) distance. This can allow shallower, more hardware-friendly quantum classifiers. The authors formalize the HS-based training objective, connect it to the maximum mean discrepancy (MMD), and provide practical circuit constructions (SWAP and inversion tests) alongside numerical demonstrations in PennyLane. They also assess near-term hardware feasibility, arguing that high-dimensional quantum embeddings could be within reach of existing devices, though the existence and magnitude of any quantum advantage remain open questions.
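The link between the HS distance and the MMD mentioned above can be written out in a few lines. The following is a sketch under assumed notation, not the paper's own derivation: $|a_i\rangle$ and $|b_j\rangle$ denote the embedded training points of the two classes, $M_a$ and $M_b$ the class sizes, and $\kappa(x, x') = |\langle x | x' \rangle|^2$ the kernel induced by the embedding (all symbols are chosen here for illustration).

```latex
\begin{align}
\rho &= \frac{1}{M_a}\sum_{i=1}^{M_a} |a_i\rangle\langle a_i|, \qquad
\sigma = \frac{1}{M_b}\sum_{j=1}^{M_b} |b_j\rangle\langle b_j|, \\
D_{\mathrm{hs}}(\rho,\sigma) &= \operatorname{tr}\big[(\rho-\sigma)^2\big]
 = \operatorname{tr}\rho^2 + \operatorname{tr}\sigma^2 - 2\operatorname{tr}\rho\sigma \\
&= \frac{1}{M_a^2}\sum_{i,i'}\kappa(a_i,a_{i'})
 + \frac{1}{M_b^2}\sum_{j,j'}\kappa(b_j,b_{j'})
 - \frac{2}{M_a M_b}\sum_{i,j}\kappa(a_i,b_j).
\end{align}
```

The last line has the form of a (biased) empirical estimate of the squared MMD between the two class distributions under the kernel $\kappa$. Each term is an average of state overlaps, which is the quantity a SWAP or inversion test estimates.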

Abstract

Quantum classifiers are trainable quantum circuits used as machine learning models. The first part of the circuit implements a quantum feature map that encodes classical inputs into quantum states, embedding the data in a high-dimensional Hilbert space; the second part of the circuit executes a quantum measurement interpreted as the output of the model. Usually, the measurement is trained to distinguish quantum-embedded data. We propose to instead train the first part of the circuit -- the embedding -- with the objective of maximally separating data classes in Hilbert space, a strategy we call quantum metric learning. As a result, the measurement minimizing a linear classification loss is already known and depends on the metric used: for embeddings separating data using the l1 or trace distance, this is the Helstrom measurement, while for the l2 or Hilbert-Schmidt distance, it is a simple overlap measurement. This approach provides a powerful analytic framework for quantum machine learning and eliminates a major component in current models, freeing up more precious resources to best leverage the capabilities of near-term quantum information processors.
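To make the two measurement rules concrete, here is a minimal NumPy sketch for single-qubit embeddings. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy states, and the cost form $C = 1 - \tfrac{1}{2}\operatorname{tr}[(\rho-\sigma)^2]$ are chosen here for illustration, and the classifiers are evaluated analytically from density matrices rather than by a circuit.

```python
import numpy as np

def ket(theta, phi=0.0):
    """Single-qubit pure state cos(theta/2)|0> + e^{i phi} sin(theta/2)|1>."""
    return np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])

def ensemble(kets):
    """Uniform mixture (density matrix) of a list of pure states."""
    return sum(np.outer(k, k.conj()) for k in kets) / len(kets)

def hs_cost(rho, sigma):
    """Assumed cost C = 1 - 0.5 * tr((rho - sigma)^2); lower means better separated."""
    diff = rho - sigma
    return 1.0 - 0.5 * np.real(np.trace(diff @ diff))

def fidelity_classifier(x, rho, sigma):
    """Overlap rule: <x|rho|x> - <x|sigma|x>."""
    return np.real(x.conj() @ (rho - sigma) @ x)

def helstrom_classifier(x, rho, sigma):
    """<x|(P+ - P-)|x>, with P+/P- projecting onto the positive/negative
    eigenspaces of rho - sigma."""
    evals, evecs = np.linalg.eigh(rho - sigma)
    measurement = (evecs * np.sign(evals)) @ evecs.conj().T
    return np.real(x.conj() @ measurement @ x)

# Toy embedded classes (hypothetical): class A near |0>, class B near |1>.
rho = ensemble([ket(0.2), ket(0.4)])
sigma = ensemble([ket(2.8), ket(3.0)])

x_new = ket(0.3)  # a new embedded input, close to class A
c = hs_cost(rho, sigma)
f_out = fidelity_classifier(x_new, rho, sigma)
h_out = helstrom_classifier(x_new, rho, sigma)
print(f"cost={c:.3f} fidelity={f_out:.3f} helstrom={h_out:.3f}")
```

Both rules assign a positive output to class A and a negative output to class B. For a single qubit, $\rho - \sigma$ has eigenvalues $\pm\lambda$ with $\lambda \le 1$, so the fidelity output is $\lambda$ times the Helstrom output and the Helstrom output is never smaller in magnitude.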

Paper Structure

This paper contains 13 sections, 10 equations, 6 figures.

Figures (6)

  • Figure 1: Illustration of quantum metric learning. a. The embedding is trained to maximize the distance of the data clusters in the Hilbert space of quantum states. b. The measurement used to classify new inputs depends on the distance measure used. The simple decision boundary in Hilbert space can correspond to a highly complex decision boundary in the original data space.
  • Figure 2: Decision boundary on a $2$-d moons dataset for the Helstrom and fidelity classifiers, training the embedding with the trace and Hilbert-Schmidt distance, respectively. The embedding was trained for $500$ steps with an RMSProp optimizer and batch size $5$, using a $2$-qubit QAOA feature map of $4$ layers, and reaching a final cost of $0.28$ for $\ell_1$ and $0.55$ for $\ell_2$ training (see Section \ref{Sec:practical}). Although the details of the plots vary with the hyperparameters of the training, the example illustrates that both classifiers give rise to an overall similar, but not identical, decision boundary. Consistent with state discrimination theory, one can also see that the Helstrom classifier has outputs closer to the extremes $1$ and $-1$. The embedding was trained in PennyLane, and the classifiers were simulated analytically.
  • Figure 3: Ansatz for a single trainable layer of the embedding used in the experiments, showing $3$ inputs $x_1, x_2, x_3$, as well as trainable ZZ-entanglers and $R_Y$ rotations. Latent qubits, as shown with the fourth qubit (red), can be added to increase the dimension of the Hilbert space. The quantum embedding consists of several such layers, and a final repetition of the feature encoding rotations.
  • Figure 4: Illustrative example of a quantum embedding and classification for a non-overlapping, but not linearly separable, one-dimensional dataset. Training was done using the cost $C$ defined in Section \ref{Sec:practical}, and an RMSProp optimizer with initial learning rate $0.01$ and batch size $2$. a. The untrained feature map distributes the data arbitrarily on the Bloch sphere, while after $200$ steps of training the classes are well-separated. b. The fidelity classifier draws a linear decision boundary on the Bloch sphere, which translates to two linear decision boundaries in the original space. The simulations were done in the PennyLane software framework \cite{bergholm2018pennylane}.
  • Figure 5: Hybrid quantum-classical embedding. An image is fed into a pre-trained ResNet. The last layer is replaced by a linear layer, transforming the $512$-dimensional output to a $2$-dimensional feature vector, which is fed into a parametrized quantum circuit. The circuit parameters are trained together with the final classical layer (red). After $1500$ training steps with an Adagrad optimizer, the classical weights learn to map the two classes to periodically arranged points in the intermediate feature space, which allows the quantum circuit to perfectly separate the two classes in quantum feature space. Each optimization step uses a batch of $2$ randomly sampled training points. The simulations were done in PennyLane.
  • ...and 1 more figure
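The training idea described in the captions of Figures 3 and 4 can be sketched on a single qubit with plain NumPy. This is a simplified stand-in, not the paper's circuit: there are no ZZ-entanglers (one qubit only), the dataset is made up, the cost form $C = 1 - \tfrac{1}{2} D_{\mathrm{hs}}$ is assumed, and a seeded random local search replaces the RMSProp optimizer. All function and variable names are hypothetical.

```python
import numpy as np

def rx(a):
    """Single-qubit rotation about X."""
    c, s = np.cos(a / 2), np.sin(a / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def ry(a):
    """Single-qubit rotation about Y."""
    c, s = np.cos(a / 2), np.sin(a / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def embed(x, thetas):
    """Layers of (encode x with RX, trainable RY), then a final encoding RX(x)."""
    state = np.array([1.0, 0.0], dtype=complex)
    for theta in thetas:
        state = ry(theta) @ (rx(x) @ state)
    return rx(x) @ state

def mean_overlap(xs, ys, thetas):
    """Average |<x|y>|^2 over all pairs, i.e. tr(rho_x rho_y) for uniform
    ensembles; this is the quantity a SWAP or inversion test would estimate."""
    return np.mean([abs(np.vdot(embed(x, thetas), embed(y, thetas))) ** 2
                    for x in xs for y in ys])

def cost(A, B, thetas):
    """Assumed cost C = 1 - 0.5 * (tr rho^2 + tr sigma^2 - 2 tr rho sigma)."""
    d_hs = (mean_overlap(A, A, thetas) + mean_overlap(B, B, thetas)
            - 2 * mean_overlap(A, B, thetas))
    return 1.0 - 0.5 * d_hs

# Made-up 1-d data: class A in the middle, class B on both sides
# (non-overlapping, but not linearly separable).
A = [-0.3, 0.0, 0.3]
B = [-1.5, -1.2, 1.2, 1.5]

rng = np.random.default_rng(0)
thetas = rng.uniform(0, 2 * np.pi, size=2)
initial_cost = cost(A, B, thetas)

# Random local search: keep a perturbation only if it lowers the cost.
best = initial_cost
for _ in range(200):
    candidate = thetas + rng.normal(scale=0.3, size=thetas.shape)
    c = cost(A, B, candidate)
    if c < best:
        thetas, best = candidate, c
final_cost = best
print(f"cost before: {initial_cost:.3f}, after: {final_cost:.3f}")
```

Because a step is accepted only when the cost decreases, the final cost can never exceed the initial one. A gradient-based optimizer, as used in the paper's experiments, would replace the search loop.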

Theorems & Definitions (7)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Definition 5
  • Definition 6
  • Definition 7