Learning a quantum computer's capability

Daniel Hothem, Kevin Young, Tommie Catanach, Timothy Proctor

TL;DR

This work proposes a hardware-agnostic method to efficiently construct scalable predictive models of a quantum computer's capability for almost any class of circuits and demonstrates the method using convolutional neural networks (CNNs).

Abstract

Accurately predicting a quantum computer's capability -- which circuits it can run and how well it can run them -- is a foundational goal of quantum characterization and benchmarking. As modern quantum computers become increasingly hard to simulate, we must develop accurate and scalable predictive capability models to help researchers and stakeholders decide which quantum computers to build and use. In this work, we propose a hardware-agnostic method to efficiently construct scalable predictive models of a quantum computer's capability for almost any class of circuits, and demonstrate our method using convolutional neural networks (CNNs). Our CNN-based approach works by efficiently representing a circuit as a three-dimensional tensor and then using a CNN to predict its success rate. Our CNN capability models obtain approximately a $1\%$ average absolute prediction error when modeling processors experiencing both Markovian and non-Markovian stochastic Pauli errors. We also apply our CNNs to model the capabilities of cloud-access quantum computing systems, obtaining moderate prediction accuracy (average absolute error around $2$--$5\%$), and we highlight the challenges to building better neural network capability models.
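To make the "circuit as a three-dimensional tensor" idea concrete, here is a minimal sketch of one possible encoding, indexed as qubit × layer × channel with one channel per gate type. The gate set, the one-hot channel layout, and the names `GATE_CHANNELS` and `encode_circuit` are illustrative assumptions, not the encoding used in the paper.

```python
# Hypothetical gate set: one tensor channel per gate type.
# The paper's actual channel layout is not specified here.
GATE_CHANNELS = {"x": 0, "h": 1, "cz": 2}


def encode_circuit(layers, num_qubits):
    """Encode a layered circuit as tensor[qubit][layer][channel] with 0/1 entries.

    `layers` is a list of layers; each layer is a list of (gate_name, qubits)
    pairs, where `qubits` is a tuple of qubit indices the gate acts on.
    """
    depth = len(layers)
    num_channels = len(GATE_CHANNELS)
    # Start from an all-zero tensor: idle qubits stay zero in every channel.
    tensor = [
        [[0] * num_channels for _ in range(depth)]
        for _ in range(num_qubits)
    ]
    for t, layer in enumerate(layers):
        for gate, qubits in layer:
            channel = GATE_CHANNELS[gate]
            for q in qubits:
                tensor[q][t][channel] = 1
    return tensor


# A two-layer circuit on three qubits.
circuit = [
    [("h", (0,)), ("x", (2,))],  # layer 0
    [("cz", (0, 1))],            # layer 1
]
tensor = encode_circuit(circuit, num_qubits=3)
shape = (len(tensor), len(tensor[0]), len(tensor[0][0]))
```

A tensor of this shape is the kind of input a CNN could convolve over, treating qubit index and circuit layer as the two spatial axes. Note that this simple one-hot layout does not record which qubits a two-qubit gate pairs together; a practical encoding would need extra channels for that.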

Paper Structure (40 sections, 34 equations, 9 figures, 1 table)

Figures (9)

  • Figure 1: Modelling the effect of stochastic errors with CNNs (cont.). Quantifying the prediction error of CNNs and f-ERMs (fit error rates models) trained on simulated data from few-qubit random circuits under a stochastic Pauli error model (see the caption of Fig. \ref{fig:markovian-predictions} for details). (a) The KL divergence and (b) the $L^1$ error for the CNN's predictions (outer squares) and the f-ERM's predictions (inner squares), averaged over multiple randomly sub-sampled datasets of each size. The CNN's prediction accuracy surpasses that of the f-ERM in the region above the white line. In contrast with the f-ERM, the accuracy of the CNNs continues to increase up to the largest dataset size we used.
  • Figure 2: Predictions for non-Markovian errors. The prediction accuracy of CNNs trained on simulated data from random circuits for a hypothetical 49-qubit system under three different error models. (a) A Pauli stochastic error model, with gate- and qubit-dependent error rates. This model forms the basis for two non-Markovian models in which (b) a two-qubit gate's error rate increases if it is preceded by a two-qubit gate on either of its qubits, and (c) gate errors increase over the duration of a circuit. Each main plot within (a-c) shows $s(c)$ versus the prediction error [$\delta(c)$, defined in Eq. \ref{eq:predict-error}] on test data, for a CNN and an ERM fit to the same dataset (f-ERM). The CNN and f-ERM have similar prediction error for the Markovian model, but the CNN significantly outperforms the f-ERM for both non-Markovian models. This is summarized by the $\delta(c)$ histograms [lower right, (a-c)] as well as the KL divergence and $L^1$ error for each model [upper right, (a-c)]. Each subfigure also contains a histogram of the test data [upper left, (a-c)].
  • Figure 3: Generalizing to out-of-distribution circuits. Two examples of generalizing a neural network to predict circuits that are drawn from a distribution that differs significantly from the training distribution. (a) The prediction accuracy [$\delta(c)$] of a CNN (n-CNN, purple) trained on narrow circuits ($w \leq 25$ qubits) evaluated on a test dataset containing wide circuits ($w > 25$ qubits). The data was simulated under an error model in which gate errors increase with circuit depth, so narrow circuits reveal all important aspects of the error model, meaning that accurate generalization to wider circuits is feasible. The prediction error $\delta(c)$ for n-CNN is much smaller than for an ERM fit to the same training data (nf-ERM, pink). However, $\delta(c)$ is larger for n-CNN than for a CNN trained on a dataset containing both narrow and wide circuits (a-CNN, orange). (b) The prediction error of a CNN (r-CNN, purple) trained on random circuits and evaluated on periodic circuits. The r-CNN has an $L^1$ error of $d_{L^1} = 0.005$ on test data drawn from the same distribution as the training data (random circuits), but it increases to $d_{L^1} = 0.052$ on periodic circuit data, resulting in worse predictions than provided by an ERM fit to the same training data (rf-ERM, blue). However, the CNN's performance can be substantially improved by re-training the CNN on periodic circuit training data (p-CNN, yellow), while maintaining the same architecture that was found by hyperparameter tuning using the random circuits data.
  • Figure 4: Inaccurate capability models when errors are purely coherent. The prediction accuracy of a CNN and an f-ERM trained on randomized mirror circuit data ($n=5$ qubits) generated from an error model with purely coherent (i.e., Hamiltonian) errors. We find that neither model accurately predicts the test data ($d_{L^1} \approx 0.06$ for both models). Predicting $s(c)$ in the presence of coherent errors is challenging because coherent errors can add or cancel across an entire circuit.
  • Figure 5: Error sensitivity information improves model accuracy. The prediction error of CNNs trained with and without the error sensitivity channels---which we use to encode information about each qubit's sensitivity to the three single-qubit Pauli errors at each circuit location---on randomized mirror circuit data ($n=5$ qubits) simulated under a Pauli stochastic error model. We observe significantly better performance (the KL divergence is an order of magnitude smaller) for the CNN that has access to the error sensitivity channels. This suggests that error sensitivity information is important for accurate capability learning.
  • ...and 4 more figures
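
The captions above report a per-circuit prediction error $\delta(c)$ and an aggregate $L^1$ error $d_{L^1}$, but their defining equations are not reproduced in this listing. The sketch below therefore assumes that $\delta(c)$ is the predicted success rate minus the observed one, and that the aggregate is the mean of $|\delta(c)|$ over the test set, which matches the "average absolute prediction error" quoted in the abstract. The sign convention, the function names, and the sample numbers are illustrative assumptions only.

```python
def prediction_errors(predicted, observed):
    """Per-circuit error, ASSUMED to be predicted minus observed success rate."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    return [p - o for p, o in zip(predicted, observed)]


def mean_absolute_error(predicted, observed):
    """Average of |delta(c)| over all circuits in the test set."""
    errors = prediction_errors(predicted, observed)
    if not errors:
        raise ValueError("need at least one circuit")
    return sum(abs(e) for e in errors) / len(errors)


# Made-up success rates for three test circuits (not data from the paper).
predicted = [0.90, 0.75, 0.60]
observed = [0.92, 0.70, 0.61]

deltas = prediction_errors(predicted, observed)
mae = mean_absolute_error(predicted, observed)
```

With these made-up values the per-circuit errors are roughly -0.02, 0.05, and -0.01, so the average absolute error is about 0.027, i.e. a few percent.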