Pareto-Efficient Quantum Circuit Simulation Using Tensor Contraction Deferral

Edwin Pednault, John A. Gunnels, Giacomo Nannicini, Lior Horesh, Thomas Magerlein, Edgar Solomonik, Erik W. Draeger, Eric T. Holland, Robert Wisnieff

TL;DR

The paper tackles the exponential barrier in classically simulating large quantum circuits by introducing contraction deferral within a tensor-network framework, augmented with tensor slicing and a memory hierarchy that leverages secondary storage. It demonstrates substantial memory reductions and enables deeper circuit simulations (notably for $7 \times 7$ qubits at depth 27 and $8 \times 7$ qubits at depth 23) by partitioning circuits into subcircuits and deferring certain contractions. The authors develop optimization strategies (hand-crafted partitioning, A* search, and integer programming) and validate their approach with large-scale simulations on the Vulcan supercomputer, verifying Porter-Thomas statistics across slices. Their results extend the practical boundary of classical simulation for benchmarking and verification of quantum devices, and they outline future work on further integrating sliced and non-sliced contraction deferral and automated scheme selection.

Abstract

With the current rate of progress in quantum computing technologies, systems with more than 50 qubits will soon become reality. Computing ideal quantum state amplitudes for circuits of such and larger sizes is a fundamental step to assess both the correctness, performance, and scaling behavior of quantum algorithms and the fidelities of quantum devices. However, resource requirements for such calculations on classical computers grow exponentially. We show that deferring tensor contractions can extend the boundaries of what can be computed on classical systems. To demonstrate this technique, we present results obtained from a calculation of the complete set of output amplitudes of a universal random circuit with depth 27 in a 2D lattice of $7 \times 7$ qubits, and an arbitrarily selected slice of $2^{37}$ amplitudes of a universal random circuit with depth 23 in a 2D lattice of $8 \times 7$ qubits. Combining our methodology with other decomposition approaches found in the literature, we show that we can simulate $7 \times 7$-qubit random circuits to arbitrary depth by leveraging secondary storage. These calculations were thought to be impossible due to resource requirements.
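To give a sense of the scale the abstract refers to, the following back-of-the-envelope sketch computes how much memory a dense array of amplitudes would occupy. It assumes 16 bytes per amplitude (a double-precision complex number); that precision is an assumption made here for illustration, and the helper name `state_bytes` is hypothetical rather than taken from the paper.

```python
def state_bytes(log2_amplitudes: int, bytes_per_amplitude: int = 16) -> int:
    """Bytes needed to store 2**log2_amplitudes amplitudes densely.

    bytes_per_amplitude=16 assumes double-precision complex values
    (an illustrative assumption, not a figure stated in the text above).
    """
    return (1 << log2_amplitudes) * bytes_per_amplitude


PIB = 2**50  # one pebibyte
TIB = 2**40  # one tebibyte

# Full state of a 7 x 7 lattice (49 qubits): 2**49 amplitudes.
full_49 = state_bytes(49)
# Full state of an 8 x 7 lattice (56 qubits): 2**56 amplitudes.
full_56 = state_bytes(56)
# A slice of 2**37 amplitudes, as in the 8 x 7 experiment.
slice_37 = state_bytes(37)

print(f"49 qubits, full state: {full_49 / PIB:.0f} PiB")
print(f"56 qubits, full state: {full_56 / PIB:.0f} PiB")
print(f"2^37-amplitude slice:  {slice_37 / TIB:.0f} TiB")
```

Under this assumption, each additional qubit doubles the storage for the full state, which is the exponential growth the abstract describes; computing only a slice of the amplitudes keeps the output itself at a far smaller size.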

Paper Structure

This paper contains 22 sections, 13 equations, 19 figures, 9 tables, 1 algorithm.

Figures (19)

  • Figure 1: Example of a tensor network for the expression $\sum_j A_{i,j} v_j$.
  • Figure 2: Example of the proposed tensor network representation of a quantum circuit (left), using a hypergraph (right). The $T$ and $CZ$ gates are diagonal. Dashed edges are hyperedges, solid edges are "regular" edges, i.e., hyperedges of cardinality two.
  • Figure 3: Example of the proposed tensor network representation of a quantum circuit (left), using a hypergraph (right). The $T$ gates are diagonal, $i$SWAP is not (although it is separable). Dashed edges are hyperedges, solid edges are "regular" edges; i.e., hyperedges of cardinality two.
  • Figure 4: Example from \cite{pednault2017blog} extended to illustrate the complete partitioning of a quantum circuit into "bristle-brush" subcircuits divided along qubit lines. The dashed lines correspond to entanglement indices that are shared between tensors constructed for each qubit line. Figures \ref{fig:googleex} and \ref{fig:googleexsplit} in the Supplementary Information describe alternative ways of partitioning this circuit that yield different computation/memory trade-offs during simulation.
  • Figure 5: (a) Left. An example of contraction deferral: the contraction operation on index $j_1$ in the tensor network of Fig. \ref{fig:hypergraph_example} is deferred. Standard adjacent contractions are performed to construct the upper tensor $U_{i_0,j_1} = T_{i_0} CZ_{i_0j_1,i_0j_1} T_{i_0} \delta_{i_0}$ (we denote each tensor by the name of the corresponding gate, and denote the single-index Kronecker delta by $\delta_k$). A non-adjacent contraction is performed on the Hadamard gates to construct the lower tensor $L_{k_1,j_1} = H_{k_1,j_1} \sum_{i_1 \in \{0,1\}} H_{j_1,i_1} \delta_{i_1}$. Contraction deferral allows the top and bottom tensors, $U_{i_0,j_1}$ and $L_{k_1,j_1}$, to be computed independently. The final state is obtained by contracting $j_1$. We refer to indices whose contractions are deferred as entanglement indices, in recognition of the fact that they account for the entanglements that exist among subcircuits while allowing those subcircuits to be simulated independently. In this example, $j_1$ is an entanglement index. (b) Right. An example of sliced contraction deferral: a deferred contraction is combined with the slicing of entanglement index $j_1$. For each $j_1 \in \{0,1\}$, the top and bottom tensors, $U_{i_0,j_1}$ and $L_{k_1,j_1}$, are computed independently and their values are multiplied together. The resulting products are then summed over the values of $j_1$. The contraction operation on $j_1$ is thus accomplished iteratively. Because $j_1$ is fixed at each iteration, the tensors $U_{i_0,j_1}$ and $L_{k_1,j_1}$ are sliced on $j_1$ and the amount of memory needed to store the slices is cut in half.
  • ...and 14 more figures
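
The two-qubit example in the Figure 5 caption can be sketched numerically. The snippet below is a minimal illustration using NumPy, not the authors' implementation: it builds the upper tensor $U_{i_0,j_1}$ and lower tensor $L_{k_1,j_1}$ from the caption's formulas, contracts the deferred index $j_1$ at the end, and also performs the sliced variant that loops over $j_1$. All function names are hypothetical, the gate matrices use the standard textbook conventions (an assumption), and the initial single-qubit states are passed as parameters so the result can be checked against a direct gate-by-gate simulation.

```python
import numpy as np

# Standard gate conventions (assumed here for illustration).
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
T_DIAG = np.array([1, np.exp(1j * np.pi / 4)])        # diagonal of T
CZ_DIAG = np.array([[1, 1], [1, -1]], dtype=complex)  # CZ_DIAG[a, b] = (-1)**(a*b)


def upper_tensor(q0_in):
    # U[i0, j1] = T[i0] * CZ[i0, j1] * T[i0] * q0_in[i0]
    return np.einsum("i,ij,i,i->ij", T_DIAG, CZ_DIAG, T_DIAG, q0_in)


def lower_tensor(q1_in):
    # L[k1, j1] = H[k1, j1] * sum_{i1} H[j1, i1] * q1_in[i1]
    return np.einsum("kj,ji,i->kj", H, H, q1_in)


def deferred(q0_in, q1_in):
    # Build both tensors independently, then contract the deferred index j1.
    return np.einsum("ij,kj->ik", upper_tensor(q0_in), lower_tensor(q1_in))


def sliced_deferred(q0_in, q1_in):
    # Fix j1 at each iteration so only one slice of U and L is held at a time.
    psi = np.zeros((2, 2), dtype=complex)
    for j1 in (0, 1):
        u_slice = T_DIAG * CZ_DIAG[:, j1] * T_DIAG * q0_in  # U[:, j1]
        l_slice = H[:, j1] * (H[j1, :] @ q1_in)             # L[:, j1]
        psi += np.outer(u_slice, l_slice)
    return psi


def reference(q0_in, q1_in):
    # Direct gate-by-gate simulation on the 2-qubit state psi[i0, i1].
    state = np.einsum("i,j->ij", q0_in, q1_in).astype(complex)
    state = T_DIAG[:, None] * state  # T on qubit 0
    state = state @ H.T              # H on qubit 1
    state = CZ_DIAG * state          # CZ on both qubits
    state = T_DIAG[:, None] * state  # T on qubit 0
    state = state @ H.T              # H on qubit 1
    return state


delta = np.array([1.0, 0.0])                # the caption's Kronecker delta, |0>
plus = np.array([1.0, 1.0]) / np.sqrt(2)    # a non-trivial input for checking

print(np.allclose(deferred(delta, delta), reference(delta, delta)))
print(np.allclose(sliced_deferred(plus, delta), reference(plus, delta)))
```

In the sliced version each iteration holds only a length-2 slice of each 2x2 tensor, mirroring the caption's remark that slicing on $j_1$ halves the memory needed for $U$ and $L$; the price is that the contraction over $j_1$ is carried out as an explicit loop.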