
Distributed Quantum Computing via Adaptive Circuit Knitting

K. Grace Johnson, Aniello Esposito, Gaurav Gyawali, Xin Zhan, Rohit Ganti, Namit Anand, Raymond G. Beausoleil, Masoud Mohseni

Abstract

Distributing quantum workloads over many Quantum Processing Units (QPUs) is a crucial step in scaling up quantum computers toward practical quantum advantage, because the size of a single QPU is limited. In the absence of high-fidelity quantum interconnects, circuit knitting could provide a path to computing certain properties of large quantum systems on many QPUs of limited size in a distributed fashion using only classical communication. Circuit knitting partitions large quantum circuits into manageable sub-circuits; however, reconstructing observables in a straightforward manner comes at an exponential cost in sampling and classical post-processing. To mitigate this overhead, we introduce an Adaptive Circuit Knitting (ACK) method that finds efficient partitions of quantum circuits by discovering regions of minimal entanglement between subsystems. We simulate 1D and 2D disordered mixed-field Ising models of up to 60 qubits and show that the ACK approach can reduce circuit knitting sampling overheads by up to four orders of magnitude for observables of interest. We highlight our parallel GPU-accelerated implementation and discuss the need for efficient classical simulators to enable distributed quantum algorithm development. Our techniques could enable efficient distribution of quantum simulation for both near-term and fault-tolerant architectures.

Paper Structure (14 sections, 14 equations, 8 figures, 1 algorithm)


Figures (8)

  • Figure 1: (a) An MPS (left) with bond dimension $\chi$ (thick lines) can be converted to a quantum circuit tensor network (right) of order $M$. For exact representation, $M=\log(\chi)$. (b) Cutting gates in circuit knitting to form two sub-circuits that can be executed independently. After cutting, sub-circuits must be sampled many times to reconstruct (knit) the observable.
  • Figure 2: Schematic of the adaptive circuit knitting method. In the inner loop, a variational optimizer (see work by Lin et al. [lin2021real]) finds two-qubit unitary gate parameters $U(\theta)$ for partitions of a quantum system based on a tensor network in parallel. In the outer loop, an adaptive procedure finds cuts that minimize entanglement between partitions. After the best cuts are found, observables are reconstructed via circuit knitting.
  • Figure 3: Comparison between a GPU-only execution and the hybrid GPU-CPU approach to simulate sub-circuits (a total of 4000) resulting from partitioning a 30-qubit circuit into 10 and 20 qubits. Experiments were carried out on the Nvidia GH200 superchip.
  • Figure 4: (a) Dynamics of an energy density for a clean (top) and disordered (bottom) mixed-field Ising model with disorder in the longitudinal field. An initial energy perturbation diffuses in the clean model, whereas it becomes localized in the disordered case. We used $J=1$, $h_z=1.01$, and $\Delta t = 0.25$. $h_x$ was set to 1 for the clean model and drawn uniformly at random from $[-W,W]$, with $W=2.00$, for the disordered model. (b) The top panel shows the dynamics of entanglement entropy for 5 random disorder instances. The bottom panel shows the theoretical lower bound for ACK sample complexity, $\gamma$, which closely follows the entanglement entropy in the top panel. Since the total number of samples required is $O(\gamma^2/\epsilon^2)$, an advantage of several orders of magnitude can be achieved for simulating non-equilibrium dynamics up to $tJ=50$.
  • Figure 5: Analysis of 40-qubit ACK simulations for time-evolution circuits at $tJ=\{0.0, 0.5, \ldots, 3.5\}$ of the Hamiltonian given in Eq. \ref{eq:time_evo_hamiltonian} with 64 disorder instances. Here, $J$ is the median of the randomly distributed couplings $J_{j,j+1}$. The observable considered is the energy density from Eq. \ref{eq:time_evo_edens} at site $j=13$. A load-balanced cut close to site 20 is compared with a cut guided by entanglement (adaptive). (a) Comparison of circuit knitting sampling overheads at $tJ=3.5$. (b) For a single illustrative disorder instance, the convergence of the energy density as a function of the number of circuit knitting samples, where 10 QPD repetitions were used to estimate the standard deviation. (c) Time evolution of the energy density (averaged over 64 disorder instances) computed with the load-balanced and adaptive knitting strategies (500 circuit knitting samples) compared to the true value from TEBD simulation. (d) Comparison of the knitted (load-balanced or adaptive) energy densities at $tJ=3.5$ using a fixed number of samples (500) compared to the true value computed with the Qiskit MPS simulator. The disorder-averaged energy densities are highlighted by the two bold symbols and the dotted line indicates the true energy density.
  • ...and 3 more figures
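
The $O(\gamma^2/\epsilon^2)$ sample scaling quoted in the Figure 4 caption can be made concrete with a short calculation. The sketch below is only illustrative: the function names are hypothetical, the constant prefactor hidden by the $O(\cdot)$ notation is assumed to be 1, and the $\gamma$ values are made-up inputs rather than results from the paper. It shows why lowering $\gamma$ by choosing a cut through a weakly entangled region pays off quadratically in the number of samples.

```python
import math


def knitting_samples(gamma: float, epsilon: float) -> int:
    """Estimate the samples needed to reach additive error epsilon.

    Uses the gamma^2 / epsilon^2 scaling with the constant prefactor
    assumed to be 1 (an assumption; the O() notation hides it).
    """
    if gamma <= 0 or epsilon <= 0:
        raise ValueError("gamma and epsilon must be positive")
    return math.ceil((gamma / epsilon) ** 2)


def overhead_reduction(gamma_baseline: float, gamma_adaptive: float) -> float:
    """Factor by which the sample count shrinks at a fixed target error.

    The epsilon dependence cancels, leaving (gamma_baseline / gamma_adaptive)^2.
    """
    if gamma_baseline <= 0 or gamma_adaptive <= 0:
        raise ValueError("gamma values must be positive")
    return (gamma_baseline / gamma_adaptive) ** 2


if __name__ == "__main__":
    # Hypothetical gamma values chosen purely for illustration.
    gamma_load_balanced = 100.0
    gamma_entanglement_guided = 1.0
    factor = overhead_reduction(gamma_load_balanced, gamma_entanglement_guided)
    print(f"sample-count reduction factor: {factor:.0e}")
```

Under these assumptions, a hundredfold smaller $\gamma$ corresponds to a sample count that is smaller by four orders of magnitude, independent of the target error $\epsilon$.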