Efficient Emulation of Neutral Atom Quantum Hardware

Kemal Bidzhiev, Stefano Grava, Pablo le Henaff, Mauro Mendizabal, Elie Merhej, Anton Quelle

TL;DR

This work tackles the challenge of simulating neutral-atom quantum hardware by introducing two emulators, emu-sv (state-vector, exact, up to $27$ qubits on GPUs) and emu-mps (MPS-based, scalable), both integrated with Pasqal's Pulser. emu-sv performs exact time evolution using the Lanczos algorithm on a discretized, piecewise-constant Hamiltonian, while emu-mps employs second-order TDVP with MPS/MPO representations to handle large arrays with controlled approximations. Benchmarks show significant speed-ups over the QuTiP backend, and detailed memory and performance analyses guide users on when to use each emulator. The tools enable efficient precursor simulations and hardware benchmarking for neutral-atom systems, and the authors discuss future directions such as differentiability, DMRG time evolution, and enhanced noise modeling.
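The Lanczos-based evolution on a piecewise-constant Hamiltonian mentioned above can be sketched as follows. This is a minimal NumPy illustration, not the emu-sv implementation: the function names `lanczos_evolve` and `evolve_piecewise_constant`, the midpoint sampling of $H(t)$, the full reorthogonalization, and the simple breakdown tolerance `tol` are all assumptions made for this example, and units are dimensionless.

```python
import numpy as np

def lanczos_evolve(matvec, psi, dt, max_krylov=30, tol=1e-12):
    """Approximate exp(-i * dt * H) @ psi in a Krylov subspace built by Lanczos.

    `matvec` applies a Hermitian H to a vector. All names are hypothetical.
    """
    max_krylov = min(max_krylov, psi.size)  # subspace cannot exceed the full space
    norm0 = np.linalg.norm(psi)
    basis = [psi / norm0]
    alphas, betas = [], []
    for j in range(max_krylov):
        w = matvec(basis[j])
        alpha = np.vdot(basis[j], w).real
        alphas.append(alpha)
        # Three-term Lanczos recurrence.
        w = w - alpha * basis[j]
        if j > 0:
            w = w - betas[j - 1] * basis[j - 1]
        # Full reorthogonalization (an assumption, added for numerical stability).
        for v in basis:
            w = w - np.vdot(v, w) * v
        beta = np.linalg.norm(w)
        if beta < tol or j == max_krylov - 1:
            break
        betas.append(beta)
        basis.append(w / beta)
    # Small tridiagonal matrix T representing H in the Krylov basis.
    t_mat = np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)
    evals, evecs = np.linalg.eigh(t_mat)
    # Coefficients of exp(-i * dt * T) applied to the first basis vector.
    coeffs = evecs @ (np.exp(-1j * dt * evals) * evecs[0].conj())
    return norm0 * sum(c * v for c, v in zip(coeffs, basis))

def evolve_piecewise_constant(hamiltonian_at, psi0, t_total, dt):
    """Evolve psi0 treating H(t) as constant on each step of length dt."""
    psi = psi0.astype(complex)
    n_steps = int(round(t_total / dt))
    for step in range(n_steps):
        h = hamiltonian_at((step + 0.5) * dt)  # midpoint sampling (assumption)
        psi = lanczos_evolve(lambda v: h @ v, psi, dt)
    return psi

# Single-qubit resonant drive H = (Omega / 2) * sigma_x with Omega = 1:
# a pulse of duration pi transfers |0> to |1> up to a phase.
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
psi0 = np.array([1, 0], dtype=complex)
final_state = evolve_piecewise_constant(lambda t: 0.5 * sigma_x, psi0, np.pi, np.pi / 10)
print(abs(final_state[1]) ** 2)
```

For a time-dependent drive, only `hamiltonian_at` changes; the step size `dt` then controls the discretization error, while the Krylov dimension and tolerance control the error of each individual step.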

Abstract

Simulating the dynamics of neutral atom arrays is a challenging problem. To address this, we introduce two emulators, emu-sv and emu-mps, as computational backends for Pasqal's pulser package. Emu-sv is designed for high-precision state-vector simulations, allowing the emulation of systems of up to $\sim 27$ qubits on an A100 40GB GPU, which makes it well suited to cases where numerically exact results are needed. In contrast, emu-mps uses a Matrix Product State representation and other controlled approximations to efficiently simulate much larger arrays of atoms with manageable errors. We show through benchmark comparisons that both emulators provide significant speed-ups over generic solvers such as QuTiP. In addition, we provide practical guidance on choosing between the two emulators. These quantum software tools are designed to support researchers and developers aiming to simulate quantum systems either as a precursor to full hardware implementation or as a means of benchmarking hardware performance.
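To give a feel for why a state-vector emulator tops out in the high twenties of qubits on a single 40GB GPU, the arithmetic below estimates the memory of one state vector. It is a rough sketch under stated assumptions: double-precision complex amplitudes (16 bytes each), device memory approximated as 40 GiB, and no allowance for the Hamiltonian or other buffers. The helper names are hypothetical and are not part of emu-sv.

```python
def state_vector_bytes(n_qubits, bytes_per_amplitude=16):
    """Memory of one state vector with 2**n_qubits complex amplitudes."""
    return (2 ** n_qubits) * bytes_per_amplitude

def vectors_that_fit(n_qubits, device_bytes):
    """How many full state vectors fit in device memory (ignores other buffers)."""
    return device_bytes // state_vector_bytes(n_qubits)

device_bytes = 40 * 2 ** 30  # assumption: treat "40GB" as 40 GiB

for n in (26, 27, 28):
    gib = state_vector_bytes(n) / 2 ** 30
    print(f"N={n}: {gib:.0f} GiB per vector, {vectors_that_fit(n, device_bytes)} vectors fit")
```

Because a Krylov method keeps several vectors of this size in memory at once, each additional qubit halves the number of vectors available, which is why the limit is reached quickly.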

Paper Structure

This paper contains 13 sections, 6 equations, 8 figures.

Figures (8)

  • Figure 1: Comparison of runtimes of emu-sv with $dt=5, 10$ and of Pulser for different system sizes $N$. Pulser's default backend, QuTiP, uses the ZVODE ODE solver. From about 9 qubits onwards, the Pulser runtime approximately doubles with each extra qubit, which is to be expected once matrix-vector multiplication starts dominating the runtime of the program. Emu-sv uses PyTorch's native parallelization tools with 16 threads. The same exponential scaling as for QuTiP sets in for emu-sv (see Fig. 2), but it does so later because the solver is more efficient for larger matrix sizes.
  • Figure 2: Comparison of runtimes for emu-sv when running on GPU and on CPU with 1, 2, and 16 threads. Due to their high parallelism, GPUs execute larger system instances faster. For smaller system sizes the situation is reversed: the individual cores in a GPU are slower, and the parallelism cannot be fully leveraged.
  • Figure 3: Time evolution of the norm difference between the pulser and emu-sv wavefunctions. The precision parameter is fixed at $p = 10^{-12}$ to highlight that the error is governed predominantly by the discretization scheme, determined solely by the time step $dt$. Each application of the time evolution with a given $dt$ introduces an error that accumulates over time.
  • Figure 4: Norm of the difference between pulser and emu-sv wavefunctions at the end of the time evolution as a function of the precision parameter $p$, evaluated over a logarithmic range from $10^{-10}$ to $10^{-3}$. Each curve corresponds to a different time step size $dt$, illustrating the dependence of the error on both the precision parameter and the time discretization.
  • Figure 5: Upper bound on the memory used by emu-mps to simulate a given number of qubits $N$ with a given maximum bond dimension $\chi$ [emulators_2025]. The number of Lanczos iterations required for convergence is set to $k = 30$ for this plot. This bound is slightly pessimistic, and actual values depend on the true number of iterations, which is influenced by $dt$ and the convergence tolerance of the Lanczos algorithm.
  • ...and 3 more figures