Harnessing CUDA-Q's MPS for Tensor Network Simulations of Large-Scale Quantum Circuits
Gabin Schieffer, Stefano Markidis, Ivy Peng
TL;DR
This work evaluates CUDA-Q’s tensor-network backends, focusing on Matrix Product State (MPS) representations, to enable large-qubit quantum circuit simulations on a single GPU. By comparing state-vector (SV), exact tensor-network (TN), and MPS backends on a Grace Hopper system across five representative circuits, the study shows that SV remains fastest when feasible, but TN and especially MPS enable simulations beyond SV memory limits, reaching up to about 60–90 qubits depending on circuit structure. Profiling reveals that SVD iterations in the MPS approach offer substantial contraction-time reductions, though GPU utilization is uneven and Tensor Cores are underused for these workloads. The work also investigates the impact of MPS approximation via the maximum bond dimension ($\chi_{max}$), demonstrating that accuracy can be preserved for key outcomes at moderate $\chi$ values, with explicit validation on a 10-qubit QAOA circuit. Overall, the results highlight the practical potential of MPS-based quantum circuit simulation on commodity GPUs and outline future directions for deeper integration with broader quantum software stacks.
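For reference, the bond-dimension control mentioned above can be written in standard MPS notation. The symbols $A^{[k]}$, $s_k$, and $\chi_k$ follow the common textbook convention and are assumptions here, not notation taken from the paper:

$$
|\psi\rangle = \sum_{s_1,\dots,s_n \in \{0,1\}} A^{[1]}_{s_1} A^{[2]}_{s_2} \cdots A^{[n]}_{s_n} \, |s_1 s_2 \dots s_n\rangle,
\qquad A^{[k]}_{s_k} \in \mathbb{C}^{\chi_{k-1} \times \chi_k},
\quad \chi_0 = \chi_n = 1,
\quad \chi_k \le \chi_{max}.
$$

In this picture, each SVD keeps at most the $\chi_{max}$ largest singular values on a bond, so storage grows as $O(n\,\chi_{max}^2)$ rather than $O(2^n)$, at the cost of an approximation whenever singular values are discarded.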
Abstract
Quantum computer simulators are an indispensable tool for prototyping quantum algorithms and verifying the functioning of existing quantum computer hardware. The current largest quantum computers feature more than one thousand qubits, challenging their classical simulators. State-vector quantum simulators are limited by the exponential growth in the number of representable quantum states with the number of qubits, which makes simulating more than fifty qubits practically infeasible. A more appealing approach for simulating quantum computers is the tensor network approach, whose memory requirements depend fundamentally on the level of entanglement in the quantum circuit and which allows simulating the current largest quantum computers. This work investigates and evaluates the CUDA-Q tensor network simulators on an Nvidia Grace Hopper system, particularly the Matrix Product State (MPS) formulation. We compare their performance with that of the CUDA-Q state vector implementation and validate the correctness of MPS simulations. Our results highlight that tensor network-based methods provide a significant opportunity to simulate large-qubit circuits, albeit approximately. We also show that current GPU-accelerated computation cannot utilize the GPU efficiently in the case of MPS simulations.
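The memory argument above can be made concrete with a back-of-the-envelope estimate. The sketch below is illustrative only: the function names are hypothetical, it assumes double-precision complex amplitudes (16 bytes each), and it ignores workspace and library overhead, so it does not describe how CUDA-Q actually allocates memory.

```python
COMPLEX128_BYTES = 16  # assumption: one double-precision complex amplitude


def state_vector_bytes(n_qubits: int) -> int:
    """Memory for a full state vector, which stores 2**n amplitudes."""
    return (2 ** n_qubits) * COMPLEX128_BYTES


def mps_bytes_upper_bound(n_qubits: int, chi_max: int) -> int:
    """Upper bound on MPS storage with every bond capped at chi_max.

    Site k holds a tensor of shape (left bond, 2, right bond). A bond that
    separates k qubits from n - k qubits never needs more than
    min(2**k, 2**(n - k)) entries, and truncation caps it at chi_max.
    """
    def bond(k: int) -> int:
        return min(2 ** k, 2 ** (n_qubits - k), chi_max)

    elements = sum(bond(k) * 2 * bond(k + 1) for k in range(n_qubits))
    return elements * COMPLEX128_BYTES


if __name__ == "__main__":
    for n in (30, 50):
        sv = state_vector_bytes(n)
        mps = mps_bytes_upper_bound(n, chi_max=64)
        print(f"{n} qubits: state vector {sv} bytes, MPS (chi_max=64) <= {mps} bytes")
```

Under these assumptions a 30-qubit state vector already needs 16 GiB, while the MPS bound grows only linearly with the qubit count for a fixed $\chi_{max}$. The bound is meaningful only when the circuit's entanglement lets a moderate $\chi_{max}$ preserve accuracy.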
