Software for Creating Scalable Benchmarks from Quantum Algorithms
Noah Siekierski, Stefan Seritan, Neer Patel, Siyuan Niu, Thomas Lubinski, Timothy Proctor
TL;DR
The scarab software tool converts arbitrary quantum circuits into scalable, reliable benchmarks that measure process fidelity, estimated with techniques such as mirror circuit fidelity estimation (MCFE). Built as a module in pyGSTi, scarab supports low-level, full-stack, and subcircuit benchmarks, enabling robust evaluation even for circuits with thousands or millions of qubits. Simulations and experiments covering Hamiltonian-simulation tasks, compiler testing, and subcircuit extrapolations show that scarab estimates fidelity accurately and yields actionable insights into trade-offs between hardware and algorithms. The work delivers open-source tooling that standardizes scalable benchmark design and analysis for contemporary and future quantum architectures.
Abstract
Creating scalable, reliable, and well-motivated benchmarks for quantum computers is challenging: straightforward approaches to benchmarking suffer from exponential scaling, are insensitive to important errors, or use poorly motivated performance metrics. Furthermore, curated benchmarking suites cannot include every interesting quantum circuit or algorithm, which necessitates a tool that enables the easy creation of new benchmarks. In this work, we introduce a software tool for creating scalable and reliable benchmarks that measure a well-motivated performance metric (process fidelity) from user-chosen quantum circuits and algorithms. Our software, called $\texttt{scarab}$, enables the creation of efficient and robust benchmarks even from circuits containing thousands or millions of qubits, by employing efficient fidelity estimation techniques, including mirror circuit fidelity estimation and subcircuit volumetric benchmarking. $\texttt{scarab}$ provides a simple interface that enables the creation of reliable benchmarks by users who are not experts in the theory of quantum computer benchmarking or noise. We demonstrate the flexibility and power of $\texttt{scarab}$ by using it to turn existing inefficient benchmarks into efficient benchmarks, to create benchmarks that interrogate hardware and algorithmic trade-offs in Hamiltonian simulation, to quantify the in-situ efficacy of approximate circuit compilation, and to create benchmarks that use subcircuits to measure progress towards executing a circuit of interest.
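To make the idea of mirror circuit fidelity estimation concrete, the following is a minimal Python sketch of the post-processing step only. It does not use the $\texttt{scarab}$ or pyGSTi API, and every function name in it is hypothetical. It assumes the commonly described form of the estimator, in which the polarization of a circuit is approximated by the square root of the ratio of the polarization observed on mirror circuits to that observed on SPAM reference circuits, and it uses the standard relation between polarization $\gamma$ and process fidelity $F$ on $n$ qubits, $F = \gamma + (1-\gamma)/4^n$.

```python
import math


def fidelity_from_polarization(polarization: float, n_qubits: int) -> float:
    """Convert polarization gamma to process fidelity F on n qubits.

    Uses F = gamma + (1 - gamma) / 4**n, written with a negative exponent
    so that very wide circuits do not overflow a float.
    """
    inv_d2 = 4.0 ** (-n_qubits)
    return polarization + (1.0 - polarization) * inv_d2


def polarization_from_fidelity(fidelity: float, n_qubits: int) -> float:
    """Inverse of fidelity_from_polarization."""
    inv_d2 = 4.0 ** (-n_qubits)
    return (fidelity - inv_d2) / (1.0 - inv_d2)


def estimate_fidelity(mirror_pol: float, reference_pol: float, n_qubits: int) -> float:
    """Hypothetical helper: fidelity estimate from two observed polarizations.

    ASSUMPTION: the circuit's polarization is approximated by
    sqrt(mirror_pol / reference_pol), where mirror_pol comes from circuits
    that run the target circuit followed by its inverse, and reference_pol
    comes from circuits that capture only state-preparation and
    measurement error.
    """
    if reference_pol <= 0.0:
        raise ValueError("reference polarization must be positive")
    # Statistical noise can push the ratio slightly below zero; clip it.
    ratio = max(mirror_pol / reference_pol, 0.0)
    return fidelity_from_polarization(math.sqrt(ratio), n_qubits)


if __name__ == "__main__":
    # Illustrative numbers only, not data from the paper.
    print(estimate_fidelity(mirror_pol=0.7695, reference_pol=0.95, n_qubits=2))
```

The square root reflects that a mirror circuit applies the target circuit and then its inverse, so the circuit's error is incurred roughly twice; dividing by the reference polarization removes the contribution of state preparation and measurement.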
