Simulating magic state cultivation with few Clifford terms
Kwok Ho Wan, Zhenghao Zhong
TL;DR
The paper develops cutting-based stabiliser decomposition techniques to simulate $d=5$ magic state cultivation circuits with a small average Clifford-term count ($\approx 8$) under realistic edge Pauli errors, avoiding the term growth (millions of terms) inherent in magic cat state stabiliser decompositions. By combining spider cutting, BSS decomposition, and optimised numerical pipelines integrated with tsim, the authors demonstrate near-constant term overhead for the $d=3$ circuit and outline a feasible path to exact Monte Carlo estimates for $d=5$ without substituting non-Clifford gates. The work introduces several optimisation layers (parametric ZX-diagrams, split f/m row-sums, BLAS-accelerated computations, LUT-based arithmetic, and full-program JIT compilation) that together achieve high-throughput simulation (up to $\sim 132$k shots/s on a laptop) and provide end-to-end strategies for evaluating logical error rates in near-Clifford, high-$T$-count quantum circuits. These methods offer practical insight into the simulability of operationally relevant quantum circuits with internal structure and lay the groundwork for applying cutting-based stabiliser decompositions to larger quantum-error-corrected architectures.
Abstract
Building upon [arXiv:2509.01224], we present several methods for simulating the non-Clifford $d=5$ magic state cultivation circuits [arXiv:2409.17595] with a sum of $\approx 8$ Clifford ZX-diagrams on average, at $0.1\%$ noise. Compared to a magic cat state stabiliser decomposition of all $53$ non-Clifford spiders ($6{,}377{,}292$ terms required), this is a more than $7 \times 10^{5}$-fold reduction in the number of terms. Our stabiliser decomposition has the advantage of representing the final non-Clifford state (in the presence of circuit errors) as a sum of Clifford ZX-diagrams. This will be useful in simulating the escape stage of magic state cultivation, where one needs to port the resultant state of cultivation into a larger Clifford circuit with many more qubits. Even then, only $\approx 8$ Clifford terms need to be tracked. Our result sheds light on the simulability of operationally relevant, high-$T$-count quantum circuits with some internal structure. Finally, we provide numerical results for full non-Clifford stabiliser rank simulation based on $\mathtt{tsim}$, along with optimisations using our cutting decompositions. Nearly $132{,}400$ shots per second can be obtained on a laptop for the smaller $d = 3$ circuits at SD6 circuit-level noise $p=0.0005$, making it only $\sim 34$ times slower than its fully Clifford proxy simulation via $\mathtt{stim}$ using $S$ gates.
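As a quick sanity check on the headline figures, the short Python sketch below recomputes the term reduction and the $\mathtt{stim}$ throughput implied by the quoted $\sim 34\times$ slowdown. It uses only the numbers stated in the abstract; the variable and function names are illustrative assumptions, and the implied $\mathtt{stim}$ rate is a derived estimate rather than a figure reported by the authors.

```python
# Figures quoted in the abstract (the average term count and slowdown are approximate).
MAGIC_CAT_TERMS = 6_377_292   # magic cat state decomposition of all 53 non-Clifford spiders
CUTTING_TERMS_AVG = 8         # approximate average Clifford-term count at 0.1% noise
TSIM_SHOTS_PER_S = 132_400    # d = 3 circuits, SD6 circuit-level noise, p = 0.0005
SLOWDOWN_VS_STIM = 34         # approximate slowdown relative to the stim proxy


def reduction_factor(baseline_terms: int, reduced_terms: int) -> float:
    """Ratio of term counts between two stabiliser decompositions."""
    return baseline_terms / reduced_terms


def implied_proxy_throughput(shots_per_s: int, slowdown: int) -> int:
    """Throughput of the faster proxy simulation implied by a slowdown factor."""
    return shots_per_s * slowdown


factor = reduction_factor(MAGIC_CAT_TERMS, CUTTING_TERMS_AVG)
stim_rate = implied_proxy_throughput(TSIM_SHOTS_PER_S, SLOWDOWN_VS_STIM)

print(f"term reduction: {factor:.3g}x")
print(f"implied stim proxy rate: {stim_rate:.3g} shots/s")
```

The ratio comes out just under $8 \times 10^{5}$, consistent with the stated "more than $7 \times 10^{5}$" reduction, and the implied proxy rate is on the order of a few million shots per second.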
