Spacetime-Efficient Low-Depth Quantum State Preparation with Applications
Kaiwen Gui, Alexander M. Dalzell, Alessandro Achille, Martin Suchara, Frederic T. Chong
TL;DR
The paper introduces SP+CSP, a deterministic, garbage-free quantum state-preparation method. Compiled to arbitrary single-qubit (U(2)) and CNOT gates, it prepares an n-qubit state of dimension N = 2^n in depth Θ(n) = Θ(log N) with spacetime allocation (SA) Θ(2^n) = Θ(N), both of which are optimal. The method partitions the amplitude encoding into an SP stage followed by a CSP stage. In the discrete {H, S, T, CNOT} gate set, where the state is prepared up to error ε, it reaches depth O(log N + log(1/ε)) and SA O(N log(log(N)/ε)), asymptotically better than previous methods. A key innovation is the set of LOADF, SPF, and FLAG subroutines, which allow efficient parallel angle injection and disentanglement; because many ancilla qubits need not stay active for the whole circuit, the ancilla footprint over time shrinks, and state preparation can be repeated and batched rapidly. With a shared pool of O(N) ancilla qubits, the framework prepares a product of w independent N-dimensional states in depth O(w + log N) rather than O(w log N), which is effectively constant depth per state. The applications highlighted are quantum machine learning, Hamiltonian simulation via LCU, and quantum linear system solvers with block-encodings. The authors provide gate-level circuit descriptions, pseudocode, Braket implementations, and public-code resources to facilitate adoption and benchmarking across fault-tolerant and near-term architectures.
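To give a feel for what "angle injection" has to load, the sketch below computes the textbook binary-tree rotation angles for a real, non-negative amplitude vector and checks that they reproduce the target state. This is a generic construction shown for illustration only; it is not the paper's SP+CSP circuit or its LOADF, SPF, and FLAG subroutines, and the function names `rotation_angles` and `reconstruct` are hypothetical.

```python
import math

def rotation_angles(amplitudes):
    """Binary-tree Ry angles for a real, non-negative amplitude vector.

    Assumption: len(amplitudes) is a power of two and the vector is normalized.
    Returns a list of levels, root first; level k holds 2**k angles.
    """
    size = len(amplitudes)
    assert size >= 2 and size & (size - 1) == 0, "length must be a power of two"
    level = [a * a for a in amplitudes]  # probabilities at the leaves
    angles = []
    while len(level) > 1:
        thetas, parents = [], []
        for i in range(0, len(level), 2):
            left, right = level[i], level[i + 1]
            # cos(theta/2) = sqrt(left / (left + right)); atan2(0, 0) is 0 in Python
            thetas.append(2.0 * math.atan2(math.sqrt(right), math.sqrt(left)))
            parents.append(left + right)
        angles.append(thetas)
        level = parents
    angles.reverse()  # put the root level first
    return angles

def reconstruct(angles):
    """Classically replay the angle tree to recover the amplitudes."""
    amps = [1.0]
    for thetas in angles:
        nxt = []
        for a, t in zip(amps, thetas):
            nxt.append(a * math.cos(t / 2.0))
            nxt.append(a * math.sin(t / 2.0))
        amps = nxt
    return amps

target = [math.sqrt(p) for p in (0.1, 0.2, 0.3, 0.4)]
tree = rotation_angles(target)
recovered = reconstruct(tree)
num_angles = sum(len(level) for level in tree)
```

For an N-dimensional state the tree holds N - 1 angles, which matches the intuition that any exact method must handle on the order of N classical parameters.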
Abstract
We propose a novel deterministic method for preparing arbitrary quantum states. When our protocol is compiled into CNOT and arbitrary single-qubit gates, it prepares an $N$-dimensional state in depth $O(\log(N))$ and spacetime allocation (a metric that accounts for the fact that oftentimes some ancilla qubits need not be active for the entire circuit) $O(N)$, which are both optimal. When compiled into the $\{\mathrm{H,S,T,CNOT}\}$ gate set, we show that it requires asymptotically fewer quantum resources than previous methods. Specifically, it prepares an arbitrary state up to error $\epsilon$ with optimal depth of $O(\log(N) + \log(1/\epsilon))$ and spacetime allocation $O(N\log(\log(N)/\epsilon))$, improving over $O(\log(N)\log(\log(N)/\epsilon))$ and $O(N\log(N/\epsilon))$, respectively. We illustrate how the reduced spacetime allocation of our protocol enables rapid preparation of many disjoint states with only constant-factor ancilla overhead: $O(N)$ ancilla qubits are reused efficiently to prepare a product state of $w$ $N$-dimensional states in depth $O(w + \log(N))$ rather than $O(w\log(N))$, achieving effectively constant depth per state. We highlight several applications where this ability would be useful, including quantum machine learning, Hamiltonian simulation, and solving linear systems of equations. We provide quantum circuit descriptions of our protocol, detailed pseudocode, and gate-level implementation examples using Braket.
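The asymptotic bounds quoted in the abstract can be compared numerically with a short sketch. It assumes all hidden constants equal 1 and uses base-2 logarithms, so the numbers indicate relative scaling only, not actual gate counts; the function names are hypothetical.

```python
import math

def depth_this_work(N, eps):
    # O(log N + log(1/eps)), constant factor assumed to be 1
    return math.log2(N) + math.log2(1.0 / eps)

def depth_prior(N, eps):
    # O(log N * log(log(N)/eps)), constant factor assumed to be 1
    return math.log2(N) * math.log2(math.log2(N) / eps)

def spacetime_this_work(N, eps):
    # O(N log(log(N)/eps)), constant factor assumed to be 1
    return N * math.log2(math.log2(N) / eps)

def spacetime_prior(N, eps):
    # O(N log(N/eps)), constant factor assumed to be 1
    return N * math.log2(N / eps)

def multi_copy_depth(w, N, reuse_ancillas=True):
    # O(w + log N) with ancilla reuse versus O(w log N) for back-to-back runs
    return w + math.log2(N) if reuse_ancillas else w * math.log2(N)

N, eps, w = 1024, 2.0 ** -10, 100
print("depth:", depth_this_work(N, eps), "vs", round(depth_prior(N, eps), 1))
print("spacetime:", round(spacetime_this_work(N, eps)), "vs", round(spacetime_prior(N, eps)))
print("multi-copy depth:", multi_copy_depth(w, N), "vs", multi_copy_depth(w, N, reuse_ancillas=False))
```

Under these assumptions, with N = 1024 and w = 100, the multi-copy depth is 110 with ancilla reuse against 1000 without, which is the sense in which the per-state depth becomes effectively constant as w grows.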
