QASER: Breaking the Depth vs. Accuracy Trade-Off for Quantum Architecture Search

Ioana Moflic, Alexandru Paler, Akash Kundu

TL;DR

The paper tackles the depth–accuracy trade-off in quantum circuit design by introducing QASER, an exponential, multi-objective reward for reinforcement-learning–based quantum architecture search. By tracking the historical maxima of depth and gate cost and coupling them with energy via an exponent, QASER steers the search toward circuits that are simultaneously shallow, resource-efficient, and accurate. Empirical results on quantum chemistry ground-state preparation show up to 20% fewer 2-qubit gates, reduced circuit depth, and up to 50% gains in accuracy compared to state-of-the-art RL-QAS methods, under both noisy and noiseless conditions, with further acceleration observed in warm-start TensorRL-QAS. The findings highlight reward engineering as a potent lever to improve hardware-aware quantum compilation, suggesting practical gains for post-NISQ implementations and scalable QAS workflows.
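The mechanism summarized above (running maxima of depth and gate cost, combined with energy inside an exponent) can be illustrated with a toy sketch. This is not the paper's reward equation: the class name `ToyExponentialReward`, the normalization by running maxima, and the unweighted sum in the exponent are all assumptions made here purely for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class ToyExponentialReward:
    """Hypothetical exponential multi-objective reward (illustration only)."""

    target_energy: float  # reference ground-state energy (assumed known)
    max_depth: int = 1    # largest depth seen so far
    max_gates: int = 1    # largest 2-qubit gate count seen so far

    def __call__(self, energy: float, depth: int, two_qubit_gates: int) -> float:
        # Update the historical maxima used to normalize the cost terms.
        self.max_depth = max(self.max_depth, depth)
        self.max_gates = max(self.max_gates, two_qubit_gates)

        energy_error = abs(energy - self.target_energy)
        depth_ratio = depth / self.max_depth
        gate_ratio = two_qubit_gates / self.max_gates

        # Summing inside the exponent multiplies the individual penalties,
        # so the reward is high only when all three terms are small at once.
        return math.exp(-(energy_error + depth_ratio + gate_ratio))


reward = ToyExponentialReward(target_energy=-1.0)
reward(energy=-0.5, depth=10, two_qubit_gates=8)          # sets the maxima
shallow_accurate = reward(energy=-0.99, depth=4, two_qubit_gates=2)
deep_inaccurate = reward(energy=-0.6, depth=10, two_qubit_gates=8)
print(shallow_accurate > deep_inaccurate)
```

In this toy version a circuit that is both shallow and close to the target energy scores higher than a deep, inaccurate one, which is the qualitative behaviour the summary attributes to QASER. The actual functional form and weighting are defined by the paper's equations.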

Abstract

Quantum computing faces a key challenge: balancing the need for low circuit depth (crucial for fault tolerance) with the high accuracy required for complex computations like quantum chemistry and error correction, which typically require deeper circuits. We overcome this trade-off by introducing a novel reinforcement learning approach featuring engineered reward functions, called QASER, that take into account seemingly contradictory optimization goals. This reward enables the compilation of circuits with lower depth and higher accuracy, significantly outperforming state-of-the-art techniques. Benchmarks on quantum chemistry state preparation circuits demonstrate stable compilations. We achieve up to 50% improved accuracy, while reducing 2-qubit gate counts and depths by 20%. This advancement enables more efficient and reliable quantum compilation.

Paper Structure

This paper contains 15 sections, 7 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Our QASER exponential reward function is denser than those commonly used in the RL literature for QAS (e.g. the green surface). QASER (the orange meshed surface, e.g. Eq. \ref{eq:exp_reward_orig}) captures multiple costs, offering a carefully tailored reward signal to the RL agent. The exponential nature of QASER ensures that the best reward signals go to circuits that exhibit low depth and low energy simultaneously.
  • Figure 2: QASER vs CRLQAS. (a) QASER outperforms CRLQAS in the realistic noisy scenario in finding the ground state of the $6$-$\texttt{LiH}$ and $8$-$\texttt{H}_2\texttt{O}$ molecules. (b) QASER exhibits accelerated reward accumulation, converging to approximately $10^{4}$ reward units, while CRLQAS demonstrates more gradual learning dynamics, reaching a plateau at approximately $10^{3}$. QASER shows faster convergence and a better reward signal than CRLQAS in finding the ground state of a $10$-$\texttt{H}_2\texttt{O}$ molecule.
  • Figure 3: A typical episode in TensorRL with two rewards: QASER (Eq. \ref{eq:exponential_reward_increment}) and the linear reward (Eq. \ref{eq:linear_reward_increment}). QASER achieves faster and more stable convergence to low energy estimates, even though it does not start from the MPS state.