Quantum circuit optimization with deep reinforcement learning
Thomas Fösel, Murphy Yuezhen Niu, Florian Marquardt, Li Li
TL;DR
The paper tackles hardware-aware quantum circuit optimization (QCO) for noisy intermediate-scale quantum (NISQ) devices with a deep reinforcement learning (RL) framework in which circuit transformations are the actions of an RL environment. The agent is a deep convolutional network trained with proximal policy optimization (PPO), an advantage actor-critic (AAC) method. It selects soft transformations, which are followed by a pruning step of hard transformations, and it is guided by a reward function that penalizes circuit depth and gate count. On 12-qubit random circuits, the method achieves average reductions of about 27% in depth and 15% in gate count. It also extrapolates to larger circuits (up to 50 qubits) and is applied to QAOA-MaxCut. Compared with simulated annealing, RL offers faster optimization after training and can generalize to architectures not seen during training, suggesting a practical path to hardware-aware QCO for near-term quantum devices.
Abstract
A central aspect for operating future quantum computers is quantum circuit optimization, i.e., the search for efficient realizations of quantum algorithms given the device capabilities. In recent years, powerful approaches have been developed which focus on optimizing the high-level circuit structure. However, these approaches do not consider and thus cannot optimize for the hardware details of the quantum architecture, which is especially important for near-term devices. To address this point, we present an approach to quantum circuit optimization based on reinforcement learning. We demonstrate how an agent, realized by a deep convolutional neural network, can autonomously learn generic strategies to optimize arbitrary circuits on a specific architecture, where the optimization target can be chosen freely by the user. We demonstrate the feasibility of this approach by training agents on 12-qubit random circuits, where we find on average a depth reduction by 27% and a gate count reduction by 15%. We examine the extrapolation to larger circuits than used for training, and envision how this approach can be utilized for near-term quantum devices.
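To make the idea of a user-chosen optimization target concrete, the following is a minimal sketch of a cost function that weights circuit depth against gate count, with the reward for a transformation taken as the resulting drop in cost. It is an illustration only, not the authors' implementation: the gate representation, the function names `circuit_depth`, `cost`, and `reward`, the weights `w_depth` and `w_gates`, and the assumption that every gate takes one time step are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical circuit representation: an ordered list of (gate name, qubits acted on).
Gate = Tuple[str, Tuple[int, ...]]


def circuit_depth(gates: List[Gate]) -> int:
    """Depth under as-soon-as-possible scheduling.

    Assumption: every gate takes one time step, and a gate starts
    as soon as all qubits it acts on are free.
    """
    busy_until: Dict[int, int] = {}
    for _, qubits in gates:
        layer = 1 + max((busy_until.get(q, 0) for q in qubits), default=0)
        for q in qubits:
            busy_until[q] = layer
    return max(busy_until.values(), default=0)


def cost(gates: List[Gate], w_depth: float = 1.0, w_gates: float = 1.0) -> float:
    """Weighted sum of depth and gate count; the weights encode the user's target."""
    return w_depth * circuit_depth(gates) + w_gates * len(gates)


def reward(before: List[Gate], after: List[Gate], **weights: float) -> float:
    """Reward for one transformation step: how much the cost decreased."""
    return cost(before, **weights) - cost(after, **weights)


# Example: cancelling two adjacent Hadamard gates on qubit 0.
before = [("H", (0,)), ("H", (0,)), ("CZ", (0, 1)), ("X", (1,))]
after = [("CZ", (0, 1)), ("X", (1,))]
step_reward = reward(before, after)
depth_only_reward = reward(before, after, w_depth=1.0, w_gates=0.0)
```

Setting `w_gates=0.0` makes the sketch optimize depth alone, and setting `w_depth=0.0` makes it optimize gate count alone. A hardware-aware target would replace the unit gate duration with per-gate costs for the device at hand.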
