Hybrid-Action Reinforcement Learning for Quantum Architecture Search
Jiayang Niu, Yan Wang, Jie Li, Ke Deng, Azadeh Alavi, Mark Sanderson, Yongli Ren
TL;DR
This work tackles automated quantum architecture search for variational quantum circuits in the VQE setting by introducing HyRLQAS, a hybrid-action reinforcement learning framework that jointly learns discrete gate placement and continuous parameter initialization. The method uses a tensor-based circuit encoding and a Hybrid Policy Network that outputs a discrete gate-placement action, an initial parameter for the newly placed gate, and a refinement update for previously placed gates. The policy is trained with REINFORCE under a curriculum-driven reward based on energy reduction. Key contributions include the unified hybrid action space, a learned warm-start mechanism for external optimizers, and extensive ablations showing gains from both topology and initialization. Experiments on molecular Hamiltonians show improved ground-state energy accuracy and more compact circuits, indicating a principled path toward automated, hardware-efficient quantum circuit design in the NISQ era.
Abstract
Designing expressive yet trainable quantum circuit architectures remains a major challenge for variational quantum algorithms, as manual or heuristic designs often yield suboptimal performance. We propose HyRLQAS (Hybrid-Action Reinforcement Learning for Quantum Architecture Search), a unified framework that integrates discrete gate placement and continuous parameter generation within a hybrid action space. Unlike existing approaches that optimize circuit structure and parameters separately, HyRLQAS jointly learns both topology and initialization while dynamically refining previously placed gates through reinforcement learning. Trained in a variational quantum eigensolver (VQE) environment, the agent autonomously constructs circuits that minimize molecular ground-state energy. Experimental results demonstrate that HyRLQAS achieves consistently lower energy errors and more compact circuit structures compared with discrete-only and continuous-only baselines. Furthermore, the hybrid action space yields superior parameter initializations, producing post-optimization energy distributions with consistently lower minima. These findings suggest that hybrid-action reinforcement learning offers a principled pathway toward automated and hardware-efficient quantum circuit design.
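To make the idea of a hybrid action space concrete, the following is a minimal sketch, not the authors' implementation. It assumes a categorical distribution over candidate gate placements and a Gaussian over the initial rotation angle, and it sums the two log-probabilities into the single term that a REINFORCE update would weight by the return. The names `sample_hybrid_action`, `gate_logits`, `init_mean`, and `init_std` are hypothetical, and the refinement head for previously placed gates is omitted; under the same assumptions it would contribute one more Gaussian log-probability term to the sum.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_log_prob(x, mean, std):
    """Log-density of a univariate normal distribution."""
    return (-0.5 * ((x - mean) / std) ** 2
            - math.log(std)
            - 0.5 * math.log(2 * math.pi))


def sample_hybrid_action(gate_logits, init_mean, init_std, rng):
    """Sample one hybrid action (assumed form, for illustration only).

    gate_logits: one score per candidate gate placement (discrete head).
    init_mean:   one mean angle per candidate (continuous head).
    init_std:    shared standard deviation of the angle distribution.
    Returns (gate index, initial angle, joint log-probability).
    """
    probs = softmax(gate_logits)
    gate = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    theta = rng.gauss(init_mean[gate], init_std)
    # Joint log-prob = discrete part + continuous part.
    log_prob = (math.log(probs[gate])
                + gaussian_log_prob(theta, init_mean[gate], init_std))
    return gate, theta, log_prob


def discounted_returns(rewards, gamma=0.99):
    """Returns G_t that weight each step's log-prob in REINFORCE."""
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    return returns[::-1]


if __name__ == "__main__":
    rng = random.Random(0)
    gate, theta, log_prob = sample_hybrid_action(
        [0.1, 2.0, -1.0], [0.0, 0.5, 1.0], 0.2, rng)
    print(gate, theta, log_prob)
```

In this sketch the REINFORCE loss for an episode would be the negative sum of `log_prob * G_t` over steps, with per-step rewards derived from energy reduction as the abstract describes; the exact reward shaping and curriculum are not specified here and are left out.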
