Differentiable Quantum Architecture Search in Asynchronous Quantum Reinforcement Learning
Samuel Yen-Chi Chen
TL;DR
The paper tackles the challenge of designing effective quantum circuit architectures for quantum reinforcement learning (QRL), where hand-crafted variational quantum circuit (VQC) designs demand deep quantum expertise. It introduces Differentiable Quantum Architecture Search (DiffQAS) integrated with asynchronous QA3C training, enabling gradient-based optimization of both the circuit parameters $\vec{\theta}$ and the structure weights $w_j$ in the quantum function $F(\vec{x};\vec{\theta})=\sum_j w_j f_j(\vec{x};\vec{\theta}_j)$, which is composed as $Q(\vec{s};\Theta)=G_\eta \circ F_{\vec{\theta}} \circ H_\delta(\vec{s})$ to produce the Q-function. Empirical results on MiniGrid environments show that DiffQAS-QRL reaches performance comparable to or better than manually designed VQC configurations while training more stably, particularly on harder tasks where some handcrafted baselines fail to learn. Overall, the approach automates quantum circuit design for QRL, reducing reliance on specialized quantum engineering and enabling broader, task-agnostic application of QRL methods.
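The weighted sum $F=\sum_j w_j f_j$ above can be illustrated with a minimal sketch. It is not the paper's implementation: the two candidate functions are hypothetical classical stand-ins for VQC outputs, each takes a single scalar parameter, and the names `candidate_a`, `candidate_b`, `ensemble`, and `grad_wrt_weights` are invented for illustration. The point it shows is that $F$ is linear in the structure weights, so $\partial F/\partial w_j = f_j$ and the weights can be trained by gradient descent alongside the circuit parameters.

```python
import math

# Hypothetical stand-ins for the candidate circuits f_j. In the paper each f_j
# would be the output of a variational quantum circuit; simple classical
# functions are used here so the sketch runs without a quantum simulator.
def candidate_a(x, theta):
    return [math.cos(xi + theta) for xi in x]

def candidate_b(x, theta):
    return [math.sin(xi * theta) for xi in x]

CANDIDATES = [candidate_a, candidate_b]

def ensemble(x, thetas, weights):
    """Compute F(x; theta) = sum_j w_j * f_j(x; theta_j)."""
    out = [0.0] * len(x)
    for f, theta, w in zip(CANDIDATES, thetas, weights):
        for i, value in enumerate(f(x, theta)):
            out[i] += w * value
    return out

def grad_wrt_weights(x, thetas):
    """Return dF/dw_j = f_j(x; theta_j); F is linear in the structure weights."""
    return [f(x, theta) for f, theta in zip(CANDIDATES, thetas)]

x = [0.0, 1.0]          # example input vector
thetas = [0.0, 0.0]     # one (assumed scalar) parameter per candidate
weights = [0.5, 2.0]    # structure weights w_j
F = ensemble(x, thetas, weights)
grads = grad_wrt_weights(x, thetas)
```

With `thetas = [0.0, 0.0]` the second candidate evaluates to zero everywhere, so `F` is simply half of the first candidate's output, whatever the second weight is.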
Abstract
The emergence of quantum reinforcement learning (QRL) is propelled by advancements in quantum computing (QC) and machine learning (ML), particularly through quantum neural networks (QNN) built on variational quantum circuits (VQC). These advancements have proven successful in addressing sequential decision-making tasks. However, constructing effective QRL models demands significant expertise due to challenges in designing quantum circuit architectures, including data encoding and parameterized circuits, which profoundly influence model performance. In this paper, we propose addressing this challenge with differentiable quantum architecture search (DiffQAS), enabling trainable circuit parameters and structure weights using gradient-based optimization. Furthermore, we enhance training efficiency through asynchronous reinforcement learning (RL) methods that facilitate parallel training. Through numerical simulations, we demonstrate that our proposed DiffQAS-QRL approach achieves performance comparable to manually crafted circuit architectures across the considered environments, showcasing stability across diverse scenarios. This methodology offers a pathway for designing QRL models without extensive quantum knowledge, ensuring robust performance and fostering broader application of QRL.
