An Introduction to Quantum Reinforcement Learning (QRL)
Samuel Yen-Chi Chen
TL;DR
This article surveys quantum reinforcement learning (QRL), in which variational quantum circuits (VQCs) represent policy and value functions within hybrid quantum–classical RL pipelines. It covers a range of approaches, including quantum deep Q-learning, quantum policy gradients, quantum recurrent policies (QLSTM), quantum fast weight programmers, and differentiable quantum architecture search, and illustrates how VQCs can replace or augment classical networks in RL setups. The discussion highlights practical challenges in the noisy intermediate-scale quantum (NISQ) era, such as limited qubit counts, noise, data encoding for large observations, and trainability concerns, and describes strategies such as tensor-network compressors, reservoir computing, and asynchronous training to mitigate them. The article concludes that QRL has the potential to improve sample efficiency, scalability, and memory in sequential decision tasks, and it outlines open questions and directions for advancing the field.
Abstract
Recent advancements in quantum computing (QC) and machine learning (ML) have sparked considerable interest in the integration of these two cutting-edge fields. Among the various ML techniques, reinforcement learning (RL) stands out for its ability to address complex sequential decision-making problems, and it has already demonstrated substantial success in classical ML. The emerging field of quantum reinforcement learning (QRL) seeks to enhance RL algorithms by incorporating principles from quantum computing. This paper offers an introduction to this area for the broader AI and ML community.
