Quantum framework for Reinforcement Learning: Integrating Markov decision process, quantum arithmetic, and trajectory search

Thet Htar Su; Shaswot Shresthamali; Masaaki Kondo

TL;DR

This work addresses the computational bottlenecks of classical reinforcement learning by introducing a fully quantum framework that encodes an MDP in quantum registers and performs agent-environment interactions, return computation, and trajectory search entirely in the quantum domain. The method comprises a quantum representation of the state space $S$ and action space $A$, quantum state transitions implemented as $R_y$ rotations conditioned on state-action pairs, quantum aggregation of returns, and a Grover-based trajectory search that identifies high-return paths with a single oracle call. Demonstrations on a four-state, two-action MDP show that the quantum model reproduces the classical dynamics and that Grover's search recovers optimal trajectories consistent with classical Q-learning results, with indications of improved sample efficiency and speed. The findings suggest a viable path toward quantum-native RL with potential impact on autonomous systems, healthcare, and finance, while highlighting two practical challenges: scaling quantum resources, and searching when the target return is not known in advance.
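The two quantum building blocks named above can be illustrated with a small classical simulation. This is a minimal sketch under stated assumptions, not the authors' circuit: it assumes a single target qubit whose $|1\rangle$ outcome stands for one next state, a hypothetical transition probability of 0.7, and a toy search space of four candidate trajectories with one marked entry. It relies on two standard facts: $R_y(\theta)|0\rangle = \cos(\theta/2)|0\rangle + \sin(\theta/2)|1\rangle$, so $\theta = 2\arcsin\sqrt{p}$ yields $|1\rangle$ with probability $p$, and one Grover iteration over four items with a single marked item finds it with certainty.

```python
import numpy as np

def ry(theta):
    """Single-qubit R_y(theta) rotation matrix."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def angle_for_probability(p):
    """Angle theta such that R_y(theta)|0> measures |1> with probability p."""
    return 2 * np.arcsin(np.sqrt(p))

# Hypothetical transition probability for one (state, action) pair.
p_next = 0.7
ket0 = np.array([1.0, 0.0])
amplitudes = ry(angle_for_probability(p_next)) @ ket0
transition_probs = amplitudes ** 2  # amplitudes are real here

def grover_iteration(state, marked):
    """One Grover iteration: oracle phase flip, then inversion about the mean."""
    state = state.copy()
    state[marked] *= -1               # oracle: flip the sign of the marked item
    return 2 * state.mean() - state   # diffusion operator 2|s><s| - I

# Toy search space: 4 candidate trajectories, index 2 marked (illustrative).
uniform = np.full(4, 0.5)
grover_probs = grover_iteration(uniform, marked=2) ** 2
```

Here `transition_probs` holds the probabilities of the two measurement outcomes of the rotated qubit, and `grover_probs` holds the measurement distribution after one oracle call. For larger search spaces with $N$ candidates and $M$ marked items, roughly $(\pi/4)\sqrt{N/M}$ iterations are needed, so the one-call behaviour shown here is specific to the four-item case.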

Abstract

This paper introduces a quantum framework for reinforcement learning (RL) tasks, grounded in quantum principles and built on a fully quantum model of the classical Markov decision process (MDP). By employing quantum concepts and a quantum search algorithm, this work implements and optimizes the agent-environment interactions entirely within the quantum domain, eliminating reliance on classical computations. Key contributions include quantum-based state transitions, return calculation, and a trajectory search mechanism, which together demonstrate that RL processes can be realized through quantum phenomena. The implementation emphasizes the fundamental role of quantum superposition in enhancing computational efficiency for RL tasks. Results demonstrate the capacity of a quantum model to achieve quantum enhancement in RL, highlighting the potential of fully quantum implementations in decision-making tasks. This work not only underscores the applicability of quantum computing in machine learning but also contributes to the field of quantum reinforcement learning (QRL) by offering a robust framework for understanding and exploiting quantum computing in RL systems.
