DRLQ: A Deep Reinforcement Learning-based Task Placement for Quantum Cloud Computing
Hoa T. Nguyen, Muhammad Usman, Rajkumar Buyya
TL;DR
DRLQ addresses task placement in quantum cloud computing by formulating it as a deep reinforcement learning problem and solving it with Rainbow DQN. The framework models QNodes and QTasks as an MDP, where states encode heterogeneous resource traits, actions assign tasks to nodes, and rewards balance completion time against rescheduling penalties. Using QSimPy with synthetic QTask data from MQTBench, DRLQ shows substantial improvements in makespan and a dramatic reduction in rescheduling compared with heuristics, demonstrating the viability of DRL-based quantum cloud resource management. The work lays a foundation for integrating quantum-specific factors (e.g., error rates, circuit transpilation) in future DRL-based scheduling. The key quantities are the execution time $ t^{exec}_{\theta_i} = \dfrac{\theta_i^d \times \theta_i^s}{q_j^s} $, the completion time $ t_{\theta_i} = t^{wait}_{\theta_i} + t^{exec}_{\theta_i} $, and the objective $ \Omega(\Xi) = \min \sum_{i=1}^n t_{\theta_i} $.
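The completion-time formulas in the TL;DR can be made concrete with a small worked sketch. It is not the authors' implementation and does not use QSimPy. The class and function names are hypothetical, and the symbol readings are assumptions taken from the notation alone: $\theta_i^d$ is read as the circuit depth of task $i$, $\theta_i^s$ as its number of shots, and $q_j^s$ as the processing speed of node $j$ in circuit layers per second. The sketch only evaluates the summed completion time for one fixed placement; the minimization over placements is what the DRL agent is meant to approximate.

```python
from dataclasses import dataclass


@dataclass
class QTask:
    depth: int          # theta_i^d, assumed to be circuit depth (layers)
    shots: int          # theta_i^s, assumed to be the number of shots
    wait: float = 0.0   # t^wait, time spent waiting before execution (seconds)


@dataclass
class QNode:
    speed: float        # q_j^s, assumed to be layers processed per second


def exec_time(task: QTask, node: QNode) -> float:
    """t^exec = (depth * shots) / node speed."""
    return task.depth * task.shots / node.speed


def completion_time(task: QTask, node: QNode) -> float:
    """t = t^wait + t^exec."""
    return task.wait + exec_time(task, node)


def total_completion_time(placements: list[tuple[QTask, QNode]]) -> float:
    """Sum of completion times for one fixed assignment of tasks to nodes."""
    return sum(completion_time(task, node) for task, node in placements)


# Illustrative numbers only; they are not taken from the paper.
placements = [
    (QTask(depth=10, shots=1000, wait=1.5), QNode(speed=2000.0)),
    (QTask(depth=20, shots=500), QNode(speed=1000.0)),
]
total = total_completion_time(placements)  # 6.5 + 10.0 = 16.5 seconds
```

With these made-up values, the first task takes 5.0 s to execute plus 1.5 s of waiting, and the second takes 10.0 s, so this placement scores 16.5 s. A placement policy would compare such totals across candidate assignments and prefer the smaller one.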
Abstract
The quantum cloud computing paradigm presents unique challenges in task placement due to the dynamic and heterogeneous nature of quantum computation resources. Traditional heuristic approaches fall short in adapting to the rapidly evolving landscape of quantum computing. This paper proposes DRLQ, a novel Deep Reinforcement Learning (DRL)-based technique for task placement in quantum cloud computing environments, addressing the optimization of task completion time and quantum task scheduling efficiency. It leverages the Deep Q Network (DQN) architecture, enhanced with the Rainbow DQN approach, to create a dynamic task placement strategy. This approach is one of the first in the field of quantum cloud resource management, enabling adaptive learning and decision-making for quantum cloud environments and effectively optimizing task placement based on changing conditions and resource availability. We conduct extensive experiments using the QSimPy simulation toolkit to evaluate the performance of our method, demonstrating substantial improvements in task execution efficiency and a reduction in the need to reschedule quantum tasks. Our results show that, compared with heuristic approaches, using DRLQ for task placement reduces total quantum task completion time by 37.81% to 72.93% and prevents task rescheduling attempts.
