DRLQ: A Deep Reinforcement Learning-based Task Placement for Quantum Cloud Computing

Hoa T. Nguyen; Muhammad Usman; Rajkumar Buyya

TL;DR

DRLQ addresses task placement in quantum cloud computing by formulating it as a deep reinforcement learning problem and solving it with Rainbow DQN. The framework models the placement of QTasks onto QNodes as an MDP, where states encode heterogeneous resource traits, actions assign tasks to nodes, and rewards balance completion time against rescheduling penalties. Evaluated in QSimPy with synthetic QTask data from MQTBench, DRLQ shows substantial improvements in total completion time and a dramatic reduction in rescheduling compared with heuristics, demonstrating the viability of DRL-based quantum cloud resource management. The work lays a foundation for integrating quantum-specific factors (e.g., error rates, circuit transpilation) into future DRL-based scheduling.

The formulation defines the execution time of QTask $\theta_i$ on QNode $q_j$ as $ t^{exec}_{\theta_i} = \dfrac{\theta_i^d \times \theta_i^s}{q_j^s} $ and its completion time as $ t_{\theta_i} = t^{wait}_{\theta_i} + t^{exec}_{\theta_i} $; the objective is $ \Omega(\Xi) = \min \sum_{i=1}^n t_{\theta_i} $.
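The completion-time expressions in the TL;DR can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class and function names are hypothetical, and it assumes that $\theta_i^d$ is the circuit depth, $\theta_i^s$ the number of shots, and $q_j^s$ the QNode speed in circuit layers per second, none of which this summary states explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QTask:
    depth: int               # assumed meaning of theta_i^d (circuit layers)
    shots: int               # assumed meaning of theta_i^s
    wait_time: float = 0.0   # t^wait for this task, in seconds


@dataclass(frozen=True)
class QNode:
    speed: float             # assumed meaning of q_j^s (layers per second)


def exec_time(task: QTask, node: QNode) -> float:
    """t^exec = (theta^d * theta^s) / q^s."""
    return task.depth * task.shots / node.speed


def completion_time(task: QTask, node: QNode) -> float:
    """t = t^wait + t^exec."""
    return task.wait_time + exec_time(task, node)


def total_completion_time(placement: list[tuple[QTask, QNode]]) -> float:
    """Sum of completion times over all (task, node) assignments.

    This is the quantity the placement policy tries to minimise.
    """
    return sum(completion_time(task, node) for task, node in placement)


if __name__ == "__main__":
    fast, slow = QNode(speed=2000.0), QNode(speed=500.0)
    task = QTask(depth=10, shots=1000, wait_time=1.5)
    print(completion_time(task, fast))  # placement on the faster node
    print(completion_time(task, slow))  # placement on the slower node
```

Under these assumptions, the same task finishes sooner on the faster node, which is the trade-off a placement policy has to weigh against queueing delay on that node.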

Abstract

The quantum cloud computing paradigm presents unique challenges in task placement due to the dynamic and heterogeneous nature of quantum computation resources. Traditional heuristic approaches fall short in adapting to the rapidly evolving landscape of quantum computing. This paper proposes DRLQ, a novel Deep Reinforcement Learning (DRL)-based technique for task placement in quantum cloud computing environments, addressing the optimization of task completion time and quantum task scheduling efficiency. It leverages the Deep Q Network (DQN) architecture, enhanced with the Rainbow DQN approach, to create a dynamic task placement strategy. This approach is one of the first in the field of quantum cloud resource management, enabling adaptive learning and decision-making for quantum cloud environments and effectively optimizing task placement based on changing conditions and resource availability. We conduct extensive experiments using the QSimPy simulation toolkit to evaluate the performance of our method, demonstrating substantial improvements in task execution efficiency and a reduction in the need to reschedule quantum tasks. Our results show that utilizing the DRLQ approach for task placement can significantly reduce total quantum task completion time by 37.81% to 72.93% and prevent task rescheduling attempts compared to other heuristic approaches.

Paper Structure (11 sections, 13 equations, 4 figures, 1 table, 1 algorithm)

Introduction
Related Work
System Model and Problem Formulation
System Model
Problem Formulation
Deep Reinforcement Learning Model
DRLQ Framework
Performance Evaluation
Environment Setup
Evaluation and Discussion
Conclusions and Future Work

Figures (4)

Figure 1: Overview of the system model for the task placement problem in quantum cloud environments.
Figure 2: Episode reward means and episode lengths during the training of the DRLQ policy over 100,000 time steps; total training time was 8.34 hours. Each episode consists of 60 random QTasks that arrive randomly within a 1-minute time window. The tuned hyperparameters are: learning rate (lr) = 0.01, number of atoms = 10, train batch size = 180, n_step = 3, v_min = -10, v_max = 10, penalty ($\Delta$) = -10, and penalty factor ($\alpha$) = 0.1.
Figure 3: Total completion times of all QTasks over 100 evaluation episodes for DRLQ and other heuristic approaches.
Figure 4: Average number of task rescheduling attempts over 100 evaluation episodes for DRLQ and other approaches.
